Perl takes a lot of memory when you just require a file

Discussion in 'Perl Misc' started by RJ, Feb 15, 2007.

  1. RJ

    RJ Guest

    I am decompiling some data from my C code so that it can be used by
    Perl programs later on. My data structure looks something like this:

    struct tDs {
        char *tName;
        int *data;
    };

    struct DS {
        int index;
        char *rName;
        <list> tDs;
    };

    I am decompiling above data in a perl file (which I generate from C
    code) as follows

    <generated_file.pl>
    pass_data_from_perl_to_c (index1,rName,{"tName1" => "data1" , "tName2"
    => "data2"});


    The last argument in the above function call is a hash reference to
    the list of values associated with the various objects of type tDs for
    index index1.

    First of all, if I just do a `require "<generated_file.pl>"`, it takes
    a lot of memory (around 4 MB for a 2 MB file), even if I return
    immediately after entering pass_data_from_perl_to_c and populate no
    data. If I do populate the data in the form of a 3-D array, Perl's
    memory requirement is 5 times what I expected.
    Can someone please explain why this is so, and how I can avoid the
    unnecessary memory taken by require?

    Waiting for a quick reply.
    -RJ
     
    RJ, Feb 15, 2007
    #1

  2. RJ

    -berlin.de Guest

    RJ <> wrote in comp.lang.perl.misc:
    > I am decompiling some data from my C code which can be used by Perl
    > programs later on. My Data structure is something like follows -
    >
    > struct tDs{
    > char *tName;
    > int *data;
    > }
    >
    > struct DS{
    > int index;
    > char *rName;
    > <list> tDs;
    > }


    What's to decompile? The lines above are (pseudo-) C code.

    > I am decompiling above data in a perl file (which I generate from C
    > code) as follows
    >
    > <generated_file.pl>
    > pass_data_from_perl_to_c (index1,rName,{"tName1" => "data1" , "tName2"
    > => "data2"});


    Does that mean the generated file contains the line "pass_data_...",
    or does it mean the "pass_data_..." function generates the file?

    > The last argument in above function call is a hash reference to list
    > of values associated with various objects of type tDs for index
    > index1.


    A hash reference isn't a list.

    > First of all, if I just do a `require "<generated_file.pl>" ` it takes
    > a lot of memory (around 4Mb for 2 Mb file even if I do just a return
    > after entering pass_data_from_perl_to_c and populate no Data).
    > If I do populate data in form of 3-D array is perl memory requirement
    > is 5 times than expected.


    Perl often takes more memory than expected. Adjust your expectations.

    > Can someone Please explain me why this is so and how I can avoid
    > spending unnecessary memory taken by require.


    Since we have not the slightest idea what the generated file contains,
    there's no way we can explain its behavior.

    > Waiting for a quick reply.


    Quick? You're talking to unpaid volunteers.

    Anno
     
    -berlin.de, Feb 16, 2007
    #2

  3. RJ

    RJ Guest

    On Feb 16, 4:38 pm, -berlin.de wrote:
    > RJ <> wrote in comp.lang.perl.misc:
    >
    > > I am decompiling some data from my C code which can be used by Perl
    > > programs later on. My Data structure is something like follows -

    >
    > > struct tDs{
    > > char *tName;
    > > int *data;
    > > }

    >
    > > struct DS{
    > > int index;
    > > char *rName;
    > > <list> tDs;
    > > }

    >
    > What's to decompile? The lines above are (pseudo-) C code.
    >

    I generate a lot of data from C code. Later, I have a GUI written in
    Perl/Tk from which I need to access the previously generated data.
    One way would be to write the data in ASCII format and then parse it
    in Perl. I have followed a different approach.
    'pass_data_from_perl_to_c' is a function implemented in Perl. When I
    do a 'require <generated_file.pl>' from Perl code, this function gets
    called, and I then store the data (index1, rName, ...) passed to it in
    Perl data structures. So these are the arguments passed to the
    function pass_data_from_perl_to_c.
    > > I am decompiling above data in a perl file (which I generate from C
    > > code) as follows

    >
    > > <generated_file.pl>
    > > pass_data_from_perl_to_c (index1,rName,{"tName1" => "data1" , "tName2"
    > > => "data2"});

    >
    > Does that mean the generated file contains the line "pass_data_...",
    > or does it mean the "pass_data_..." function generates the file?
    >

    I think I have explained it above
    > > The last argument in above function call is a hash reference to list
    > > of values associated with various objects of type tDs for index
    > > index1.

    >
    > A hash reference isn't a list.

    I mean that the keys of the hash correspond to the list of values
    which I had in my C code.
    >
    > > First of all, if I just do a `require "<generated_file.pl>" ` it takes
    > > a lot of memory (around 4Mb for 2 Mb file even if I do just a return
    > > after entering pass_data_from_perl_to_c and populate no Data).
    > > If I do populate data in form of 3-D array is perl memory requirement
    > > is 5 times than expected.

    >
    > Perl often takes more memory than expected. Adjust your expectations.
    >

    My main concern is that even if I return at the very beginning of
    pass_data_from_perl_to_c, Perl still takes a lot of memory just
    requiring '<generated_file.pl>', although I am populating no data
    structures.
    Is there any way to avoid that (or some way to execute the function
    calls in '<generated_file.pl>' without loading the whole file into
    memory)? There can be cases where I have to 'require' this file but
    do not need to populate a single piece of information from it.
    To give just one example, when requiring such a file of about 100 MB,
    Perl takes 800 MB even when no data is populated.
    Is that due to the hash references being passed to the function
    pass_data_from_perl_to_c?
    > > Can someone Please explain me why this is so and how I can avoid
    > > spending unnecessary memory taken by require.

    >
    > Since we have not the slightest idea what the generated file contains,
    > there's no way we can explain its behavior.
    >

    I hope the generated file is a bit clearer now.
    > > Waiting for a quick reply.

    >
    > Quick? You're talking to unpaid volunteers.
    >

    I know that, but I am in a very tight situation. So can you please
    help me out?
    > Anno
     
    RJ, Feb 18, 2007
    #3
  4. On 2007-02-18 09:12, RJ <> wrote:
    > On Feb 16, 4:38 pm, -berlin.de wrote:
    >> RJ <> wrote in comp.lang.perl.misc:
    >> > First of all, if I just do a `require "<generated_file.pl>" ` it takes
    >> > a lot of memory (around 4Mb for 2 Mb file even if I do just a return
    >> > after entering pass_data_from_perl_to_c and populate no Data).
    >> > If I do populate data in form of 3-D array is perl memory requirement
    >> > is 5 times than expected.

    >>
    >> Perl often takes more memory than expected. Adjust your expectations.
    >>

    > My main concern here is that even if return from very beginning of
    > function pass_data_from_perl_to_c , even then perl takes a lot of
    > memory in just requiring file '<generated_file.pl>' while I am
    > populating no data structures.


    I don't understand what you expect to happen when you "just require" the
    file. When you require a file, it is compiled and the compiled code is
    stored in memory. Any data embedded in the code is of course compiled
    (converted to perl data structures) and stored, too.


    > Is there anyway to avoid that (or some way to execute the function
    > calls in ,'<generated_file.pl>' infile without loading file in
    > memory), since there can be case , when I have to 'require'
    > this file but I would need not populate single information from here.


    Separate the data from the code. Perl is good for reading and writing
    files - use it!
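
    A minimal sketch of what separating the data from the code could look
    like. The tab-separated record format and the names parse_record and
    read_records are assumptions made for illustration; nothing in this
    thread prescribes them. The C program would write one record per line,
    and the Perl side would read them one at a time instead of compiling
    them.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical record format (an assumption, not from the thread),
# one record per line, fields separated by tabs:
#   index <TAB> rName <TAB> tag=value[,value...] <TAB> tag=value ...

# Turn one line into (index, rName, hash reference of tags).
sub parse_record {
    my ($line) = @_;
    chomp $line;
    my ($index, $rname, @pairs) = split /\t/, $line;
    my %tags;
    for my $pair (@pairs) {
        my ($tag, $value) = split /=/, $pair, 2;
        next unless defined $value;          # skip malformed fields
        my @values = split /,/, $value;
        # Several values become an array reference, one value a scalar.
        $tags{$tag} = @values > 1 ? [@values] : $value;
    }
    return ($index, $rname, \%tags);
}

# Call $callback once per record, reading the input line by line.
sub read_records {
    my ($fh, $callback) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /\S/;           # ignore blank lines
        $callback->(parse_record($line));
    }
}

# Demo on an in-memory file; a real program would open the data file.
my $sample = "537\t\tINCR=1\ttag1=100,200\ttag0=200\n"
           . "551\t\tINCR=1\tSTATUS=FIXED\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!";
my %data;
read_records($fh, sub {
    my ($index, $rname, $tags) = @_;
    $data{$index} = $tags;
});
close($fh);
print "Loaded ", scalar(keys %data), " records\n";
```

    With this layout the callback decides what to keep, so a run that
    needs nothing from the file stores nothing. Note that this simple
    format cannot tell a one-element array from a plain scalar, and it
    does not handle nested hashes.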

    hp


    --
    _ | Peter J. Holzer | Es ist ganz einfach ihn zu verstehen, wenn
    |_|_) | Sysadmin WSR | man nur alle wichtigen Worte im Satz durch
    | | | | andere ersetzt.
    __/ | http://www.hjp.at/ | -- Nils Ketelsen in danr
     
    Peter J. Holzer, Feb 18, 2007
    #4
  5. RJ

    RJ Guest

    On Feb 18, 6:34 pm, "Peter J. Holzer" <> wrote:
    > On 2007-02-18 09:12, RJ <> wrote:
    >
    > > On Feb 16, 4:38 pm, -berlin.de wrote:
    > >> RJ <> wrote in comp.lang.perl.misc:
    > >> > First of all, if I just do a `require "<generated_file.pl>" ` it takes
    > >> > a lot of memory (around 4Mb for 2 Mb file even if I do just a return
    > >> > after entering pass_data_from_perl_to_c and populate no Data).
    > >> > If I do populate data in form of 3-D array is perl memory requirement
    > >> > is 5 times than expected.

    >
    > >> Perl often takes more memory than expected. Adjust your expectations.

    >
    > > My main concern here is that even if return from very beginning of
    > > function pass_data_from_perl_to_c , even then perl takes a lot of
    > > memory in just requiring file '<generated_file.pl>' while I am
    > > populating no data structures.

    >
    > I don't understand what you expect to happen when you "just require" the
    > file. When you require a file, it is compiled and the compiled code is
    > stored in memory. Any data embedded in the code is of course compiled
    > (converted to perl data structures) and stored, too.
    >

    I just want to clarify one thing here. There are only function calls
    in these generated Perl files. They look something like this:

    <Prototype>
    <spyDecompileTagData(index,rName,{"t1" => "1","t2" =>
    ["100","200"],"t3" => "200"});>

    <Example snippet>
    ======================================================================================
    spyDecompileTagData(537,"",{"INCR" => "1","tag1" =>
    ["100","200"],"tag0" => "200"});
    spyDecompileTagData(538,"",{"INCR" => "1","tag2" =>
    "tag2.value2","tag3" => "default"});
    spyDecompileTagData(539,"",{"INCR" => "1","tag4" =>
    ["tag4.value1","tag4.value3"]});
    spyDecompileTagData(540,"",{"INCR" => "1","tag1" => ["200"]});
    spyDecompileTagData(541,"",{"INCR" => "1","tag1" => ["200"]});
    spyDecompileTagData(542,"",{"INCR" => "1","tag4" => ["default"]});
    spyDecompileTagData(543,"",{"INCR" => "1","tag4" => ["tag4.value3"]});
    spyDecompileTagData(544,"",{"INCR" => "1"});
    spyDecompileTagData(545,"",{"INCR" => "1","tag1" => ["200"]});
    spyDecompileTagData(546,"",{"INCR" => "1"});
    spyDecompileTagData(547,"",{"INCR" => "1","tag1" => ["200"]});
    spyDecompileTagData(548,"",{"INCR" => "1"});
    spyDecompileTagData(549,"",{"INCR" => "1"});
    spyDecompileTagData(550,"",{"INCR" => "1","tag4" => ["tag4.value1"]});
    spyDecompileTagData(551,"",{"INCR" => "1","STATUS" => "FIXED"});
    spyDecompileTagData(552,"",{"INCR" => "1","STATUS" => "TOFIX"});
    spyDecompileTagData(553,"",{"INCR" => "1","STATUS" => "ANALYZE"});
    spyDecompileTagData(554,"",{"INCR" => "1","tag9" => "1","tag8" =>
    "3.14","tag7" => "a"});
    spyDecompileTagData(555,"",{"INCR" => "1","tag11" => "2","tag12" =>
    "1","tag9" => "2","tag8" => "9.8","tag0" => "0","tag7" => "c"});
    ======================================================================================
    There is nothing else in this Perl file other than these function
    calls.
    Now, even if I just return from inside 'spyDecompileTagData' after
    doing 3 shift statements (one for each argument passed to this
    function), Perl still takes a lot of memory. I have used the above
    format just to avoid parsing, as you can see that the values passed in
    the 3rd argument can be quite complex (a hash whose values can be
    scalars, array references, or even hash references). I don't want Perl
    to store the whole file in the code section; I want to compile the
    code inline. Is there any way to do so?

    > > Is there anyway to avoid that (or some way to execute the function
    > > calls in ,'<generated_file.pl>' infile without loading file in
    > > memory), since there can be case , when I have to 'require'
    > > this file but I would need not populate single information from here.

    >
    > Separate the data from the code. Perl is good for reading and writing
    > files - use it!
    >
    > hp
    >
    > --
    > _ | Peter J. Holzer | Es ist ganz einfach ihn zu verstehen, wenn
    > |_|_) | Sysadmin WSR | man nur alle wichtigen Worte im Satz durch
    > | | | | andere ersetzt.
    > __/ |http://www.hjp.at/| -- Nils Ketelsen in danr
     
    RJ, Feb 19, 2007
    #5
  6. On 2007-02-19 06:24, RJ <> wrote:
    > On Feb 18, 6:34 pm, "Peter J. Holzer" <> wrote:
    >> On 2007-02-18 09:12, RJ <> wrote:
    >> > On Feb 16, 4:38 pm, -berlin.de wrote:
    >> >> RJ <> wrote in comp.lang.perl.misc:
    >> >> > First of all, if I just do a `require"<generated_file.pl>" ` it takes
    >> >> > a lot of memory (around 4Mb for 2 Mb file even if I do just a return
    >> >> > after entering pass_data_from_perl_to_c and populate no Data).

    [...]
    >> > My main concern here is that even if return from very beginning of
    >> > function pass_data_from_perl_to_c , even then perl takes a lot of
    >> > memory in just requiring file '<generated_file.pl>' while I am
    >> > populating no data structures.

    >>
    >> I don't understand what you expect to happen when you "just require" the
    >> file. When you require a file, it is compiled and the compiled code is
    >> stored in memory. Any data embedded in the code is of course compiled
    >> (converted to perl data structures) and stored, too.
    >>

    > I just want to clarify one thing over here. There are only function
    > calls in these generated
    > perl file. It looks something like follows -
    >
    ><Prototype>
    ><spyDecompileTagData(index,rName,{"t1" => "1","t2" =>
    > ["100","200"],"t3" => "200"});>
    >
    ><Example snippet>
    >======================================================================================
    > spyDecompileTagData(537,"",{"INCR" => "1","tag1" =>
    > ["100","200"],"tag0" => "200"});
    > spyDecompileTagData(538,"",{"INCR" => "1","tag2" =>
    > "tag2.value2","tag3" => "default"});
    > spyDecompileTagData(539,"",{"INCR" => "1","tag4" =>
    > ["tag4.value1","tag4.value3"]});
    > spyDecompileTagData(540,"",{"INCR" => "1","tag1" => ["200"]});
    > spyDecompileTagData(541,"",{"INCR" => "1","tag1" => ["200"]});
    > spyDecompileTagData(542,"",{"INCR" => "1","tag4" => ["default"]});
    > spyDecompileTagData(543,"",{"INCR" => "1","tag4" => ["tag4.value3"]});
    > spyDecompileTagData(544,"",{"INCR" => "1"});
    > spyDecompileTagData(545,"",{"INCR" => "1","tag1" => ["200"]});
    > spyDecompileTagData(546,"",{"INCR" => "1"});
    > spyDecompileTagData(547,"",{"INCR" => "1","tag1" => ["200"]});
    > spyDecompileTagData(548,"",{"INCR" => "1"});
    > spyDecompileTagData(549,"",{"INCR" => "1"});
    > spyDecompileTagData(550,"",{"INCR" => "1","tag4" => ["tag4.value1"]});
    > spyDecompileTagData(551,"",{"INCR" => "1","STATUS" => "FIXED"});
    > spyDecompileTagData(552,"",{"INCR" => "1","STATUS" => "TOFIX"});
    > spyDecompileTagData(553,"",{"INCR" => "1","STATUS" => "ANALYZE"});
    > spyDecompileTagData(554,"",{"INCR" => "1","tag9" => "1","tag8" =>
    > "3.14","tag7" => "a"});
    > spyDecompileTagData(555,"",{"INCR" => "1","tag11" => "2","tag12" =>
    > "1","tag9" => "2","tag8" => "9.8","tag0" => "0","tag7" => "c"});
    >======================================================================================
    > There is no other things in this perl file other than these function
    > calls.


    Several tens of thousands of them, if your files are several megabytes
    long.

    So you have several tens of thousands of anonymous hashes, each with a
    few members, some of which are anonymous arrays. Plus an equal number
    of strings and numbers. Plus the code to call spyDecompileTagData with
    these arguments, of course. All of this will be stored in memory after
    the require.

    > Now if I just make a return from inside 'spyDecompileTagData' after
    > doing 3 shift stmts (one for each argument passed to this function),
    > still perl takes a lot of memory.


    Even if you don't even call the code at all, it will take a lot of
    memory. You have compiled it, so you now have it in memory.

    For example:
    ------------------------------------------------------------------------
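    #!/usr/bin/perl
    use warnings;
    use strict;

    # Usage: ./foo <wrap-in-sub: 0 or 1> <number-of-lines>
    my $sub = shift;
    my $n = shift;

    # Write foo.pl with $n identical calls, optionally wrapped in sub f.
    sub create {
        open (my $fh, ">", "foo.pl") or die "cannot write foo.pl: $!";
        print $fh qq{sub spyDecompileTagData {}\n};
        if ($sub) {
            print $fh "sub f {\n";
        }
        for my $i (1 .. $n) {
            print $fh qq{spyDecompileTagData($i, '', { "INCR" => "1", "tag4" => ["tag4.value3"]});\n};
        }
        if ($sub) {
            print $fh "}";
        }
        print $fh "1;";
        close($fh) or die "cannot close foo.pl: $!";
    }

    # Print the virtual memory size of this process (Linux /proc only).
    sub vmsize {
        open (my $fh, "<", "/proc/$$/status") or die "cannot read /proc/$$/status: $!";
        while (<$fh>) {
            print if (/^VmSize:/);
        }
    }

    create();
    vmsize();

    require 'foo.pl';
    vmsize();

    if ($sub) {
        f();
        vmsize();
    }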
    ------------------------------------------------------------------------

    This creates a file similar to the files you have (although all the
    lines are the same), optionally encapsulated in a sub. Now if I run
    this:

    % ./foo 0 100000
    VmSize: 5056 kB
    VmSize: 108356 kB

    the memory consumption will increase by roughly 100 MB when the code is
    required. If the created code is encapsulated in a sub:

    % ./foo 1 100000
    VmSize: 5056 kB
    VmSize: 111484 kB
    VmSize: 111484 kB

    the memory consumption will grow even a little more with the require,
    but actually calling the code doesn't make a difference.

    Oh, and that's roughly 1 kB per line. If I simply create an array with
    100000 elements with the same data, it only takes half as much memory,
    so you can probably save quite a lot of memory if you parse the file
    instead of requiring it.


    > I have used above format just to avoid parsing as you can see the
    > values passed in 3rd argument can be quite complex (a hash whose
    > values can be scalar/array reference or even a hash reference).


    You are trading convenience against memory. You can certainly do that;
    your time is probably more expensive than RAM.

    However, there are modules for reading and writing such complex data
    structures, for example YAML and Storable. I suggest you take a look at
    them.
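
    As a rough illustration of the Storable route, here is a sketch that
    writes a nested structure to disk and reads it back. The shape of
    %records (keys rName and tags) and the temporary file are assumptions
    for the demo; only nstore and retrieve come from Storable itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(nstore retrieve);
use File::Temp qw(tempfile);

# Records shaped like the spyDecompileTagData arguments in this thread.
# The layout (index => { rName, tags }) is a hypothetical choice.
my %records = (
    537 => { rName => '', tags => { INCR => '1', tag1 => ['100', '200'], tag0 => '200' } },
    551 => { rName => '', tags => { INCR => '1', STATUS => 'FIXED' } },
);

# Use a temporary file so the demo cleans up after itself.
my ($tmp_fh, $file) = tempfile(UNLINK => 1);
close($tmp_fh);

# Write the whole structure in network byte order, then load it again.
nstore(\%records, $file) or die "Cannot store to $file";
my $loaded = retrieve($file);

print "Record 537, second tag1 value: $loaded->{537}{tags}{tag1}[1]\n";
```

    Storable's file format is specific to Perl, so data produced by a C
    program would need a conversion step first. A text format such as
    YAML is likely easier to emit directly from C.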


    > I don't want perl to store the whole file in code section
    > but I want to compile code inline. Is there anyway to do so.


    If by "compiling code inline" you mean "compile each line only just
    before it is executed", then no, there is no way to do that.

    You can only avoid compiling code if you, er, don't compile it. So for
    example you could create many small files instead of one big file and
    require only the ones you need. Or if you can decide for each line
    whether you need it, you could read the file and eval the lines you
    need.
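
    The second option might be sketched as follows. It assumes each call
    sits on one physical line and that the index alone decides whether a
    line is needed; load_selected and the body of spyDecompileTagData are
    stand-ins written for this example. Since eval runs whatever the line
    contains, this is only reasonable for a file you generated yourself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %data;

# Stand-in for the real callback: keep the tag hash under its index.
sub spyDecompileTagData {
    my ($index, $rname, $tags) = @_;
    $data{$index} = $tags;
}

# Compile and run only the lines whose index appears in %$wanted.
# Lines that are skipped are never handed to the Perl compiler.
sub load_selected {
    my ($fh, $wanted) = @_;
    while (my $line = <$fh>) {
        next unless $line =~ /^\s*spyDecompileTagData\((\d+),/;
        next unless $wanted->{$1};
        eval "$line; 1" or die "Failed to eval line $.: $@";
    }
}

# Demo on an in-memory copy of a few lines from the generated file.
my $generated = <<'END';
spyDecompileTagData(540,"",{"INCR" => "1","tag1" => ["200"]});
spyDecompileTagData(551,"",{"INCR" => "1","STATUS" => "FIXED"});
spyDecompileTagData(552,"",{"INCR" => "1","STATUS" => "TOFIX"});
END
open(my $fh, '<', \$generated) or die "Cannot open in-memory file: $!";
load_selected($fh, { 551 => 1 });
close($fh);

print "Loaded indexes: ", join(',', sort keys %data), "\n";
```

    Each eval compiles a single line, so only the records that the
    callback chooses to keep should stay in memory afterwards.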

    hp


    --
    _ | Peter J. Holzer | Es ist ganz einfach ihn zu verstehen, wenn
    |_|_) | Sysadmin WSR | man nur alle wichtigen Worte im Satz durch
    | | | | andere ersetzt.
    __/ | http://www.hjp.at/ | -- Nils Ketelsen in danr
     
    Peter J. Holzer, Feb 19, 2007
    #6
