Perl takes a lot of memory when you just require a file

RJ

I am decompiling some data from my C code so that it can be used by Perl
programs later on. My data structure is roughly as follows:

struct tDs {
    char *tName;
    int  *data;
};

struct DS {
    int    index;
    char  *rName;
    <list> tDs;
};

I am decompiling the above data into a Perl file (which I generate from
the C code) as follows:

<generated_file.pl>
pass_data_from_perl_to_c(index1, rName, {"tName1" => "data1", "tName2" => "data2"});


The last argument in the above function call is a hash reference to the
list of values associated with the various objects of type tDs for index
index1.

First of all, if I just do a `require "<generated_file.pl>"`, it takes
a lot of memory (around 4 MB for a 2 MB file, even if I return
immediately after entering pass_data_from_perl_to_c and populate no data).
If I do populate the data in the form of a 3-D array, Perl's memory
requirement is 5 times what I expected.
Can someone please explain why this is so and how I can avoid the
unnecessary memory taken by require?

Waiting for a quick reply.
-RJ
 
anno4000

RJ said:
I am decompiling some data from my C code so that it can be used by Perl
programs later on. My data structure is roughly as follows:

struct tDs {
    char *tName;
    int  *data;
};

struct DS {
    int    index;
    char  *rName;
    <list> tDs;
};

What's to decompile? The lines above are (pseudo-) C code.
I am decompiling the above data into a Perl file (which I generate from
the C code) as follows:

<generated_file.pl>
pass_data_from_perl_to_c(index1, rName, {"tName1" => "data1", "tName2" => "data2"});

Does that mean the generated file contains the line "pass_data_...",
or does it mean the "pass_data_..." function generates the file?
The last argument in the above function call is a hash reference to the
list of values associated with the various objects of type tDs for index
index1.

A hash reference isn't a list.
First of all, if I just do a `require "<generated_file.pl>"`, it takes
a lot of memory (around 4 MB for a 2 MB file, even if I return
immediately after entering pass_data_from_perl_to_c and populate no data).
If I do populate the data in the form of a 3-D array, Perl's memory
requirement is 5 times what I expected.

Perl often takes more memory than expected. Adjust your expectations.
Can someone please explain why this is so and how I can avoid the
unnecessary memory taken by require?

Since we have not the slightest idea what the generated file contains,
there's no way we can explain its behavior.
Waiting for a quick reply.

Quick? You're talking to unpaid volunteers.

Anno
 
RJ

What's to decompile? The lines above are (pseudo-) C code.
I generate a lot of data from C code. Later I have a GUI interface
written in Tk/Perl from which I need to access the data generated
previously. One way was to write the data in ASCII format and then parse
it in Perl; I have followed a different approach.
'pass_data_from_perl_to_c' is basically a function implemented in Perl.
When I do a 'require <generated_file.pl>' from Perl code, this function
gets called, and I then populate the data (index1, rName, ...) passed to
this function into Perl data structures. So these are actually the
arguments passed to the function pass_data_from_perl_to_c.
Does that mean the generated file contains the line "pass_data_...",
or does it mean the "pass_data_..." function generates the file?
I think I have explained this above.
A hash reference isn't a list.
I mean here that the keys of the hash correspond to the list of values
which I had in my C code.
Perl often takes more memory than expected. Adjust your expectations.
My main concern here is that even if I return from the very beginning of
the function pass_data_from_perl_to_c, Perl still takes a lot of memory
just requiring the file '<generated_file.pl>', while I am populating no
data structures.
Is there any way to avoid that (or some way to execute the function
calls in '<generated_file.pl>' inline without loading the file into
memory)? There can be cases when I have to 'require' this file but need
not populate a single piece of information from it.
To give just an example: in requiring such a file of about 100 MB, Perl
takes 800 MB when no data is getting populated.
Is that due to the hash references being passed to the function
pass_data_from_perl_to_c...?
Since we have not the slightest idea what the generated file contains,
there's no way we can explain its behavior.
I hope the generated file is a bit clearer now.
Quick? You're talking to unpaid volunteers.
I know that, but I am in a very tight situation, so can you please help
me out?
 
Peter J. Holzer

My main concern here is that even if I return from the very beginning of
the function pass_data_from_perl_to_c, Perl still takes a lot of memory
just requiring the file '<generated_file.pl>', while I am populating no
data structures.

I don't understand what you expect to happen when you "just require" the
file. When you require a file, it is compiled and the compiled code is
stored in memory. Any data embedded in the code is of course compiled
(converted to perl data structures) and stored, too.

Is there any way to avoid that (or some way to execute the function
calls in '<generated_file.pl>' inline without loading the file into
memory)? There can be cases when I have to 'require' this file but need
not populate a single piece of information from it.

Separate the data from the code. Perl is good for reading and writing
files - use it!
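
For instance, a minimal sketch of that idea (the file name and the flat
record format here are made up for illustration): have the C side write
one tab-separated record per line and build the same nested structure
while reading it back:

------------------------------------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical flat format: index, rName, then tag=value pairs, all
# tab-separated; multiple values for one tag are comma-separated.
my %records;
open(my $fh, "<", "generated_data.txt") or die "generated_data.txt: $!";
while (my $line = <$fh>) {
    chomp $line;
    my ($index, $rname, @pairs) = split /\t/, $line;
    my %tags;
    for my $pair (@pairs) {
        my ($tag, $value) = split /=/, $pair, 2;
        # Multi-valued tags become array references, as in your calls.
        $tags{$tag} = $value =~ /,/ ? [ split /,/, $value ] : $value;
    }
    $records{$index} = { rName => $rname, tags => \%tags };
}
close($fh);
------------------------------------------------------------------------

That way the data never goes through the compiler at all, and you only
pay for the structures you actually build.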

hp
 
RJ

I don't understand what you expect to happen when you "just require" the
file. When you require a file, it is compiled and the compiled code is
stored in memory. Any data embedded in the code is of course compiled
(converted to perl data structures) and stored, too.
I just want to clarify one thing here. There are only function calls in
these generated Perl files. It looks something like the following:

<Prototype>
<spyDecompileTagData(index,rName,{"t1" => "1","t2" => ["100","200"],"t3" => "200"});>

<Example snippet>
======================================================================================
spyDecompileTagData(537,"",{"INCR" => "1","tag1" => ["100","200"],"tag0" => "200"});
spyDecompileTagData(538,"",{"INCR" => "1","tag2" => "tag2.value2","tag3" => "default"});
spyDecompileTagData(539,"",{"INCR" => "1","tag4" => ["tag4.value1","tag4.value3"]});
spyDecompileTagData(540,"",{"INCR" => "1","tag1" => ["200"]});
spyDecompileTagData(541,"",{"INCR" => "1","tag1" => ["200"]});
spyDecompileTagData(542,"",{"INCR" => "1","tag4" => ["default"]});
spyDecompileTagData(543,"",{"INCR" => "1","tag4" => ["tag4.value3"]});
spyDecompileTagData(544,"",{"INCR" => "1"});
spyDecompileTagData(545,"",{"INCR" => "1","tag1" => ["200"]});
spyDecompileTagData(546,"",{"INCR" => "1"});
spyDecompileTagData(547,"",{"INCR" => "1","tag1" => ["200"]});
spyDecompileTagData(548,"",{"INCR" => "1"});
spyDecompileTagData(549,"",{"INCR" => "1"});
spyDecompileTagData(550,"",{"INCR" => "1","tag4" => ["tag4.value1"]});
spyDecompileTagData(551,"",{"INCR" => "1","STATUS" => "FIXED"});
spyDecompileTagData(552,"",{"INCR" => "1","STATUS" => "TOFIX"});
spyDecompileTagData(553,"",{"INCR" => "1","STATUS" => "ANALYZE"});
spyDecompileTagData(554,"",{"INCR" => "1","tag9" => "1","tag8" => "3.14","tag7" => "a"});
spyDecompileTagData(555,"",{"INCR" => "1","tag11" => "2","tag12" => "1","tag9" => "2","tag8" => "9.8","tag0" => "0","tag7" => "c"});
======================================================================================
There is nothing else in this Perl file other than these function calls.
Now even if I just return from inside 'spyDecompileTagData' after doing
3 shift statements (one for each argument passed to this function), Perl
still takes a lot of memory. I have used the above format just to avoid
parsing since, as you can see, the values passed in the 3rd argument can
be quite complex (a hash whose values can be scalars, array references,
or even hash references). I don't want Perl to store the whole file in
the code section; I want to compile the code inline. Is there any way to
do so?
 
Peter J. Holzer

On Feb 16, 4:38 pm, (e-mail address removed)-berlin.de wrote:
First of all, if I just do a `require "<generated_file.pl>"`, it takes
a lot of memory (around 4 MB for a 2 MB file, even if I return
immediately after entering pass_data_from_perl_to_c and populate no data). [...]
My main concern here is that even if I return from the very beginning of
the function pass_data_from_perl_to_c, Perl still takes a lot of memory
just requiring the file '<generated_file.pl>', while I am populating no
data structures.

I don't understand what you expect to happen when you "just require" the
file. When you require a file, it is compiled and the compiled code is
stored in memory. Any data embedded in the code is of course compiled
(converted to perl data structures) and stored, too.
I just want to clarify one thing here. There are only function calls in
these generated Perl files. It looks something like the following:

<Prototype>
<spyDecompileTagData(index,rName,{"t1" => "1","t2" => ["100","200"],"t3" => "200"});>

<Example snippet>
======================================================================================
spyDecompileTagData(537,"",{"INCR" => "1","tag1" => ["100","200"],"tag0" => "200"});
spyDecompileTagData(538,"",{"INCR" => "1","tag2" => "tag2.value2","tag3" => "default"});
[...]
spyDecompileTagData(555,"",{"INCR" => "1","tag11" => "2","tag12" => "1","tag9" => "2","tag8" => "9.8","tag0" => "0","tag7" => "c"});
======================================================================================
There is nothing else in this Perl file other than these function calls.

Several tens of thousands of them, if your files are several megabytes long.

So you have several tens of thousands of anonymous hashes, each with a
few members, some of which are anonymous arrays. Plus an equal number of
strings and numbers. Plus the code to call spyDecompileTagData with
these arguments, of course. All of this will be stored in memory after
the require.
Now even if I just return from inside 'spyDecompileTagData' after doing
3 shift statements (one for each argument passed to this function), Perl
still takes a lot of memory.

Even if you don't call the code at all, it will take a lot of memory.
You have compiled it, so you now have it in memory.

For example:
------------------------------------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

my $sub = shift;    # wrap the generated calls in a sub?
my $n   = shift;    # number of generated lines

# Write a file similar to the generated ones: $n identical calls,
# optionally wrapped in "sub f { ... }".
sub create {
    open(my $fh, ">", "foo.pl") or die "foo.pl: $!";
    print $fh qq{sub spyDecompileTagData {}\n};
    if ($sub) {
        print $fh "sub f {\n";
    }
    for my $i (1 .. $n) {
        print $fh qq{spyDecompileTagData($i, '', { "INCR" => "1", "tag4" => ["tag4.value3"]});\n};
    }
    if ($sub) {
        print $fh "}";
    }
    print $fh "1;";
    close($fh);
}

# Report the process's current virtual memory size (Linux only).
sub vmsize {
    open(my $fh, "<", "/proc/$$/status") or die "$!";
    while (<$fh>) {
        print if /^VmSize:/;
    }
}

create();
vmsize();

require 'foo.pl';
vmsize();

if ($sub) {
    f();
    vmsize();
}
------------------------------------------------------------------------

This creates a file similar to the files you have (although all the
lines are the same) and optionally encapsulates the calls in a sub. Now
if I run this:

% ./foo 0 100000
VmSize: 5056 kB
VmSize: 108356 kB

the memory consumption will increase by roughly 100 MB when the code is
required. If the created code is encapsulated in a sub:

% ./foo 1 100000
VmSize: 5056 kB
VmSize: 111484 kB
VmSize: 111484 kB

the memory consumption will grow even a little more with the require,
but actually calling the code doesn't make a difference.

Oh, and that's roughly 1 kB per line. If I simply create an array with
100000 elements with the same data, it only takes half as much memory,
so you can probably save quite a lot of memory if you parse the file
instead of requiring it.
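
A minimal sketch of that comparison (hypothetical; measure it with the
same vmsize() calls as above):

------------------------------------------------------------------------
# Build the same 100,000 records directly as Perl data: there is no
# code to compile, so only the data structures themselves take memory.
my @records;
for my $i (1 .. 100_000) {
    push @records, [ $i, '', { "INCR" => "1", "tag4" => ["tag4.value3"] } ];
}
------------------------------------------------------------------------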

I have used the above format just to avoid parsing since, as you can
see, the values passed in the 3rd argument can be quite complex (a hash
whose values can be scalars, array references, or even hash references).

You are trading convenience against memory. You can certainly do that;
your time is probably more expensive than RAM.

However, there are modules for reading and writing such complex data
structures, for example YAML and Storable. I suggest you take a look at
them.
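
For instance, a minimal Storable sketch (the file name data.sto is made
up): convert the data once, and afterwards the Tk/Perl GUI can load it
without compiling any code:

------------------------------------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;
use Storable qw(nstore retrieve);

# One-off conversion: build the structure however you like, then
# freeze it to disk in Storable's binary format.
my %data = (
    537 => { "INCR" => "1", "tag1" => [ "100", "200" ], "tag0" => "200" },
    538 => { "INCR" => "1", "tag2" => "tag2.value2", "tag3" => "default" },
);
nstore(\%data, "data.sto");

# Later, in the GUI: load the data without any compilation step.
my $loaded = retrieve("data.sto");
print $loaded->{537}{tag1}[1], "\n";    # prints 200
------------------------------------------------------------------------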

I don't want Perl to store the whole file in the code section; I want to
compile the code inline. Is there any way to do so?

If by "compiling code inline" you mean "compile each line only just
before it is executed", then no, there is no way to do that.

You can only avoid compiling code if you, er, don't compile it. So for
example you could create many small files instead of one big file and
require only the ones you need. Or if you can decide for each line
whether you need it, you could read the file and eval the lines you
need.
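
A minimal sketch of the second variant, assuming each generated call
fits on one line (as in your snippet) and a hypothetical want_index()
predicate decides which records matter:

------------------------------------------------------------------------
#!/usr/bin/perl
use warnings;
use strict;

sub spyDecompileTagData { }                   # your real sub goes here
sub want_index { my ($i) = @_; $i == 540 }    # hypothetical filter

# Read the generated file line by line and compile only the calls we
# actually need; the lines we skip never take up any memory.
open(my $fh, "<", "generated_file.pl") or die "generated_file.pl: $!";
while (my $line = <$fh>) {
    my ($index) = $line =~ /^spyDecompileTagData\((\d+)/ or next;
    next unless want_index($index);
    eval $line;
    die $@ if $@;
}
close($fh);
------------------------------------------------------------------------

Each eval still compiles the one line it is given, but the skipped
lines cost nothing.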

hp
 
