Ilya Zakharevich said on Mon, 20 Mar 2006 21:28:26 +0000 (UTC):
Thanks for your answer.
So, how do Perl developers handle quite common tasks where one program
needs to use/allocate really big variables periodically, which should
then be released to let other tasks run?
Reuse the space yourself, commonly by using a $global to hold the data.
When you are done with the first $data run, set $data = '', and do the
next task, using $data again. The memory from the first use of $data
will be freed internally by Perl, and will be reused, but it "probably"
won't be returned to the system.
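A minimal sketch of that pattern (the 10 MB figure is just an
illustration):

  use strict;
  use warnings;

  my $data = '';                        # one reusable buffer

  for my $task (1 .. 3) {
      $data = 'x' x (10 * 1024 * 1024); # pretend to slurp 10 MB for this task
      print "task $task: ", length($data), " bytes\n";
      $data = '';                       # done; Perl keeps the buffer internally
                                        # and reuses it on the next assignment
  }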
So it is common in Perl programs to see the system memory usage rise to
a peak of whatever the greatest data size was, like a peak-meter.
Usually in complex programs, objects are being used. There are various
tricks you learn to reuse the same object over and over, just undef'ing
the data portion of the object. You cannot reliably use the
object-create-destroy cycle without gaining memory. Each object has
its own technique for clearing out old data.
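Here is one common shape of that trick, as a sketch (BigBuffer and its
methods are made up for illustration):

  use strict;
  use warnings;

  package BigBuffer;
  sub new   { bless { data => undef }, shift }
  sub load  { my ($self, $bytes) = @_; $self->{data} = 'x' x $bytes }
  sub clear { $_[0]{data} = undef }     # drop the payload, keep the object

  package main;

  my $buf = BigBuffer->new;             # created once...
  for my $size (1_000, 2_000_000, 500) {
      $buf->load($size);                # ...reused for every job
      print 'holding ', length($buf->{data}), " bytes\n";
      $buf->clear;                      # undef the data slot instead of
                                        # destroying/recreating the object
  }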
I'm not a Perl internals type of person, but the developers are trying
to improve Perl's memory conservation. In Perl/Tk, for instance, it is
absolutely essential to reuse any object you create, while Perl/Gtk2
has made significant strides in cleaning up after itself, so there you
can often use the create-destroy style of object handling.
Perl 6 has promised improvements across the board.
Starting a new process for those jobs which are known to use a lot of
memory is one solution; are there others?
It is common to use disk-based databases to handle sharing of huge data.
That way you can fork to do whatever you need; when the forked process
is done, it leaves its results in the db and returns all of its memory
to the system.
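For instance, a sketch using DB_File to park a child's results on disk
(the db path and the workload are made up):

  use strict;
  use warnings;
  use DB_File;
  use Fcntl qw(O_RDWR O_CREAT);

  my $db = '/tmp/shared.db';            # hypothetical scratch database

  defined(my $pid = fork()) or die "fork failed: $!";

  if ($pid == 0) {
      # Child: do the memory-hungry work, park results on disk, exit.
      tie my %results, 'DB_File', $db, O_RDWR | O_CREAT, 0644, $DB_HASH
          or die "tie failed: $!";
      $results{$_} = $_ ** 2 for 1 .. 1_000;   # stand-in for real output
      untie %results;
      exit 0;                           # all child memory returns to the OS
  }

  waitpid($pid, 0);
  tie my %results, 'DB_File', $db, O_RDWR | O_CREAT, 0644, $DB_HASH
      or die "tie failed: $!";
  print "child computed $results{42}\n";
  untie %results;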
Another way to share forked data is through shared memory segments, but
they are tricky.
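A bare-bones sketch with Perl's built-in SysV shared-memory calls (the
segment size and strings are just placeholders):

  use strict;
  use warnings;
  use IPC::SysV qw(IPC_PRIVATE IPC_CREAT IPC_RMID S_IRUSR S_IWUSR);

  my $size = 1024;
  my $id   = shmget(IPC_PRIVATE, $size, IPC_CREAT | S_IRUSR | S_IWUSR);
  defined $id or die "shmget failed: $!";

  defined(my $pid = fork()) or die "fork failed: $!";

  if ($pid == 0) {                      # child: write a result
      shmwrite($id, 'result from child', 0, $size) or die "shmwrite: $!";
      exit 0;
  }

  waitpid($pid, 0);                     # parent: read it back
  my $buf;
  shmread($id, $buf, 0, $size) or die "shmread: $!";
  $buf =~ s/\0+\z//;                    # strip the null padding
  print "parent got: $buf\n";

  shmctl($id, IPC_RMID, 0);             # release the segment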
Threads act like objects in this respect. If you continually create
threads and then destroy them when done, memory will just climb. You
need to reuse threads, feeding them different data for each run. Threads
are great for sharing data in real time, but they will eat system memory
if not carefully handled. Once again, the memory will rise to its peak
value.
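The usual way to reuse a thread is to keep it alive and feed it jobs
through a queue, something like this sketch:

  use strict;
  use warnings;
  use threads;
  use Thread::Queue;

  my $work = Thread::Queue->new;

  # One long-lived worker, reused for every job, instead of a
  # create/join per task (which lets peak memory keep climbing).
  my $worker = threads->create(sub {
      while (defined(my $job = $work->dequeue)) {
          my $data = 'x' x $job;        # stand-in for real work
          print 'processed ', length($data), " bytes\n";
      }                                 # $data is freed here and its space
  });                                   # reused on the next iteration

  $work->enqueue($_) for 10, 1_000, 100;
  $work->enqueue(undef);                # sentinel: tell the worker to quit
  $worker->join;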
So there is not going to be one general-purpose answer to your question.
Each program needs to make its own tradeoffs concerning size,
performance, ease of sharing data, etc.
But for programs which use hundreds of megs of data, you are best off
forking them and storing results in a common disk-file db. The Storable
module is often used for this.
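A sketch of that fork-and-Storable pattern (the result file path and the
dummy data are made up):

  use strict;
  use warnings;
  use Storable qw(store retrieve);

  my $file = '/tmp/bigjob.result';      # hypothetical result file

  defined(my $pid = fork()) or die "fork failed: $!";

  if ($pid == 0) {
      # Child: build the huge structure, freeze it to disk, exit.
      my %huge = map { $_ => 'x' x 100 } 1 .. 100_000;
      store(\%huge, $file) or die 'store failed';
      exit 0;                           # memory goes back to the OS here
  }

  waitpid($pid, 0);
  my $result = retrieve($file);         # parent pulls in only the result
  print scalar(keys %$result), " keys recovered\n";
  unlink $file;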
Good luck.