multiprocessing / forking memory usage

Randall Smith

I'm trying to get a grasp on how memory usage is affected when forking
as the multiprocessing module does. I've got a program with a parent
process using wx and other memory intensive modules. It spawns child
processes (by forking) that should be very lean (no wx required, etc).
Based on inspection using "ps v" and psutil, the memory usage (rss) is
much higher than I would expect for the subprocess.

My understanding is that COW is used when forking (on Linux). So maybe
"ps v pid" is reflecting that. If that's the case, is there a way to
better determine the child's memory usage? If it's not the case and I'm
using modules I don't need, how can I reduce the memory usage to what
the child actually uses instead of including everything the parent is using?

Randall
 
Piet van Oostrum

Randall Smith said:
RS> I'm trying to get a grasp on how memory usage is affected when forking as
RS> the multiprocessing module does. I've got a program with a parent process
RS> using wx and other memory intensive modules. It spawns child processes (by
RS> forking) that should be very lean (no wx required, etc). Based on
RS> inspection using "ps v" and psutil, the memory usage (rss) is much higher
RS> than I would expect for the subprocess.

The child is a clone of the parent, so immediately after the fork() both
its virtual memory usage and its resident memory usage will equal the
parent's. But physical memory holds only one copy of the shared pages,
although ps will count them for both processes (at least I think that's
how ps works). Of course, later they will diverge as either process
writes to its pages.

RS> My understanding is that COW is used when forking (on Linux).

I think this is true of all modern Unix systems.

RS> So maybe "ps v pid" is reflecting that. If that's the case, is
RS> there a way to better determine the child's memory usage?

Define `memory usage' in the light of the above.
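
One way to make that definition concrete on Linux, as a rough sketch and not part of the original reply: the kernel lists per-mapping counters in /proc/<pid>/smaps. Rss counts every resident page in full, even when it is shared with the parent; Pss divides each shared page among the processes that share it; the Private_* fields count pages only this process maps. The function names below are hypothetical; only the field names come from the kernel.

```python
# Sketch: total up the smaps counters that distinguish shared from
# private memory. Linux only; function names are illustrative.
FIELDS = ("Rss", "Pss", "Private_Clean", "Private_Dirty",
          "Shared_Clean", "Shared_Dirty")


def summarize_smaps(text):
    """Sum the selected smaps fields (in kB) over all mappings."""
    totals = {name: 0 for name in FIELDS}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in totals:
            # Field lines look like "Rss:      120 kB"
            totals[name] += int(rest.split()[0])
    # Pages the process does not share with anyone, e.g. after COW copies
    totals["Private"] = totals["Private_Clean"] + totals["Private_Dirty"]
    return totals


def read_smaps(pid="self"):
    """Read the raw smaps text for a process (needs permission for other pids)."""
    with open(f"/proc/{pid}/smaps") as handle:
        return handle.read()
```

Calling summarize_smaps(read_smaps()) inside the child right after the fork, and again after it has done some work, should show Rss staying close to the parent's while Private grows only with what the child actually touches.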

As long as the parent is still around and you don't run out of virtual
memory in the child, not much harm is done.

If the parent stops and you don't run out of virtual memory in the
child, the excess pages will eventually be paged out and will no longer
occupy physical memory. As long as you have enough swap space it
shouldn't be a big problem, although the extra paging activity is a bit
of a loss.

If you do run out of virtual memory in the child, however, you have a
problem.

RS> If it's not the case and I'm using
RS> modules I don't need, how can I reduce the memory usage to what the child
RS> actually uses instead of including everything the parent is using?

The best would be to fork the child before you import the excess modules
in the parent. If that is not possible you could try to delete as much
in the child as you can, for example with
del wx; del sys.modules['wx'] etc., delete all variables that you don't
need, and hope the garbage collector will clean up enough. But it will
make your application quite complicated. From the Python level you can't
get rid of loaded shared libraries, however. And trying to do that from
the C level is probably close to committing suicide.
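
A minimal sketch of the first suggestion, assuming a POSIX system where os.fork() is available. The function name is hypothetical, and module_name stands in for wx or any other heavy import; the child only reports whether it can see that module.

```python
import os
import sys


def fork_then_import(module_name):
    """Fork a child first, then import module_name in the parent only.

    Returns True if the child saw the module in sys.modules, else False.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: a snapshot of the parent taken before the import.
        try:
            os.close(read_fd)
            loaded = module_name in sys.modules
            os.write(write_fd, b"1" if loaded else b"0")
        finally:
            os._exit(0)  # never fall through into the parent's code
    # Parent: the heavy import happens only after the fork.
    os.close(write_fd)
    __import__(module_name)
    with os.fdopen(read_fd, "rb") as pipe:
        answer = pipe.read()
    os.waitpid(pid, 0)
    return answer == b"1"
```

In a real program the child would run its worker loop instead of writing one byte, but the ordering is the point: whatever the parent imports after the fork never appears in the child's address space.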

My advice: don't worry until you really experience memory problems.
 
Randall Smith

Thanks Piet. You gave a good explanation and I think I understand much
better now.
 
Aahz

Randall Smith said:
RS> I'm trying to get a grasp on how memory usage is affected when forking
RS> as the multiprocessing module does. I've got a program with a parent
RS> process using wx and other memory intensive modules. It spawns child
RS> processes (by forking) that should be very lean (no wx required, etc).
RS> Based on inspection using "ps v" and psutil, the memory usage (rss) is
RS> much higher than I would expect for the subprocess.

One option if you're concerned about memory usage is to call one of the
os.exec*() functions (os.execv(), for example) after forking, which
replaces the forked copy of the parent with a fresh program image.
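
A rough sketch of that approach, assuming a POSIX system. The helper name is hypothetical, and the new program here is simply a fresh Python interpreter running a snippet passed with -c; a real application would exec its own worker script.

```python
import os
import sys


def run_in_fresh_interpreter(code):
    """Fork, then exec a new Python interpreter that runs `code`.

    Returns whatever the new process wrote to stdout.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: point stdout at the pipe, then replace this process image.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execv(sys.executable, [sys.executable, "-c", code])
        finally:
            os._exit(127)  # only reached if the exec failed
    # Parent: collect the output and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        output = pipe.read()
    os.waitpid(pid, 0)
    return output
```

After the exec, none of the parent's modules or heap remain in the child, so its memory usage reflects only what the new program loads. The cost is that the child no longer shares any state with the parent and has to be handed its inputs explicitly.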
 
