Limitations of forking on Windows

Corey_G

I wrote some simple code that forks processes and sends HTTP requests
from the children. I have read the perlfork documentation and
understand that Windows uses "fork emulation" (with threads) rather
than actually forking processes (as you can on *nix). I am
wondering if anyone has experience with the limits on how many
processes can be forked? I get memory errors at 10 processes (by
comparison, I can fork several hundred on Linux with no problems).

in the perlfork doc: "In the eyes of the operating system,
pseudo-processes created via the fork() emulation are simply threads
in the same process. This means that any process-level limits imposed
by the operating system apply to all pseudo-processes taken together.
This includes any limits imposed by the operating system on the number
of open file, directory and socket handles, limits on disk space
usage, limits on memory size, limits on CPU utilization etc."

I don't see how I could be hitting any limits at such a small number
of processes.
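For reference, here is a minimal sketch (hypothetical; the original code is not shown) of the kind of test being described: fork N children, let each do its work and exit, then reap them. The HTTP request itself is just a placeholder comment.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fork $children pseudo-processes; on Win32 each "child" is really a
# thread inside the same perl.exe process, so per-process limits are
# shared across all of them.
my $children = 10;
my @pids;

for my $i (1 .. $children) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child (a pseudo-process on Windows):
        # ... send the HTTP request here, e.g. with LWP::UserAgent ...
        exit 0;
    }
    push @pids, $pid;    # parent remembers the child
}

waitpid($_, 0) for @pids;    # reap every child
print scalar(@pids), " children reaped\n";
```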

[using ActiveState Perl 5.8 on Win2k]


-Corey Goldberg
 
Sherm Pendley

Purl said:
For NT5, if a typical machine, you are using close
to one-third upwards to one-half of your system
resources simply running your NT5 kernel and Win GUI.

I haven't used a Windows system in quite some time, but if I recall
correctly, System Resources (in the Windows sense of the phrase) could only
be stored in the first 1MB of memory.

Is that what you're referring to here? If so, I'm mildly surprised - I would
have thought the NT5/2k/XP kernels would have gotten past that. Only mildly
surprised though - it *is* MS we're talking about. ;-)

sherm--
 
Sherm Pendley

Purl said:
That is very close to true forking under Windows.

I was aware that forking was emulated with threads under Windows, but not of
so many of the details. Thanks for filling in the gaps.

sherm--
 
Malcolm Dew-Jones

Sherm Pendley ([email protected]) wrote:
: Purl Gurl wrote:

: > For NT5, if a typical machine, you are using close
: > to one-third upwards to one-half of your system
: > resources simply running your NT5 kernel and Win GUI.

: I haven't used a Windows system in quite some time, but if I recall
: correctly, System Resources (in the Windows sense of the phrase) could only
: be stored in the first 1MB of memory.

: Is that what you're referring to here? If so, I'm mildly surprised - I would
: have thought the NT5/2k/XP kernels would have gotten past that. Only mildly
: surprised though - it *is* MS we're talking about. ;-)

There are enough reasons to complain about Windows without inventing more.

Win2K is NT, and NT has absolutely nothing to do with DOS-based 1 MB
limitations.
 
Malcolm Dew-Jones

Purl Gurl ([email protected]) wrote:
: Sherm Pendley wrote:
:
: > Purl Gurl wrote:
:
: > > For NT5, if a typical machine, you are using close
: > > to one-third upwards to one-half of your system
: > > resources simply running your NT5 kernel and Win GUI.
:
: > I haven't used a Windows system in quite some time, but if I recall
: > correctly, System Resources (in the Windows sense of the phrase) could only
: > be stored in the first 1MB of memory.

: You might be thinking of MSDOS lower memory and upper memory,
: LMB and UMB. Gates' biggest mistake, decades back, was his decision
: computers would never need more than 640 kilobytes of memory.

DOS was designed (not by Gates) to handle 1 MB, not 640K.

It was IBM that placed the hardware at fixed memory locations starting at
the 640K mark, which created the LMB/UMB problems. As for Windows' 1 MB
system-resource memory issues, this was necessary for compatibility with a
lot of software and hardware. The MS solution to these sorts of
limitations was NT, and while NT may have its own problems, you can hardly
say that choosing a different OS without the same limitations is a bad
approach to the problem. Indeed, it is exactly the strategy used by
everyone who turns to Linux etc. to avoid the limitations within Windows.
 
John Bokma

Purl said:
There is a Linux machine sitting here, right in there
with our Windows machines, but I cannot do with Linux
what I can do with Windows.

Which tells a lot about you and nothing about those machines.
 
John Bokma

Purl said:
Gates made a serious mistake assuming 640 kilobytes
was plenty of memory to run programs.

At that time it was not a mistake. Predicting the future is hard, and
sillier statements have been made.
 
Corey_G

It is not the number of forked processes. It is the
amount of system resources used by each forked process.

right.. and resource usage is fine when I get these errors.

Comparing Linux system resource usage and NT5 system
resource usage, is not a valid comparison. Linux is
very minimalist, NT5 is graphics and services rich.

my gnu/linux machine running KDE is just as "graphics and services
rich" as any windows box... I'm not quite sure what you are getting
at.

You are also comparing threading to forking.

I am comparing the identical piece of code on two platforms... even
though it is handled differently at the OS level (a real fork, or fork
emulated via threads).


anyways.... the resolution to my problem was to upgrade the version of
ActiveState Perl I am using. I was using 5.8.2.. i just upgraded to
5.8.4 and I no longer get the memory errors. Now it looks like I can
fork until I run out of resources.

-Corey
 
John Bokma

Corey_G said:
right.. and resource usage is fine when I get these errors.


my gnu/linux machine running KDE is just as "graphics and services
rich" as any windows box... I'm not quite sure what you are getting
at.

KDE is not Linux.
 
Bryan Castillo

Purl Gurl said:
For Roberta, I use a classic trick of executing a small
Visual Basic program, from Perl, which returns data to
my Windows clipboard. Later, using a Win32 API module,
my Perl program reads my clipboard, then processes data.

True sequence of events is,

Perl -> dos execution of a visual basic program
VB -> sends to and reads from a binary dictionary database
VB -> data to clipboard
API -> reads clipboard
Perl -> processes clipboard data

Isn't the clipboard a shared resource between processes?
If another program reads or writes to the clipboard, don't
you have the possibility of having the wrong data?
It probably doesn't matter for simple applications.
However, I can't imagine that using the clipboard for
IPC in daemon processes is a good idea.
 
John Bokma

Bryan said:
Isn't the clipboard a shared resource between processes?
If another program reads or writes to the clipboard, don't
you have the possibility of having the wrong data?
It probably doesn't matter for simple applications.
However, I can't imagine that using the clipboard for
IPC in daemon processes is a good idea.

Perl is quite good at handling binary, so why the need of VB? Also,
can't you talk to VB using OLE?
 
Bryan Castillo

Purl Gurl said:
Bryan Castillo wrote:

(snipped)



"Clipboard" is an easy to remember name for a reserved
memory block. It is actually an adjunct memory block
for a keyboard. As an example, if you highlight text,
use Control C, that data is stored in the "clipboard"
which is truly just a reserved memory block.

This is no different than Perl making multiple uses
of a select memory block, such as with lexical scoping.


Yes. However, only those processes I program to do so,
use the clipboard memory. Our webserver is not a machine
which is accessed and used by "users" which is only our
family. It is a stand-alone server dedicated solely as
a webserver.

Isn't it possible that 2 web-requests can come in at the same time
for the web-server, where 2 instances of your script try to
use the clipboard at the same time, using different parameters?
You must be using a locking mechanism on the clipboard,
perhaps a semaphore? Perhaps the webserver only services one
request at a time.
In other words, I make sure there are no conflicts
within my programs. Not so difficult. Really
not much different than use of file locking,
except my methods never fail.



You need to work on your imagination. :)

This will help you exercise your imagination.

Write a Perl program which will,

execute a Windows binary Websters Dictionary
make it an invisible background process
insert text into its text box using Control V
issue an ENTER command
retrieve results using Control Insert
close the Websters Dictionary binary
return data to your Perl program

You will encounter one very serious challenge.
Websters Dictionary cannot print to Standard Input;
it is a stand-alone binary executable which is
disassociated from your Perl program being a
Windows GUI based program. All returned data
is printed to your console via a typical
Visual Basic GUI window, a process completely
independent of Perl.

So, dazzle me with an imaginative Perl program
which can interface a Windows GUI binary as
I have described.

Why? I would rather look for some other package other than
a Windows GUI binary. I think its fine that you may want
to use the Clipboard for a project that is "for fun".
However, I would not want people encouraged to use
the clipboard in backend processing. It's a hack.

I would try to see if there is a way to extract the
data out of the program and put it into a relational database.
This would probably violate the license agreement for the
software package though. Although, exposing the dictionary
through a web-site may violate it as well.

I don't care what you use in your software, I just wanted
to issue a disclaimer to other people searching for solutions
out there on the net: do not use the clipboard as a means of
communication for backend processes.

I don't want to argue with you. I just want the point made to those
out there, viewing this thread.
 
Peter J. Acklam

Purl Gurl said:
I have played with a variety of locking mechanisms, including
semaphore and current running programs, but found neither are
needed for our low traffic from _legitimate_ clients.

What's a semaphore?

Peter
 
Peter J. Acklam

Purl Gurl said:
1: an apparatus for visual signaling (as by the position of one or
more movable arms)
2: a system of visual signaling by two flags held one in each hand.

Now, who's trolling...

I meant "semaphore" in this context, as if you didn't know that.

Peter
 
Tassilo v. Parseval

Also sprach Purl Gurl:
A semaphore lock is when you create a file then poll to determine
if that file exists or not.

if (!(-e "lockfile.lck"))
{
    open (LOCK, ">lockfile.lck");
    close (LOCK);

    open (DATAFILE, ">>data_write.log");
    print DATAFILE "This is my data";
    close (DATAFILE);

    unlink ("lockfile.lck");
}


That lockfile.lck is a semaphore lock. While it
exists, data_write is not allowed to be opened.

Inherently, there is a bit of a race condition.

A bit, eh? The above is the classical race condition as found in many
cargo-cult programs. Semaphores only work when accessing them happens
atomically.

Having said that, the above can easily be made secure and atomic by
using sysopen:

use Fcntl;
...

if (sysopen(my $f, "lockfile.lck", O_WRONLY|O_CREAT|O_EXCL)) {
    open my $data, ">>", "data_write.log" or die $!;
    print $data "...";
    close $data;
    close $f;
    unlink "lockfile.lck";
}

AFAIK, this is reasonably portable and also works on Windows.
More complex coding contains short sleeps so
the lockfile can be polled a number of times
while waiting on another program to finish.

Works very well but is not suggested for extremely
high volume usage, like dozens or more of programs
trying to access the same write file. It is possible
to win the race condition, under very heavy usage
circumstances.

Considering that the above is not even more to type, why not do it
properly in the first place?
Another problem is if your program crashes before
the unlink of lockfile, you are stuck. No access
can be had from that point forward in time.

That is the problem of most locking mechanisms, except when using
flock() which isn't readily available everywhere. A solution to those
stale lockfiles is to write the PID of the creator into the lockfile. If
a lockfile exists, the current process can read the PID and check
whether the process referenced by it is still alive.

This approach however has its own problems, in particular it requires to
lock the lockfile itself.

Tassilo
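The PID-in-lockfile scheme just described can be sketched as follows (a sketch, not a complete locking solution; the helper names and the liveness probe via kill(0, $pid) are my own choices, and it deliberately ignores the lock-the-lockfile problem Tassilo mentions):

```perl
use strict;
use warnings;

my $lockfile = "lockfile.lck";

# The creator records its own PID in the lockfile.
sub write_lock {
    open my $fh, ">", $lockfile or die "cannot create $lockfile: $!";
    print $fh $$;
    close $fh;
}

# A later process reads the PID and probes whether its owner is alive.
sub lock_is_stale {
    open my $fh, "<", $lockfile or return 0;   # no lockfile: nothing stale
    chomp(my $pid = <$fh> // "");
    close $fh;
    return 1 unless $pid =~ /^\d+$/;           # unreadable content: stale
    return kill(0, $pid) ? 0 : 1;              # signal 0 only probes; no signal sent
}

write_lock();
print lock_is_stale() ? "stale\n" : "held\n";  # our own PID is alive
unlink $lockfile;
```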
 
Tassilo v. Parseval

Also sprach Purl Gurl:
Cargo cult? You are displaying bigotry born of ignorance.

It is clear to me, personally, you do not have much
experience working with semaphore locking and are
parroting what others claim, without logical basis.

This race condition. Using an example of the entire process
of creating a semaphore, write to file, close, unlink,
takes a total of fifty milliseconds, which is slow, how
many concurrent processes must be running to win this
race condition, on the average?

Except that it might equally well take 2 seconds when the machine is
under heavier load.
Use of flock, as you know, exhibits a lot of problems. It is
actually no more reliable than is semaphore locking.

Well, I am sure you'd find a way to use flock() improperly so that it
yields results as crappy as your -e/open method. You cannot blame a
system call for your own inadequacies.
There are no reliable lock methods which are highly portable.

Each specific system needs to be addressed, which can be
done by programmers who are talented, open minded and
do not hold "cargo cult" bigotries, which are an earmark
of less-than-talented programmers.

As you know, in the past, I was the first Perl programmer
to introduce a safe reliable file lock mechanism for Win32
machines, which is very close to one-hundred percent reliable.

If it is only close to one-hundred percent, it's rubbish and therefore
belongs in the bin. Locking mechanisms have to be one-hundred percent
reliable. Period.

The fact that you deny that doesn't make your claims true. It only shows
that you never had any formal education in the field of computer
science.

Not that I would care a lot. It's the integrity of your files that you
put at stake, not mine.
File locking is a troublesome topic. All common methods
exhibit good points and bad points. A good programmer
will use a method best for current circumstances.

Quite. Your methods however are not "best" (or even acceptable) for any
circumstances.

Tassilo
 
Anno Siegel

Tassilo v. Parseval said:
Also sprach Purl Gurl:
[...]
Another problem is if your program crashes before
the unlink of lockfile, you are stuck. No access
can be had from that point forward in time.

That is the problem of most locking mechanisms, except when using
flock() which isn't readily available everywhere. A solution to those
stale lockfiles is to write the PID of the creator into the lockfile. If
a lockfile exists, the current process can read the PID and check
whether the process referenced by it is still alive.

This approach however has its own problems, in particular it requires to
lock the lockfile itself.

Another problem is pid re-use. To be sure that a process with a
certain pid is still the same process, it is necessary to check that
the system wasn't rebooted in the meantime. Storing the boot time
(if available) along with the pid is a possible solution.

Anno
 
John Bokma

But I am curious about the ~100% reliable method; maybe it can be
determined whether it is actually unreliable, or whether it can be made
reliable.
 
Bryan Castillo

No sense of humor.

A semaphore lock is when you create a file then poll to determine
if that file exists or not.

No, that is your example of a semaphore. There are other ways to
use a semaphore than file locking and polling.

I believe there is a module for Perl on Windows called Win32::Semaphore.
There are also the functions semctl, semget and semop, available to Perl
on most Unix operating systems.
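As a concrete contrast with lockfiles, here is a sketch of a real SysV semaphore using the core IPC::SysV/IPC::Semaphore modules (Unix only; the particular permissions and counts are illustrative):

```perl
use strict;
use warnings;
use IPC::SysV qw(IPC_PRIVATE IPC_CREAT S_IRUSR S_IWUSR);
use IPC::Semaphore;

# Create a fresh, unnamed semaphore set containing one semaphore.
my $sem = IPC::Semaphore->new(IPC_PRIVATE, 1, S_IRUSR | S_IWUSR | IPC_CREAT)
    or die "semget failed: $!";
$sem->setval(0, 1);      # one permit available

$sem->op(0, -1, 0);      # P: acquire atomically (blocks while count is 0)
# ... critical section: at most one holder at a time ...
$sem->op(0, 1, 0);       # V: release

my $count = $sem->getval(0);
$sem->remove;            # SysV semaphores outlive processes; remove explicitly
print "final count: $count\n";
```

Unlike the -e/open lockfile, the kernel performs the decrement-and-test as a single atomic operation, so there is no window in which two processes can both "win".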
 
John Bokma

Purl said:
A semaphore lock is when you create a file then poll to determine
if that file exists or not.

if (!(-e "lockfile.lck"))
{

*RACE CONDITION*
open (LOCK, ">lockfile.lck");
Inherently, there is a bit of a race condition.

a bit? HUGE. Don't use this code, ever.
More complex coding contains short sleeps so
the lockfile can be polled a number of times
while waiting on another program to finish.

*BAD CODE*
 
