Semantics of threads

Bo Lindbergh

perldoc perlthrtut says:
Thinking of mixing fork() and threads? Please lie down and wait until
the feeling passes. Be aware that the semantics of fork() vary between
platforms. For example, some UNIX systems copy all the current threads
into the child process, while others only copy the thread that called
fork(). You have been warned!

Does this mean that if more than one thread exists, the qx operator,
the system function, and the open function in pipe mode all have
undefined behaviour?


/Bo Lindbergh
 
Ben Morrow

Quoth Bo Lindbergh said:
perldoc perlthrtut says:

Does this mean that if more than one thread exists, the qx operator,
the system function, and the open function in pipe mode all have
undefined behaviour?

No. A Unix system destroys all threads but the main one when exec(2) is
called; on Windows qx/system/open '|-' use the win32 spawn (or maybe
CreateProcess? It comes to the same thing) function anyway.

Ben
 
xhoster

Bo Lindbergh said:
perldoc perlthrtut says:

Does this mean that if more than one thread exists, the qx operator,
the system function, and the open function in pipe mode all have
undefined behaviour?

Excluding bugs, no, the behaviour is not undefined. Those operations are
not implemented merely as naive forks. They are implemented in different
ways on different machines, and (attempt to) take care of such issues.

Xho
 
Bo Lindbergh

Excluding bugs, no, the behaviour is not undefined. Those operations are
not implemented merely as naive forks. They are implemented in different
ways on different machines, and (attempt to) take care of such issues.

Really? I looked at the source and found no attempts to block other
threads from running between the fork and the exec.


/Bo Lindbergh
 
xhoster

Bo Lindbergh said:
Really? I looked at the source and found no attempts to block other
threads from running between the fork and the exec.

It would really help to know what version of the source you looked at, and
the file(s) and line numbers. Anyway, I think there are some destruction
flags that need to be cleared/bypassed on a fork done within a thread to
avoid problems, which are inherently bypassed by doing an exec so in that
case perhaps no other special code is needed. The key (under unix) seems
to be to explicitly exit the forked child, and not let it run off the end
of the thread. But I wouldn't guarantee that to always work (and I
generally just avoid threads anyway on unix, so my experience is limited).


perl-5.8.8]$ perl -le 'use threads; async{warn "In thread"; sleep 2; if
(fork) { print "a"} else {}}->join(); print "b"; sleep 5'

In thread at -e line 1.
Unbalanced scopes: 3 more ENTERs than LEAVEs
Unbalanced saves: 8 more saves than restores
Unbalanced context: 2 more PUSHes than POPs
Segmentation fault (core dumped)

Adding an exit to the empty "else {}" solves this problem. However,
removing the "sleep 2;" also solves the problem, for reasons I don't
understand.

Xho
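
For reference, the "explicitly exit the forked child" pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it runs the fork from the main program rather than from inside a thread (so it works on a perl built without ithreads), and the variable names are made up for the example. It does not claim to reproduce or fix the crash shown in the transcript.

```perl
use strict;
use warnings;

# Fork a child; fork() returns undef on failure, 0 in the child,
# and the child's PID in the parent.
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: do whatever work is needed, then leave explicitly.
    # Without this exit the child would carry on executing the
    # parent's remaining code after the if-block.
    exit 0;
}

# Parent: reap the child and decode its exit status from $?.
waitpid($pid, 0);
my $status = $? >> 8;
print "child exited with status $status\n";
```

The point of the explicit exit is that the child never returns into code that belongs to the parent (or, in the threaded case, runs off the end of the thread body).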
 
Bo Lindbergh

It would really help to know what version of the source you looked at, and
the file(s) and line numbers

5.9.3, pp_sys.c, line 3950: start of pp_system.
line 3956: taint handling.
line 3966: output buffer flushing.
line 3975: fork attempts start here.
No sign of any thread-related stuff.


/Bo Lindbergh
 
xhoster

Bo Lindbergh said:
5.9.3, pp_sys.c, line 3950: start of pp_system.
line 3956: taint handling.
line 3966: output buffer flushing.
line 3975: fork attempts start here.

But only if the #ifdef is satisfied.

No sign of any thread-related stuff.

Actually there is, if you dig down into the guts of the
PerlProc_fork procedure (or the Perl_my_fork procedure, which is what
it seems to resolve to). For all I know, that stuff is just a no-op on
some systems, but it is present for when it is needed.

Anyway, if you compare pp_fork and pp_system, you will see that pp_system
does not simply call pp_fork, and looks quite different from it. Once you
resolve all the indirection, it may turn out that they do the same thing on
your OS, but if so that is only because that is what works best for your
system. Sometimes doing the right thing and doing the naive thing turn out
to be the same thing--the difference is in whether you know it is the right
thing in your case.

Xho
 
Ben Morrow

Quoth Bo Lindbergh said:
Really? I looked at the source and found no attempts to block other
threads from running between the fork and the exec.

I think you're suffering from the same misconception as I was: fork
doesn't duplicate all the threads in the process, only the calling
thread. You get a single new thread in the new process, which is why you
need pthread_atfork, to clear up any mess (mutexes) left by the threads
which no longer exist in this process.

Otherwise, yes, there would be a race condition between fork and exec,
where another thread could do something significant (IO, whatever) that
wouldn't be destroyed by the exec.

Ben
 
Bo Lindbergh

Ben Morrow said:
I think you're suffering from the same misconception as I was: fork
doesn't duplicate all the threads in the process, only the calling
thread.

So you claim that it's a documentation error: the "some UNIX systems
copy all the current threads into the child process" part should be
removed from perlthrtut?


/Bo Lindbergh
 
Ben Morrow

Quoth Bo Lindbergh said:
So you claim that it's a documentation error: the "some UNIX systems
copy all the current threads into the child process" part should be
removed from perlthrtut?

Err.... dunno. I know that's the case with pthreads as specced, but I
also know a lot of (mostly older) Unices don't follow (or predate) the
standard. I would be inclined to trust p5p to be more likely to get this
stuff right than me; and you *are* supposed to be able to call system
from a threaded Perl program, so I'd be seriously surprised if there's a
major problem with it that hasn't been spotted yet.

Ben
 
Bo Lindbergh

and you *are* supposed to be able to call system
from a threaded Perl program

.... but only from the main thread. Try this snippet:
{
use threads;

$ENV{FOOGLE}="main thread value";
async {
$ENV{FOOGLE}="new thread value";
system("printenv FOOGLE"); # outputs "main thread value"
}->join();
}

That is, you can't control the environment of any program you start
from a non-main thread. There are already the "system PROGRAM LIST"
and "system LIST" variations, so there's no reasonable syntax for
an explicit environment parameter either.


/Bo Lindbergh
 
Ben Morrow

Quoth Bo Lindbergh said:
... but only from the main thread. Try this snippet:
{
use threads;

$ENV{FOOGLE}="main thread value";
async {
$ENV{FOOGLE}="new thread value";
system("printenv FOOGLE"); # outputs "main thread value"
}->join();
}

That is, you can't control the environment of any program you start
from a non-main thread. There are already the "system PROGRAM LIST"
and "system LIST" variations, so there's no reasonable syntax for
an explicit environment parameter either.

Well, a process only has one environment, so assignments to %ENV in
subthreads aren't going to do what you think. This is not really a
problem with system per se, more with %ENV. I think there would be a
good case for saying %ENV should be process-global, with all the races
that implies: after all, that's how the environment actually works.

Ben
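
One possible way around the %ENV issue discussed above is to hand the variables to the child on its command line through env(1) instead of assigning to %ENV. The sketch below assumes a Unix-like system with env and printenv on the PATH; run_with_env is a hypothetical helper written for this example, not part of Perl or of any module. Since it never writes to %ENV, it should behave the same whichever thread calls it, though that has not been tested here under threads.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command with extra environment variables
# supplied via env(1), leaving %ENV in the caller untouched.
# Variable names must not contain "=".
sub run_with_env {
    my ($env, @cmd) = @_;

    # Build NAME=value arguments for env(1).
    my @assignments = map { "$_=$env->{$_}" } sort keys %$env;

    # List-form pipe open: no shell is involved, so values need no quoting.
    open(my $fh, '-|', 'env', @assignments, @cmd)
        or die "cannot run env: $!";
    my $output = do { local $/; <$fh> };   # slurp everything the child prints
    close($fh);

    return $output;
}

my $value = run_with_env({ FOOGLE => 'new thread value' },
                         'printenv', 'FOOGLE');
print $value;
```

The child still inherits the rest of the process environment; only the listed variables are added or overridden for that one command.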
 
