pid from started process

carlo.maier

hi,

i am starting an executable on hp-ux (ActivePerl Build 817) from a
perl script and need to know the child's pid.

my $cmd="myexecutable -P parmfile ";

my $pid=open(H, "${cmd} |");
print "\ttest: $_" while (<H>);
close(H);

The executable starts, but the pid i am getting is not right. The
executable writes a logfile with the pid as part of the logfile name.
After the executable returns, i want to check the logfile:

my $logfile="${logdir}/program_${pid}.log";
open(H,"< $logfile") or die "Can't open logfile '${logfile}': ($!)\n";
while (<H>) {
print if (/ENTRY/);
}
close (H);

Because the pid that i am trying to catch with the first open statement
is not correct, i can't find the logfile to check it.

What kind of reasons can be imagined, that the method
"$pid=open(H, "${cmd} |");" doesn't supply the correct pid?
Does anybody know about a method that gives me the correct pid?

Thanks in advance,
carlo
 
Mumia W.

[...]
What kind of reasons can be imagined, that the method
"$pid=open(H, "${cmd} |");" doesn't supply the correct pid?
Does anybody know about a method that gives me the correct pid?

Thanks in advance,
carlo

Perhaps "myexecutable" is a script that starts the real executable.

Although it's just guessing, the real PID might be $pid+1.
 
xhoster

hi,

i am starting an executable on hp-ux (ActivePerl Build 817) from a
perl script and need to know the child's pid.

my $cmd="myexecutable -P parmfile ";

my $pid=open(H, "${cmd} |");
print "\ttest: $_" while (<H>);
close(H);

The executable starts, but the pid i am getting is not right.

I don't see that problem.

$ perl -le 'print open my $fh, q{perl -le "print \$$;" |}; print <$fh>'
11050
11050

Maybe your $cmd isn't doing what you think it is doing.

Also, you might want to try using the 3-or-more argument form
of open:

open my $fh, "-|", "$cmd1", @cmd_args;
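
A minimal sketch of that list form, assuming a Unix-like system. As far as the perlfunc documentation goes, a piped open with more than three arguments runs the command directly instead of handing a string to the shell, so the value open returns should be the PID of the program itself. Here `$^X` (the perl running the script) is only a stand-in for "myexecutable"; the child simply reports its own PID.

```perl
use strict;
use warnings;

# $^X (the perl binary running this script) stands in for "myexecutable".
# The child just prints its own PID on stdout.
my @cmd = ($^X, '-le', 'print $$');

# List form of a piped open: the command is run directly, without a
# shell in between, so the return value is the PID of the command.
my $pid = open(my $fh, '-|', @cmd)
    or die "Can't start '@cmd': $!\n";

chomp(my $child_pid = <$fh>);
close($fh) or warn "child exited with status $?\n";

print "open returned $pid, child reported $child_pid\n";
```

If the two numbers differ with the real program, something other than perl (a wrapper script, for instance) is creating the extra process.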

Xho
 
comp.llang.perl.moderated

...
What kind of reasons can be imagined, that the method
"$pid=open(H, "${cmd} |");" doesn't supply the correct pid?
Does anybody know about a method that gives me the correct pid?

If 'myexecutable' starts another script as
suggested and you're able to tinker with
myexecutable, launching the secondary script
via 'exec' will preserve the $pid.

hth,
 
Douglas Wells

I don't see that problem.

$ perl -le 'print open my $fh, q{perl -le "print \$$;" |}; print <$fh>'
11050
11050

You don't see the problem because you're not doing the same thing
as the original poster.

The original poster's basic problem is that this form of open
(using '|') almost certainly invokes the system shell in order to
execute the command. That shell then creates a new process to
execute the command named in the $cmd parameter. The Perl open
function returns the PID of the shell, but the child (and the
relevant program) will have a different PID.

(And in response to another comment in this thread, the PID of
child will almost certainly not be the shell's PID plus 1. Any
system that does that presents a major security risk and should
be trashed immediately -- because PIDs would then be guessable.)

What you have done is to present to the shell a command (print)
that it can execute internally. Thus, it does not need to create
a child to execute the command. Even if that weren't the case,
you have used a construct ($$) that gets expanded at the shell
level before the subprocess is invoked. (Note you have actually
quoted that construct in several ways, but you have only protected
it against expansion in the shell that invoked perl and within
perl itself. To protect it against the shell that is created by
the open function would require several more backslashes and lots
of experimentation.)

Maybe your $cmd isn't doing what you think it is doing.

Also, you might want to try using the 3-or-more argument form
of open:

open my $fh, "-|", "$cmd1", @cmd_args;

That will exhibit the same problem.

To the OP: What you are trying to do is rather difficult and can
be system-dependent. There are several approaches that are often
used:

- Many subsystems avoid this situation by allowing you to specify
the name of the logfile as a parameter, either in the command
line or in the parameter file.

- You might be able to avoid the creation of the shell by using
the perl fork and exec built-in functions. This will require
knowledge about the underlying POSIX functions.

- You might perform some shell trickery whereby the newly
created shell would create the child asynchronously, store the
PID someplace that you could read it (e.g., in a file or as
the first returned line in the output). That would require
a command something like "($cmd & echo PID=$! ; wait) |" -- but
that would require a higher comfort level with shell programming.
(And don't forget about the need for quoting within Perl.)

- This is a fairly common problem. Someone may have actually
written a package for this, but I am not familiar with any
(having normally avoided the problem via the above solutions).
It would probably be profitable for you to do some searching.
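
The fork and exec approach from the list above can be sketched roughly as follows. This is only an illustration, assuming a Unix-like system: `$^X` stands in for "myexecutable", and the forking form of open (`'-|'` with no command) is used so that the parent still gets the program's output on a pipe.

```perl
use strict;
use warnings;

# Hypothetical command line; $^X is only a stand-in for "myexecutable".
my @cmd = ($^X, '-le', 'print $$');

# open with '-|' and no command forks: the parent receives the child's
# PID plus a handle on the child's STDOUT, the child receives 0.
my $pid = open(my $fh, '-|');
die "Can't fork: $!\n" unless defined $pid;

if ($pid == 0) {
    # Child: replace this process with the program. exec keeps the PID,
    # so the program runs under the PID the parent was given.
    exec { $cmd[0] } @cmd or die "Can't exec '$cmd[0]': $!\n";
}

# Parent: collect the program's output, then reap the child.
my @output = <$fh>;
close($fh);

print "child pid is $pid\n";
print "\ttest: $_" for @output;
```

Because no shell sits between the fork and the exec, `$pid` can then be used to build the logfile name.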

Good luck,

- dmw
 
anno4000

(And in response to another comment in this thread, the PID of
child will almost certainly not be the shell's PID plus 1. Any
system that does that presents a major security risk and should
be trashed immediately -- because PIDs would then be guessable.)

It's the other way around. Anything that relies on the non-predict-
ability of PIDs is insecure, unless the system guarantees that property
in a quantifiable way. I've never heard of that.

Many Unix systems assign PIDs sequentially (usually skipping those
that have been used recently, in some sense).

Anno
 
anno4000

What kind of reasons can be imagined, that the method
"$pid=open(H, "${cmd} |");" doesn't supply the correct pid?
Does anybody know about a method that gives me the correct pid?

The problem (or one compounding problem) seems to be that Perl may or
may not use a shell to run the external program, depending on whether the
string contains shell metacharacters. It may depend on other factors
too.

As has been suggested in the thread, exec() could be helpful here. For
instance, *if* it's a perl-interpolated shell that is the problem, you
can change the command as in $cmd = "exec $cmd". Then your command
will run under the shell's PID.
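
A small sketch of the "exec $cmd" idea, assuming a POSIX-style /bin/sh. The command here is a stand-in (`$^X` printing its own PID); the quotes in it are shell metacharacters, so perl hands the whole string to the shell, and the leading `exec` makes the shell replace itself with the program instead of starting it as a separate process.

```perl
use strict;
use warnings;

# Stand-in for "myexecutable -P parmfile": a perl that prints its PID.
# The single quotes protect $$ from the shell; \$\$ protects it here.
my $cmd = qq{$^X -le 'print \$\$'};

# With the "exec" prefix the shell started by open is replaced by the
# program, so the program runs under the PID that open returns.
my $pid = open(my $fh, "exec $cmd |")
    or die "Can't start '$cmd': $!\n";

chomp(my $reported = <$fh>);
close($fh);

print "open returned $pid, program reported $reported\n";
```

This only helps if the extra process really is the shell that perl started; it does nothing about a wrapper script inside "myexecutable" itself.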

Anno
 
xhoster

You don't see the problem because you're not doing the same thing
as the original poster.

Obviously I'm not doing exactly the same thing as he is, because I don't
have access to his program "myexecutable -P parmfile". However, I am doing
the same thing as far as Perl is concerned.

The original poster's basic problem is that this form of open
(using '|') almost certainly invokes the system shell in order to
execute the command.

Right. As does my example.

And the shell executes the command. Specifically, it "exec"s the command,
which means it doesn't change the pid. That is what my example shows, too.

(And in response to another comment in this thread, the PID of
child will almost certainly not be the shell's PID plus 1.

I am far from certain of that. In fact, in 2 out of 3 OS I tested it on,
after arranging for it to fork rather than exec so the pid changes, it was
in fact the old pid plus 1.

Any
system that does that presents a major security risk and should
be trashed immediately -- because PIDs would then be guessable.)

Sorry, but I have no intention of trashing my systems.

What you have done is to present to the shell a command (print)
that it can execute internally.

No, I presented the shell with a one line Perl script. The "print" is a
Perl print, not a shell print.

Thus, it does not need to create
a child to execute the command.

Whether or not it *needs* to do so, that is what it actually does; strace
verifies this.

Even if that weren't the case,
you have used a construct ($$) that gets expanded at the shell
level before the subprocess is invoked.

I back-whacked the first $ for a reason. And strace verifies that
the expansion does not occur in either of the shells, but is passed to the
inner Perl as intended.

(Note you have actually
quoted that construct in several ways, but you have only protected
it against expansion in the shell that invoked perl and within
perl itself. To protect it against the shell that is created by
the open function would require several more backslashes and lots
of experimentation.)

I did the experimentation. The escaping I supplied is appropriate.

That will exhibit the same problem.

That would depend on what the actual problem is.

Xho
 
Douglas Wells

It's the other way around. Anything that relies on the non-predict-
ability of PIDs is insecure, unless the system guarantees that property
in a quantifiable way. I've never heard of that.

Many Unix systems assign PIDs sequentially (usually skipping those
that have been used recently, in some sense).

Anno

You're partially right. Going back and checking I find that indeed
a number of, perhaps even most, POSIX/Linux systems do assign PIDs
sequentially. So, I was wrong about that.

But, it is still a potential security vulnerability, and high
integrity systems don't do that. Let me explain (taking some
liberties with terminology for purposes of simplification).

Let's look at three levels of security-awareness in an OS environment.

Basic Security: It's possible for a highly knowledgeable engineer
to create an application that does not disclose information to
unauthorized subjects.
Example feature: Before the creation of the O_EXCL flag
to open(2) (originally creat(2)), it was almost impossible
to securely create a file in a shared directory.

General Security: The standard system tools support the average
programmer in the creation of secure applications.
Example feature: The invention of the mkstemp(3) subroutine
made it convenient to securely create a temporary file. It was
possible to properly use the earlier mktemp(3) subroutine, but
many programmers got it wrong.

Proactive Security: The system provides mechanisms that protect
against some design errors in applications.
Example feature: The removal of execute access from the stack.
This is not strictly necessary, and even disallows some
desirable features, but it protects against a common programming
error (buffer overflow).

Now let's look at guessable PIDs. Note that some shell scripts
(and even some programs) have a construct similar to:

TMPFILE=/tmp/mycmd.$$
(echo cmd1 ; echo cmd2 ) > $TMPFILE
sh $TMPFILE

If I find that a privileged process is running such a shell script,
I can precreate the file (by guessing the PID that the shell will
soon be using(*)). Once I have the privileged process executing
commands from a file that I own or have write privileges on, I can
easily compromise that level of privilege. While this particular
example is fairly blatant, numerous less obvious constructs are
equally vulnerable.

That's why guessable PIDs are a security vulnerability. And I'll
note that your statement that "Anything that relies on the
non-predictability of PIDs is insecure ..." is also true. The two
concepts are not contradictory once you accept that secureness is
not a binary attribute.

There's actually quite a bit of history of exploits using techniques
similar to this. For example, the mkstemp subroutine was added
to UNIX systems after a number of exploits of mktemp became known.
I won't get into lots of details here as we're rapidly getting off
topic -- although Perl scripts are subject to the same vulnerabilities
as the shell.

- dmw

(*) Actually I would precreate a lot of files to increase my chances
of guessing the right one. For example (in pseudo-code),
for PIDnum = getpid () to getpid () + 100
create /tmp/mycmd.PIDnum
 
anno4000

Douglas Wells said:
You're partially right. Going back and checking I find that indeed
a number, perhaps even most, POSIX/Linux systems do assign PIDs
sequentially. So, I was wrong about that.

But, it is still a potential security vulnerability, and high
integrity systems don't do that. Let me explain (taking some
liberties with terminology for purposes of simplification).

Let's look at three levels of security-awareness in an OS environment.

[well known facts and concepts snipped]
Now lets look at guessable PIDs. Note that some shell scripts
(and even some programs) have a construct similar to:

TMPFILE=/tmp/mycmd.$$
(echo cmd1 ; echo cmd2 ) > $TMPFILE
sh $TMPFILE

Yes, that's widely known to be vulnerable. Using it means to
rely on the un-guessability of PIDs, so it's out when security
is a concern. There are ways to create temp files safely.
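
One such way, sketched in Perl under the assumption of a POSIX-like system: create the file with O_CREAT|O_EXCL, so that creation fails if anything (including a symlink planted by someone else) already exists under that name. The file name is hypothetical and keeps the PID only as a label, not as a secret.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Spec;

# Hypothetical file name; the PID is used only as a label here.
my $path = File::Spec->catfile(File::Spec->tmpdir, "mycmd.$$");

# O_CREAT|O_EXCL: fail rather than reuse a file somebody else created.
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0600)
    or die "Refusing to use '$path': $!\n";

print {$fh} "cmd1\ncmd2\n";
close($fh) or die "Can't close '$path': $!\n";

# A second exclusive create of the same name must fail.
my $second = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL, 0600);
print "second create ", ($second ? "succeeded (unexpected)" : "failed: $!"), "\n";

unlink $path;
```

A real script would pick a new name and retry on failure instead of dying.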

Code that uses this construct must be considered unsafe. It doesn't
become safe by stopping this particular hole by changing the PID
assignment strategy. I'm not saying that PID randomization is a
bad thing. To rely on it is.

Anno
 
Peter J. Holzer

(And in response to another comment in this thread, the PID of
child will almost certainly not be the shell's PID plus 1. Any
system that does that presents a major security risk and should
be trashed immediately -- because PIDs would then be guessable.)

PIDs are guessable on the majority of unix systems (the only exceptions
I know are some BSD variants) - so this is something the average unix
programmer expects.

OTOH, a random pid is something the average unix programmer does not
expect - which may lead to a different class of (possibly
security-critical) errors.

hp
 
Douglas Wells

PIDs are guessable on the majority of unix systems (the only exceptions
I know are some BSD variants) - so this is something the average unix
programmer expects.

Yes, elsewhere in this thread, I acknowledged that I made a wrong
declaration about "most" POSIX-like systems: Many of them do
generate new candidates for PIDs by incrementing a counter. I
have not, however, yielded on the claim of a security threat posed
by this algorithm. Instead, I supplied a scenario, based on
historical security incidents, that posed a threat in the presence
of this algorithm.

OTOH, a random pid is something the average unix programmer does not
expect - which may lead to a different class of (possibly
security-critical) errors.

I'm sorry, but I find that (the security-critical error possibility)
preposterous. Can you provide a scenario that both leads to correct
behavior in the presence of the incrementing PID algorithm and to
the presence of a security threat in its absence?

In fact, I'd like to hear of a scenario that leads to non-probabilistic
correct behavior in the presence of an application-level program
that predicts future PIDs based on any algorithm. Can you provide
one?

I can imagine a frazzled human debugging a multi-process program,
operating in a benign environment, and choosing to attach to the
"next" PID based on a mental calculation. But, I fail to
understand how that could work reasonably in a deployed system.

It appears to me any user-level algorithm for predicting the "next"
PID, such as adding one to the PID of the current process, faces
several difficulties:

- The system could well have created a non-related process with
the "next" PID in the meanwhile.

- The current sequence may have reached the maximum PID value and
wrapped around. Neither the maximum value nor the initial value
after wrap-around is specified by the POSIX standard (nor by
the Linux standard, which seems to defer to POSIX in this
instance).

- Even if neither of those events has occurred, the "next" PID
might not be usable due to the existence of a long-lived process
with that ID, and the standard's requirement that PIDs be
temporally unique.

- Even if there isn't an existing process with the same PID,
that PID value might not be usable due to the standard's
"Process ID Reuse" prohibition on reuse of process group ids.

Can you offer a user-level algorithm that alleviates the effect
of those problems?

Also, do you actually know of any interesting applications that
would significantly benefit by knowing the PID of a process that
has yet to be created?

- dmw
 
Josef Moellers

Peter said:
OTOH, a random pid is something the average unix programmer does not
expect - which may lead to a different class of (possibly
security-critical) errors.

Sorry, but I have never seen a piece of (*x-) software that relies on
some relation between PIDs. It will break as soon as you have a wrap-around.

Getting your own PID is a very simple call, getting your child's pid a
result of the fork() system call. I don't see how you can put any
constraint on one or the other to make it appear non-random!

Josef
 
Peter J. Holzer

Yes, elsewhere in this thread, I acknowledged that I made a wrong
declaration about "most" POSIX-like system: Many of them do
generate new candidates for PIDs by incrementing a counter. I
have not, however, yielded on the claim of a security threat posed
by this algorithm. Instead, I supplied a scenario, based on
historical security incidents, that posed a threat in the presence
of this algorithm.

Yes. If you had yielded that claim I wouldn't have had any reason to
answer. But since you haven't, I had to object.

The pid has traditionally always been a simple wrap-around counter. Any
unix programmer should know this. Using it in a context where a random
number (much less a cryptographically strong random number) is required
is just using the wrong tool for the job. Such an error may lead to a
security problem, but that's the fault of the programmer, not the tool.
(In your scenario, the real error is probably not using O_EXCL, btw, not
using the pid, but that depends on the intended use).

I'm sorry, but I find that (the security-critical error possibility)
preposterous. Can you provide a scenario that both leads to correct
behavior in the presence of the incrementing PID algorithm and to
the presence of a security threat in its absence?

Yes. A linearly incrementing pid has some minimum time between the start
time of two processes with the same pid. Typically, about 30000 forks
are needed before the same pid can be reused.

A programmer who knows this may assume that the start time of the
process with suitably high resolution (for example 1 millisecond - 1
second is already too grainy given current computer speeds) together
with the pid is always unique: The system would need to be able to fork
30000 processes within 1 ms, which is far beyond the capabilities of
current systems and will stay impossible for some time.

A randomized pid breaks this assumption: The pid may be (with some small
probability, but nonetheless) reused immediately. Thus it is possible
that two processes which are started in the same millisecond get the
same "unique" id. What happens then depends on the application: Maybe
nothing, maybe some data gets mixed up, maybe some data vanishes ...

I know a number of applications which didn't work correctly on BSD
systems after randomized pids were introduced. The results were usually
lost data or data leaked to a different user, so that was at least
potentially security-critical.

Personally, I think it's a good thing that these applications were
broken, because it alerted the maintainers that the assumption
"timestamp + pid is unique" was faulty way before it became faulty on
systems on which it may have been possible to systematically exploit the
bug (IIRC all these applications used a one-second timestamp which is
now getting too short).

I'm not saying that random pids are bad /per se/. But the average unix
programmer probably doesn't know that they exist so he cannot consider
their consequences.

(As an aside: Is the randomness of the pids on BSD systems
cryptographically strong? If not, a programmer might assume they are and
make the same error as a programmer who thinks that linearly incremented
pids are "non-predictable". If they are, how about other systems with
random pids?)


[rest of posting snipped, as it was completely beside the point I was
trying to make]

hp
 
Peter J. Holzer

Sorry, but I have never seen a piece of (*x-) software that relies on
some relation between PIDs.

See for example the original definition of the maildir format. Any
software which still uses that format is broken on systems which can
reuse the same pid within one second.

It will break as soon as you have a wrap-around.

No. That is expected and handled. The relation which the application
relies on is "two processes which are alive within the same second
cannot have the same pid".

This assumption is of course faulty anyway (there may already be systems
which can fork 30000 times per second, and it may be possible to delay a
process before generating the timestamp on slower systems), but the fact
is that the first systems where this problem was noticed (with real lost
mail, not just as a theoretical possibility) were systems with
randomized pids - because it is many orders of magnitude more likely to
happen on such systems.

hp
 
Josef Moellers

Peter said:
See for example the original definition of the maildir format. Any
software which still uses that format is broken on systems which can
reuse the same pid within one second.

OK, I see. So it's not really the fact that PIDs are supposed to be
monotonically increasing (apart from one place), but the fact that, when
using random PIDs, a PID can be re-used before all the other PIDs were
(re-)used. Indeed, with random PIDs, it would not be impossible for a
process to get the PID of a process which has just died.
OTOH it would work if one could guarantee that a recently deceased
process' PID will be the last PID (re-)used of all available PIDs at
that moment, e.g. by putting PIDs into a FIFO.

I stand corrected,

Josef
 
Douglas Wells

Yes. If you had yielded that claim I wouldn't have had any reason to
answer. But since you haven't, I had to object.

And I still don't. You are describing a situation where an
application programmer makes an erroneous assumption, and that
assumption results in a security problem.

The pid has traditionally always been a simple wrap-around counter. Any
unix programmer should know this.

That depends on your definition of "traditionally." Yes, the
standard code base from AT&T did this, and many of the derivative
systems also did this. However, that is irrelevant, as it was not
guaranteed by either the system specification or a published
standard. (And, I know of systems at least as far back as 1982 that
used non-sequential PIDs.). A programmer may know that this is the
way that things often work, but he/she shouldn't depend on something
that isn't *documented* as a permanent characteristic of the
system. And, if you are targeting a portable platform, such as
POSIX, you need to be aware of possible differences across
implementations of that platform.

Using it in a context where a random
number (much less a cryptographically strong random number) is required
is just using the wrong tool for the job. Such an error may lead to a
security problem, but that's the fault of the programmer, not the tool.
(In your scenario, the real error is probably not using O_EXCL, btw, not
using the pid, but that depends on the intended use).

Yes, *one* underlying error was not using O_EXCL -- as I implied
by my reference to O_EXCL in that post.

As decoded in a parallel sub-thread, it would appear that a real
problem underlying your concern was a dependence on a time delay
before reuse of PIDs. That characteristic, of course, is not
currently a guarantee of the POSIX/UNIX standard, and I can find
no instances of any system documentation promising that.

The question we have before us seems to be how to attribute the
blame of being a security flaw. Consider the case of using an
uninitialized variable in a program. It can be the case that that
variable happens to inherit a value that causes the program
to behave reasonably. As time elapses that accidental value can
change for any number of reasons: the level of optimization changes,
the compiler changes, other code in the program changes and happens
to leave a different value in that location, or the program is
moved to a different platform.

When that program later misbehaves, leaving a loss of data (and a
security problem using your definition), one shouldn't point to
the new compiler or the different platform as the security problem.
The security flaw is the improper coding in the original program.

In the case of the example that you provide in a parallel sub-thread,
the basic security flaw is clearly at the hands of the designers
of the maildir format (assuming your description is correct, which
I don't doubt). They made an assumption that wasn't true across
the targeted platforms.

Security experts, for both computers and otherwise, note that
vulnerability is the product of the existence of security
vulnerabilities and the presence of security threats. In this
case, the introduction of a system with randomized PIDs increased
the level of security threat, but that doesn't mean that the use
of randomized PIDs is itself a security threat, which is an
implication, but not an explicit claim, of your earlier message to
which I was reacting.

So, with that said, let me react to the rest of your message:
Yes. A linearly incrementing pid has some minimum time between the start
time of two processes with the same pid. Typically, about 30000 forks
are needed before the same pid can be reused.

A programmer which knows this may assume that the start time of the
process with suitably high resolution (for example 1 millisecond - 1
second is already too grainy given current computer speeds) together
with the pid is always unique: The system would need to be able to fork
30000 processes within 1 ms, which is far beyond the capabilities of
current systems and will stay impossible for some time.

Right, and that programmer is creating a bug by depending on a
characteristic that is not guaranteed by any spec.

I know a number of applications which didn't work correctly on BSD
systems after randomized pids were introduced. The results were usually
lost data or data leaked to a different user, so that was at least
potentially security-critical.

Agreed, and the security problem was in the application. The
introduction of the use of the "BSD" system (I'm presuming that
this was likely OpenBSD) increased the security threat by altering
more characteristics of the platform.

Personally, I think it's a good thing that these applications were
broken, because it alerted the maintainers that the assumption
"timestamp + pid is unique" was faulty way before it became faulty on
systems on which it may have been possible to systematically exploit the
bug (IIRC all these applications used a one-second timestamp which is
now getting too short).

I'm not saying that random pids are bad /per se/. But the average unix
programmer probably doesn't know that they exist so he cannot consider
their consequences.

OK, and the way to deal with that is to teach those programmers
how to write portable applications by not depending on accidental
characteristics.

(As an aside: Is the randomness of the pids on BSD systems
cryptographically strong? If not, a programmer might assume they are and
make the same error as a programmer who thinks that linearly incremented
pids are "non-predictable". If they are, how about other systems with
random pids?)

I don't know about those particular systems, but often such algorithms
are pluggable and can be replaced depending upon the security
requirements of the deployed environment. Given that the rate of
production of PIDs is slower, and the visibility of PIDs is lower
than communication data, the tolerable level of strength would be
lower.

- dmw
 
Peter J. Holzer

[I'm sorry, this posting grew too long and quite repetitive. But it's
rather late and I'm too lazy now to trim it down. So bear with me or
feel free to skip it]

And I still don't. You are describing a situation where an
application programmer makes an erroneous assumption, and that
assumption results in a security problem.

Same as you did. The assumption that a pid is unguessable is just as
erroneous, and it is very simple to refute on almost any Unix system. So
it is a problem that the average Unix programmer should be aware of and
he can take appropriate measures (which he still has to do even if his
system uses randomized pids: a) because he cannot rely on this feature
and b) because an attacker may still guess the pid because of its
limited range (typically 30k)). So I stand by my claim that sequentially
allocated PIDs are not a "major security threat", and I would even guess
that randomized pids add very little security. (The grsecurity people
seem to agree - randomized PIDs were removed again)

That depends on your definition of "traditionally." Yes, the
standard code base from AT&T did this, and many of the derivative
systems also did this. However, that is irrelevant, as it was not
guaranteed by either the system specification or a published
standard.

You are right that the standard doesn't specify this, and the printed
docs I still have for PC/IX (which are probably quite close to those of
Version 7 Unix) don't specify it either.

However, as you say, AT&T Unix did it this way and the vast majority of
derivatives and clones followed this example. This behaviour was also
documented in text books over and over (for example in Banahan and
Rutter: UNIX - the Book (1982), which happens to be the book I learned
Unix from together with the aforementioned PC/IX manpages 20 years ago).
So if a programmer finds some behaviour documented and all systems he
has access to behave that way, why should he suspect that his books are
wrong? (Because all textbooks contain errors and should not be used as
reference, you say, and you are of course entirely correct)

(And, I know of systems at least as far back as 1982 that
used non-sequential PIDs.).

I don't doubt this, although the first system I was aware of using that
feature was OpenBSD (since at least 1999). I do think that these systems
were rather obscure, though, and most unix programmers were not aware of
their existence.

A programmer may know that this is the way that things often work, but
he/she shouldn't depend on something that isn't *documented* as a
permanent characteristic of the system.

Unfortunately, while it certainly wasn't documented in POSIX and
probably wasn't documented in any specific system documentations
(although it is quite possible that some unixes did document their pid
generation scheme), it *was* "documented" in lots of secondary
literature, and programmers would pick up the wrong assumption from
there.

Yes, *one* underlying error was not using O_EXCL -- as I implied
by my reference to O_EXCL in that post.

As decoded in a parallel sub-thread, it would appear that a real
problem underlying your concern was a dependence on a time delay
before reuse of PIDs.

My "concern" is not that randomized pids are somehow worse than
sequential pids. My "concern" is your claim that sequential pids are a
major security threat and that all systems using it should be trashed.

I am pointing out that unix programmers know that pids can be guessed,
so this is not a major security threat.

I am further pointing out that unix programmers very often do not know
that sequential pids are not universal. So the slight advantage of
having pids which are slightly harder to guess is offset by behaviour
which is not expected by the programmer. Of course it's the programmer's
fault if he assumes that pids are assigned sequentially. But it is also
his fault if he creates a file in /tmp without using O_EXCL and checking
the return value. I fail to see why one should be a "major security
threat of the system" and the other the programmer's fault. Failure to
deal correctly with sequential PIDs is just as much the programmer's
fault as failure to deal with randomized PIDs. In fact one could argue
that the former is more the programmer's fault, since it is the usual
case.

OK, and the way to deal with that is to teach those programmers
how to write portable applications by not depending on accidental
characteristics.

Right. Just as we have to teach programmers to use File::Temp for
temporary files (or use lower level functions properly). The
non-existence of "/tmp/$progname.$$" is an accidental characteristic
which a programmer must not rely on. Not for portable applications and
not even for non-portable ones.
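
A minimal sketch of the File::Temp approach, for illustration only: the
template name 'myprog_XXXXXXXX' and the scratch text are assumptions, not
anything from the original script. tempfile() chooses the name itself and
creates the file exclusively, so nothing depends on $$ being unused or
unguessable.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# tempfile() picks an unpredictable name and creates the file with O_EXCL,
# so a pre-existing file or symlink of that name is never reused.
# 'myprog_XXXXXXXX' is a hypothetical template; the X's are replaced.
my ($fh, $filename) = tempfile('myprog_XXXXXXXX', TMPDIR => 1, UNLINK => 1);

print {$fh} "scratch data\n";
close($fh) or die "Can't close '$filename': $!\n";

print "wrote ", (-s $filename), " bytes to a temp file\n";
```

UNLINK => 1 asks File::Temp to remove the file when the program exits.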

I don't know about those particular systems, but often such algorithms
are pluggable and can be replaced depending upon the security
requirements of the deployed environment. Given that the rate of
production of PIDs is slower, and the visibility of PIDs is lower
than communication data, the tolerable level of strength would be
lower.

My point here is that a programmer cannot rely on PIDs being
unguessable. A program which relies on unguessable PIDs is just as buggy
on a system with random PIDs as it is on a system with sequential PIDs.
A programmer may be lured into a false sense of security by apparently
random pids which aren't. Even if the pids on his system are strong
(let's for a moment assume that the system uses 64-bit numbers from a
sufficiently large entropy pool), the program may be ported to another
system which maybe uses only 15-bit numbers from a 32-bit linear
congruential PRNG, or, worse, to a system with sequential PIDs.

hp
 
D

Douglas Wells

[ ... ] The assumption that a pid is unguessable is just as
erroneous, and it is very simple to refute on almost any Unix system. So
it is a problem that the average Unix programmer should be aware of and
he can take appropriate measures (which he still has to do even if his
system uses randomized pids: a) because he cannot rely on this feature
and b) because an attacker may still guess the pid because of its
limited range (typically 30k)). So I stand by my claim that sequentially
allocated PIDs are not a "major security threat", and I would even guess
that randomized pids add very little security. (The grsecurity people
seem to agree - randomized PIDs were removed again)

Ahh, I see that you were reacting to the words "major security
threat." I don't see where you made that point explicitly in the
earlier messages, and I didn't understand that it was one of your
contention points. So, let me back off on the "major" adjective.
I over-reacted to the earlier proposal that a programmer should
depend on the PID generation algorithm of a particular system.
You are right that the standard doesn't specify this, and the printed
docs I still have for PC/IX (which are probably quite close to those of
Version 7 Unix) don't specify it either.

However, as you say, AT&T Unix did it this way and the vast majority of
derivatives and clones followed this example. This behaviour was also
documented in text books over and over (for example in Banahan and
Rutter: UNIX - the Book (1982), which happens to be the book I learned
Unix from together with the aforementioned PC/IX manpages 20 years ago).
So if a programmer finds some behaviour documented and all systems he
has access to behave that way, why should he suspect that his books are
wrong? (Because all textbooks contain errors and should not be used as
reference, you say, and you are of course entirely correct)

OK. I think that we are in agreement on that. It seems to be a
characteristic of many humans that it's easiest to learn from an
example. Unfortunately, it also seems to be a trait that we make
assumptions and improperly extrapolate the specific to the general.

I have found 4 books in my library that explain the creation of
PIDs (Lyons; Leffler, McKusick, et al.; Bach; and Tanenbaum). All
explain the use of an incrementing counter; none provide any
assurance that the algorithm is guaranteed. However, I can
understand how someone relatively new to UNIX would assume that
this was the only way. (I'll also note that there are differences in
the algorithms and none quite matches the use in current systems
that I am familiar with.)
My "concern" is not that randomized pids are somehow worse than
sequential pids. My "concern" is your claim that sequential pids are a
major security threat and that all systems using it should be trashed.

I have backed off on the particular claim of a "major threat." I
also noted in my earlier explanation that programs were not vulnerable
if they properly used O_EXCL in system calls, "no clobber" in the
shell, and mkstemp in programs.
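
As a rough sketch of the O_EXCL point in Perl (the file name "myprog.$$"
is a hypothetical example, deliberately the guessable kind): with
O_CREAT|O_EXCL the create fails if the name already exists, and the
program has to check the return value.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);
use File::Spec;

# Hypothetical file name, for illustration only; $$ is guessable.
my $path = File::Spec->catfile(File::Spec->tmpdir, "myprog.$$");

# O_CREAT|O_EXCL makes the call fail if the name already exists
# (including as a symlink), so the return value must be checked.
sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0600)
    or die "Refusing to use '$path': $!\n";
print {$fh} "scratch data\n";
close($fh) or die "Can't close '$path': $!\n";

# A second exclusive create of the same name must fail.
my $second = sysopen(my $fh2, $path, O_WRONLY | O_CREAT | O_EXCL, 0600);
print "second create ", ($second ? "succeeded" : "failed: $!"), "\n";

unlink $path;
```
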
I am pointing out that unix programmers know that pids can be guessed,
so this is not a major security threat.

Unfortunately, I have to disagree with this assertion. As an
exemplar, I will point out that the Perl documentation for "open"
contains the faulty construct, namely: open(EXTRACT, "|sort >Tmp$$")
(Granted this example doesn't necessarily put the file into a
shared directory, but it also doesn't warn about the situation.)
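
One possible way to write that example without a $$-based name, offered
as a sketch under assumptions: the template 'sorted_XXXXXXXX' and the
sample words are made up, and it relies on the POSIX "sort -o FILE"
option. The output file is created safely first, and sort then writes
into it.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create the output file safely first, then let sort write into it.
# 'sorted_XXXXXXXX' is a hypothetical template name.
my ($tmp_fh, $tmp_name) = tempfile('sorted_XXXXXXXX', TMPDIR => 1, UNLINK => 1);
close($tmp_fh);

# List-form pipe open: no shell is involved; "sort -o FILE" is POSIX.
open(my $extract, '|-', 'sort', '-o', $tmp_name)
    or die "Can't start sort: $!\n";
print {$extract} "$_\n" for qw(pear apple fig);
close($extract) or die "sort failed: $! (exit status $?)\n";

# Read the sorted result back.
open(my $in, '<', $tmp_name) or die "Can't read '$tmp_name': $!\n";
my @sorted = <$in>;
close($in);
print @sorted;
```
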
Right. Just as we have to teach programmers to use File::Temp for
temporary files (or use lower level functions properly). The
non-existence of "/tmp/$progname.$$" is an accidental characteristic
which a programmer must not rely on. Not for portable applications and
not even for non-portable ones.

I have to agree with that.
My point here is that a programmer cannot rely on PIDs being
unguessable. A program which relies on unguessable PIDs is just as buggy
on a system with random PIDs as it is on a system with sequential PIDs.
A programmer may be lured into a false sense of security by apparently
random pids which aren't. Even if the pids on his system are strong
(let's for a moment assume that the system uses 64-bit numbers from a
sufficiently large entropy pool), the program may be ported to another
system which maybe uses only 15-bit numbers from a 32-bit linear
congruential PRNG, or, worse, to a system with sequential PIDs.

I wasn't attempting to make that point, and, in fact, I was trying
to make the same point as you are here, so I think that we agree
on this.

So, let me try a refined assertion:

- The use of guessable PIDs leads to a higher level of security
threat on a system, most notably:

- in the presence of faulty programs that create files in
shared directories where the names depend on process ids.

- in certain aspects of high-assurance systems that we haven't
discussed here.

- dmw
 
