piped open and shell metacharacters

John Kelly · Jul 30, 2010

The Camel book, 16.3.1. Anonymous Pipes says:

Perl uses your default system shell (/bin/sh on Unix) whenever a pipe
command contains special characters that the shell cares about. If
you're only starting one command, and you don't need--or don't want--to
use the shell, you can use the multi-argument form of a piped open ...

... But then you don't get I/O redirection, wildcard expansion, or
multistage pipes, since Perl relies on your shell to do those.

and 29.2.104. open says

Any pipe command containing shell metacharacters such as wildcards or
I/O redirections is passed to your system's canonical shell (/bin/sh on
Unix), so those shell-specific constructs can be processed first. If no
metacharacters are found, Perl launches the new process itself without
calling the shell.

I have a script that traps the standard output of any command passed in
as args to the script. My piped open uses 2>&1 to grab stderr as well
as stdout. I thought > was a shell metacharacter so I expected to see
/bin/sh between my script and the trapped command when doing ps ax. But
in many cases, Perl runs the trapped command directly, without needing
/bin/sh.

You can see that by running the script like this:

./myscript sleep 10

and then doing a ps ax before the sleep ends. Is the book wrong, or
am I missing something? Here is the script:

#!/usr/bin/perl

# Define author
# John Kelly, July 28, 2010

# Define copyright
# Copyright John Kelly, 2010. All rights reserved.

# Define license
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this work except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0

# Define symbols and (words)
# OT ........... Output Trap
# bas0 ......... basename of $0
# binx ......... binary executable
# tt ........... temporary time

use strict;
use FileHandle;
use File::Basename;
use POSIX qw (strftime);

STDOUT->autoflush (1);
STDERR->autoflush (1);

my $bas0 = basename ($0);
my $binx;

unless ($binx = shift @ARGV) {
    print "Usage: ", $bas0, " binary.executable [args]\n";
    exit 1;
}

my $basx = basename ($binx);

if (!defined (open OT, "$binx @ARGV 2>&1 |")) {
    printf "%s -> %s: failure starting %s: $!\n", &tt, $bas0, $binx;
    exit 1;
}
while (<OT>) {
    /^\s*$/ && next;
    printf "%s -> %s: ", &tt, $basx;
    print $_;
}
if (!(close OT) && $!) {
    printf "%s -> %s: failure closing OT: $!\n", &tt, $bas0;
} else {
    if ($? & 127) {
        printf "%s -> %s: %s signal %d, %s coredump\n", &tt, $bas0, $basx,
            ($? & 127), ($? & 128) ? 'with' : 'without';
    } else {
        printf "%s -> %s: %s exit value %d\n", &tt, $bas0, $basx, $? >> 8;
    }
}

sub tt {
    strftime "%a %b %e %H:%M:%S %Z %Y", localtime;
}

Ilya Zakharevich · Jul 30, 2010

My piped open uses 2>&1 to grab stderr as well
as stdout. I thought > was a shell metacharacter so I expected to see
/bin/sh between my script and the trapped command when doing ps ax.

At least on Unix and OS/2, I implemented shell-less "2>&1". (For a
decade now I have also been sitting on code which does shell-less "FOO"
and 'bar', but I have had no time to run a test suite, so I never sent
it to p5p.)

[As my other post this week shows, after I fixed the bug (in fact 2)
in the OS/2 branch I suspected for several years, my "stress test" now
passes on OS/2. On other architectures, Perl's pipe open() (and maybe
system() too???) is/are seriously buggy...]

Hope this helps,
Ilya

John Kelly · Jul 30, 2010

on Unix and OS/2, I implemented shell-less "2>&1"

That's what I wondered. Makes sense to avoid an extra shell pid when
2>&1 is the only shell metacharacter present.

Works for me.

C.DeRykus · Jul 31, 2010

I have a script that traps the standard output of any command passed in
as args to the script. My piped open uses 2>&1 to grab stderr as well
as stdout. I thought > was a shell metacharacter so I expected to see
/bin/sh between my script and the trapped command when doing ps ax. But
in many cases, Perl runs the trapped command directly, without needing
/bin/sh.

You can see that by running the script like this:

./myscript sleep 10

and then doing a ps ax before the sleep ends. Is the book wrong, or
am I missing something? Here is the script:

#!/usr/bin/perl

# Define author
# John Kelly, July 28, 2010

# Define copyright
# Copyright John Kelly, 2010. All rights reserved.

# Define license
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this work except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0

# Define symbols and (words)
# OT ........... Output Trap
# bas0 ......... basename of $0
# binx ......... binary executable
# tt ........... temporary time

use strict;
use FileHandle;
use File::Basename;
use POSIX qw (strftime);

STDOUT->autoflush (1);
STDERR->autoflush (1);

my $bas0 = basename ($0);
my $binx;

unless ($binx = shift @ARGV) {
    print "Usage: ", $bas0, " binary.executable [args]\n";
    exit 1;
}

my $basx = basename ($binx);

my $kid; # edit #1

if (!defined ($kid = open OT, "$binx @ARGV 2>&1 |")) { # edit #2
    printf "%s -> %s: failure starting %s: $!\n", &tt, $bas0, $binx;
    exit 1;
}

print "parent shell=", getppid(), " perl process=$$", # edit #3
    " perl kid=$kid\n";

while (<OT>) {
    /^\s*$/ && next;
    printf "%s -> %s: ", &tt, $basx;
    print $_;
}

if (!(close OT) && $!) {
    printf "%s -> %s: failure closing OT: $!\n", &tt, $bas0;
} else {
    if ($? & 127) {
        printf "%s -> %s: %s signal %d, %s coredump\n", &tt, $bas0, $basx,
            ($? & 127), ($? & 128) ? 'with' : 'without';
    } else {
        printf "%s -> %s: %s exit value %d\n", &tt, $bas0, $basx, $? >> 8;
    }
}

sub tt {
    strftime "%a %b %e %H:%M:%S %Z %Y", localtime;
}

On FreeBSD (with small edits above), I don't see that
happening.

$ myscript.pl sleep 60
parent shell=71889 perl process=75147 perl kid=75148

$ ps -ax
PID TT STAT TIME COMMAND
71889 2 SNs 0:00.01 -bash (bash)
75147 2 SN+ 0:00.02 /usr/bin/perl ./shell.pl sleep 60
75148 2 SN+ 0:00.00 sleep 60
.....

I believe execl in the perl kid launches a shell
which then gets overlaid by sleep(). From doio.c:

PerlProc_execl(PL_sh_path, "sh", "-c", cmd, (char *)NULL);

John Kelly · Jul 31, 2010

On FreeBSD (with small edits above), I don't see that
happening.

$ myscript.pl sleep 60
parent shell=71889 perl process=75147 perl kid=75148

That's what I'm saying; the kid pid is only 1 greater than the perl pid,
which means there was never a shell pid launched.

$ ps -ax
PID TT STAT TIME COMMAND
71889 2 SNs 0:00.01 -bash (bash)
75147 2 SN+ 0:00.02 /usr/bin/perl ./shell.pl sleep 60
75148 2 SN+ 0:00.00 sleep 60
....

I believe execl in the perl kid launches a shell
which then gets overlaid by sleep().

I don't think so.

You could overlay the shell by prefixing the command with the "exec"
shell builtin, but I didn't do that.

Seems like perl is recognizing 2>&1 as a limited special case, copying
fd1 to fd2 , then exec'ing the binary directly, without using a shell.
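
A minimal sketch of doing the same thing by hand, with no shell involved: fork with the implicit-fork form of open, point STDERR at STDOUT in the child, then exec the command as a list. The helper name capture_merged is hypothetical and not from this thread, and this only illustrates the idea; it is not a claim about how perl implements its shortcut internally.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): run @cmd without a shell and
# return the raw wait status followed by the merged stdout+stderr lines.
sub capture_merged {
    my @cmd = @_;

    # open with '-|' forks; in the child, STDOUT is the write end of the pipe.
    my $pid = open(my $out, '-|');
    die "cannot fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: make fd 2 a copy of fd 1, the hand-written "2>&1".
        open(STDERR, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";
        # List-form exec with an indirect object never consults a shell.
        exec { $cmd[0] } @cmd;
        die "cannot exec $cmd[0]: $!";
    }

    my @lines = <$out>;
    close $out;              # sets $? to the child's wait status
    return ($?, @lines);
}

# Usage: run perl itself ($^X) so the example does not depend on other tools.
my ($status, @lines) = capture_merged($^X, '-e', 'print "out\n"; warn "err\n"');
printf "status=%d, captured %d line(s)\n", $status, scalar @lines;
```

The order of the two captured lines depends on buffering in the child, so the sketch makes no promise about it.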

C.DeRykus · Jul 31, 2010

JK> That's what I'm saying; the kid pid is only 1 greater than
JK> the perl pid, which means there was never a shell pid
JK> launched.
JK> I don't think so.
JK>
JK> You could overlay the shell by prefixing the command
JK> with the "exec" shell builtin, but I didn't do that.
JK>
JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying fd1 to fd2, then exec'ing the binary directly,
JK> without using a shell.

Maybe I missed something, but I don't see anything supporting your
supposition in the source or docs:

perlfaq8:
If the second argument to a piped open() contains shell
metacharacters, perl fork()s, then exec()s a shell to
decode the metacharacters and eventually run the desired
program...

perlopentut:
But if the command contains special shell characters, such
as ">" or "*", called 'metacharacters', Perl does not execute
the command directly. Instead, Perl runs the shell, which then
tries to run the command.

John Kelly · Jul 31, 2010

JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying
JK> fd1 to fd2 , then exec'ing the binary directly, without
JK> using a shell.

Maybe I missed something, but I don't see anything supporting your
supposition in the source or docs:

perlfaq8:
If the second argument to a piped open() contains shell
metacharacters, perl fork()s, then exec()s a shell to
decode the metacharacters and eventually run the desired
program...

perlopentut:
But if the command contains special shell characters, such
as ">" or "*", called 'metacharacters', Perl does not execute
the command directly. Instead, Perl runs the shell, which then
tries to run the command.

Seems the docs are incomplete.

Xho Jingleheimerschmidt · Jul 31, 2010

C.DeRykus said:
JK> That's what I'm saying; the kid pid is only 1 greater than
JK> the perl pid, which means there was never a shell pid
JK> launched.
JK> I don't think so.
JK>
JK> You could overlay the shell by prefixing the command
JK> with the "exec" shell builtin, but I didn't do that.
JK>
JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying
JK> fd1 to fd2 , then exec'ing the binary directly, without
JK> using a shell.

Maybe I missed something, but I don't see anything supporting your
supposition in the source or docs:

I don't see it in the docs either. But I'm quite sure it is in the
source, as the behavior indubitably exists, as verified with strace, and
I doubt perl has a mind of its own.

C.DeRykus · Aug 1, 2010

I don't see it in the docs either. But I'm quite sure it is in the
source, as the behavior indubitably exists, as verified with strace, and
I doubt perl has a mind of its own.

Yes, IIUC, it looks like there is a dup and the shell's
bypassed:

From doio.c:

/* handle the 2>&1 construct at the end */
if (*s == '>' && s[1] == '&' && s[2] == '1'
...
if (!*t && (PerlLIO_dup2(1,2) != -1)) {
s[-2] = '\0';
break;
}
}

John Kelly · Aug 1, 2010

Yes, IIUC, it looks like there is a dup and the shell's
bypassed:

From doio.c:

/* handle the 2>&1 construct at the end */
if (*s == '>' && s[1] == '&' && s[2] == '1'
...
if (!*t && (PerlLIO_dup2(1,2) != -1)) {
s[-2] = '\0';
break;
}
}

Interesting it says "at the end"

I wonder if placing it elsewhere in the command will circumvent Perl's
shortcut and invoke a shell as usual.

Ilya Zakharevich · Aug 2, 2010

/* handle the 2>&1 construct at the end */
if (*s == '>' && s[1] == '&' && s[2] == '1'
...
if (!*t && (PerlLIO_dup2(1,2) != -1)) {
s[-2] = '\0';
break;
}
}

Interesting it says "at the end"

I wonder if placing it elsewhere in the command will circumvent Perl's
shortcut and invoke a shell as usual.

At least with my initial implementation, only the position at the end
is special-cased.

Although your "as usual" looks suspicious. The *usual* case is a
shell-less execution. Only if shell metachars appear one is forced to
invoke the shell.

And, BTW, what would be your motivation to wish for a shell?

Yours,
Ilya

Alan Curry · Aug 2, 2010

It may be worth noting that a command matching /^\.\s/, /^exec\s/ or
/^\w*=/ will also be passed to the shell[1].

The mention of exec made me wonder about other shell builtins. All of the
following fail on my system, where they are builtin to /bin/sh and not
available as external commands:

perl -e 'system "eval ls /"'
perl -e 'system "command ls /"'
perl -e 'system "set"'
perl -e 'system "times"'
perl -e 'system "read foo"'

All of the above would produce output (or in the last case, consume input) if
they were executed properly. Luckily, none of them are likely to be used for
any reason except to demonstrate the fact that they don't work.
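
The rules mentioned so far in this thread can be collected into a rough predicate. This is only an approximation: the function name probably_uses_shell is hypothetical, the exact metacharacter set is an assumption, and perl's real logic in doio.c has details that are not modelled here.

```perl
use strict;
use warnings;

# Assumed metacharacter set; the authoritative list lives in perl's doio.c.
my $META = q{$&*(){}[]'";\|?<>~`};

# Hypothetical helper: guess whether a single-string command would be
# handed to /bin/sh (returns 1) or exec'ed directly by perl (returns 0).
sub probably_uses_shell {
    my ($cmd) = @_;

    return 1 if $cmd =~ /^\.\s/;      # ". file"
    return 1 if $cmd =~ /^exec\s/;    # explicit "exec ..."
    return 1 if $cmd =~ /^\w*=/;      # "VAR=value ..." prefix

    # A 2>&1 at the very end is special-cased, so ignore it before scanning.
    (my $rest = $cmd) =~ s/\s2>&1\s*\z//;

    return $rest =~ /[\Q$META\E]/ ? 1 : 0;
}

for my $cmd ('sleep 10 2>&1', 'ls 2>&1 -l', 'exec ls', 'FOO=1 ls', 'ls /;') {
    printf "%-15s => %s\n", $cmd,
        probably_uses_shell($cmd) ? 'shell' : 'direct exec';
}
```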

John Kelly · Aug 2, 2010

It may be worth noting that a command matching /^\.\s/, /^exec\s/ or
/^\w*=/ will also be passed to the shell[1].

The mention of exec made me wonder about other shell builtins. All of the
following fail on my system, where they are builtin to /bin/sh and not
available as external commands:

perl -e 'system "eval ls /"'
perl -e 'system "command ls /"'
perl -e 'system "set"'
perl -e 'system "times"'
perl -e 'system "read foo"'

All of the above would produce output (or in the last case, consume input) if
they were executed properly. Luckily, none of them are likely to be used for
any reason except to demonstrate the fact that they don't work.

It's easy to force a shell with some no-op metacharacters:

perl -e 'system ":; eval ls /"'

John Kelly · Aug 2, 2010

/* handle the 2>&1 construct at the end */
if (*s == '>' && s[1] == '&' && s[2] == '1'
...
if (!*t && (PerlLIO_dup2(1,2) != -1)) {
s[-2] = '\0';
break;
}
}

Interesting it says "at the end"

I wonder if placing it elsewhere in the command will circumvent Perl's
shortcut and invoke a shell as usual.

At least with my initial implementation, only the position at the end
is special-cased.

Although your "as usual" looks suspicious. The *usual* case is a
shell-less execution. Only if shell metachars appear one is forced to
invoke the shell.

And, BTW, what would be your motivation to wish for a shell?

Most of the time, none. I was just wondering how things work. I like
the shell-less shortcut just fine.

Randal L. Schwartz · Aug 2, 2010

John> It's easy to force a shell with some no-op metacharacters:

John> perl -e 'system ":; eval ls /"'

Even simpler, just append ";". Never changes the meaning, and it forces
/bin/sh in the mix.
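
A small sketch comparing the three spellings discussed above. It assumes a Unix-like system where eval exists only as a shell builtin; the helper name try_system is hypothetical. Under that assumption the bare form should fail to start (system returns -1), while the ";" and ":;" forms should go through /bin/sh and return 0. On other systems the results may differ.

```perl
use strict;
use warnings;

# Hypothetical helper: run a single-string command and hand back system()'s
# raw return value (-1 means the program could not be started at all).
sub try_system {
    my ($cmd) = @_;
    no warnings 'exec';    # the first case is expected to fail to exec
    return system($cmd);
}

my @commands = ('eval true', 'eval true;', ':; eval true');
my %result   = map { $_ => try_system($_) } @commands;

for my $cmd (@commands) {
    printf "%-14s => %d\n", $cmd, $result{$cmd};
}
```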

Randal L. Schwartz · Aug 2, 2010

Ben> Um, what exactly is wrong with

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

1) complexity
2) using => as a bizarro-world comma

Please. Don't. A certain individual started that, and like JAPHs, it's
a meme that deserves to die.

Having said that:

print "Just another Perl hacker,"; # the original

Ted Zlatanov · Aug 2, 2010

On Mon, 02 Aug 2010 04:32:17 -0700, Randal L. Schwartz wrote:

RLS> 2) using => as a bizarro-world comma

RLS> Please. Don't. A certain individual started that, and like JAPHs, it's
RLS> a meme that deserves to die.

I find => much better than commas in shell argument lists and many other
places:

1) it quotes the left side--less typing

2) it's much harder to mistype and misread as a period (I've scratched
my head more than once staring at that kind of bug in the source code)

Ted

Peter J. Holzer · Aug 2, 2010

Ben> Um, what exactly is wrong with

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

1) complexity

It looks more complex than

system "eval ls /;"

but it really isn't and that ";" is really easy to miss. So that would
have to be something like

system "eval ls /;"; # ; forces shell

and then you can write

system "/bin/sh", "-c", "eval ls /"

too. It even saves two characters ;-).

2) using => as a bizarro-world comma

Please. Don't. A certain individual started that, and like JAPHs, it's
a meme that deserves to die.

I agree about the "=>".

hp

John Kelly · Aug 2, 2010

It looks more complex than

system "eval ls /;"

but it really isn't and that ";" is really easy to miss.

That's why I suggested:

system ":; eval ls /"

It's easy to miss ";" at the end, but hard to miss ":;" at the beginning.

So that would have to be something like

system "eval ls /;"; # ; forces shell

and then you can write

system "/bin/sh", "-c", "eval ls /"

too. It even saves two characters ;-).

I like mine best.

Ilya Zakharevich · Aug 2, 2010

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

1) complexity
2) using => as a bizarro-world comma

Please. Don't. A certain individual started that, and like JAPHs, it's
a meme that deserves to die.

[I presume (by proximity of JAPH) that this "certain individual" is
you (no, I would not remember myself). I think there is nothing to
be ashamed of in this meme.]

When in "random position" in a list, => MAY be confusing. On the
other hand, it may be a tool to attract attention to most important
element(s) of the list.

When used between two "logically connected" parts of the list (as in
$^X, -I => $INC, ...), it is, IMO, very appropriate.

Yours,
Ilya
