piped open and shell metacharacters

John Kelly

The Camel book, 16.3.1. Anonymous Pipes says:
Perl uses your default system shell (/bin/sh on Unix) whenever a pipe
command contains special characters that the shell cares about. If
you're only starting one command, and you don't need--or don't want--to
use the shell, you can use the multi-argument form of a piped open ...
... But then you don't get I/O redirection, wildcard expansion, or
multistage pipes, since Perl relies on your shell to do those.

and 29.2.104. open says
Any pipe command containing shell metacharacters such as wildcards or
I/O redirections is passed to your system's canonical shell (/bin/sh on
Unix), so those shell-specific constructs can be processed first. If no
metacharacters are found, Perl launches the new process itself without
calling the shell.


I have a script that traps the standard output of any command passed in
as args to the script. My piped open uses 2>&1 to grab stderr as well
as stdout. I thought > was a shell metacharacter so I expected to see
/bin/sh between my script and the trapped command when doing ps ax. But
in many cases, Perl runs the trapped command directly, without needing
/bin/sh.

You can see that by running the script like this:

./myscript sleep 10

and then doing a ps ax before the sleep ends. Is the book wrong, or am I
missing something? Here is the script:






#!/usr/bin/perl

# Define author
# John Kelly, July 28, 2010

# Define copyright
# Copyright John Kelly, 2010. All rights reserved.

# Define license
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this work except in compliance with the License.
# You may obtain a copy of the License at:
# http://www.apache.org/licenses/LICENSE-2.0

# Define symbols and (words)
# OT ........... Output Trap
# bas0 ......... basename of $0
# binx ......... binary executable
# tt ........... temporary time




use strict;
use FileHandle;
use File::Basename;
use POSIX qw (strftime);

STDOUT->autoflush (1);
STDERR->autoflush (1);

my $bas0 = basename ($0);
my $binx;

unless ($binx = shift @ARGV) {
    print "Usage: ", $bas0, " binary.executable [args]\n";
    exit 1;
}

my $basx = basename ($binx);

if (!defined (open OT, "$binx @ARGV 2>&1 |")) {
    printf "%s -> %s: failure starting %s: $!\n", &tt, $bas0, $binx;
    exit 1;
}
while (<OT>) {
    /^\s*$/ && next;
    printf "%s -> %s: ", &tt, $basx;
    print $_;
}
if (!(close OT) && $!) {
    printf "%s -> %s: failure closing OT: $!\n", &tt, $bas0;
} else {
    if ($? & 127) {
        printf "%s -> %s: %s signal %d, %s coredump\n", &tt, $bas0, $basx,
          ($? & 127), ($? & 128) ? 'with' : 'without';
    } else {
        printf "%s -> %s: %s exit value %d\n", &tt, $bas0, $basx, $? >> 8;
    }
}

sub tt {
    strftime "%a %b %e %H:%M:%S %Z %Y", localtime;
}
 
Ilya Zakharevich

JK> as args to the script. My piped open uses 2>&1 to grab stderr as well
JK> as stdout. I thought > was a shell metacharacter so I expected to see
JK> /bin/sh between my script and the trapped command when doing ps ax.

At least on Unix and OS/2, I implemented shell-less "2>&1". (For a
decade already I'm also sitting on code which does shell-less "FOO"
and 'bar', but have no time to run a test suite - so I never send it
to p5p.)

[As my other post this week shows, after I fixed the bug (in fact 2)
in the OS/2 branch I suspected for several years, my "stress test" now
passes on OS/2. On other architectures, Perl's pipe open() (and maybe
system() too???) is/are seriously buggy...]

Hope this helps,
Ilya
 
John Kelly

IZ> on Unix and OS/2, I implemented shell-less "2>&1"

That's what I wondered. Makes sense to avoid an extra shell pid when
2>&1 is the only shell metacharacter present.

Works for me.
 
C.DeRykus

#!/usr/bin/perl

#   Define author
#       John Kelly, July 28, 2010

#   Define copyright
#       Copyright John Kelly, 2010. All rights reserved.

#   Define license
#       Licensed under the Apache License, Version 2.0 (the "License");
#       you may not use this work except in compliance with the License.
#       You may obtain a copy of the License at:
#      http://www.apache.org/licenses/LICENSE-2.0

#   Define symbols and (words)
#       OT ...........  Output Trap
#       bas0 .........  basename of $0
#       binx .........  binary executable
#       tt ...........  temporary time

use strict;
use FileHandle;
use File::Basename;
use POSIX qw (strftime);

STDOUT->autoflush (1);
STDERR->autoflush (1);

my $bas0 = basename ($0);
my $binx;

unless ($binx = shift @ARGV) {
    print "Usage: ", $bas0, " binary.executable [args]\n";
    exit 1;
}

my $basx = basename ($binx);
my $kid; # edit #1
if (!defined ($kid = open OT, "$binx @ARGV 2>&1 |")) { # edit #2
    printf "%s -> %s: failure starting %s: $!\n", &tt, $bas0, $binx;
    exit 1;
}
print "parent shell=", getppid(), " perl process=$$", # edit #3
      " perl kid=$kid\n";
while (<OT>) {
    /^\s*$/ && next;
    printf "%s -> %s: ", &tt, $basx;
    print $_;
}

if (!(close OT) && $!) {
    printf "%s -> %s: failure closing OT: $!\n", &tt, $bas0;
} else {
    if ($? & 127) {
        printf "%s -> %s: %s signal %d, %s coredump\n", &tt, $bas0, $basx,
          ($? & 127), ($? & 128) ? 'with' : 'without';
    } else {
        printf "%s -> %s: %s exit value %d\n", &tt, $bas0, $basx, $? >> 8;
    }
}

sub tt {
    strftime "%a %b %e %H:%M:%S %Z %Y", localtime;
}


On FreeBSD (with small edits above), I don't see that
happening.

$ myscript.pl sleep 60
parent shell=71889 perl process=75147 perl kid=75148

$ ps -ax
PID TT STAT TIME COMMAND
71889 2 SNs 0:00.01 -bash (bash)
75147 2 SN+ 0:00.02 /usr/bin/perl ./shell.pl sleep 60
75148 2 SN+ 0:00.00 sleep 60
.....

I believe execl in the perl kid launches a shell
which then gets overlaid by sleep(). From doio.c:

PerlProc_execl(PL_sh_path, "sh", "-c", cmd, (char *)NULL);
 
John Kelly

CD> On FreeBSD (with small edits above), I don't see that
CD> happening.
CD>
CD> $ myscript.pl sleep 60
CD> parent shell=71889 perl process=75147 perl kid=75148

That's what I'm saying; the kid pid is only 1 greater than the perl pid,
which means there was never a shell pid launched.

CD> $ ps -ax
CD> PID TT STAT TIME COMMAND
CD> 71889 2 SNs 0:00.01 -bash (bash)
CD> 75147 2 SN+ 0:00.02 /usr/bin/perl ./shell.pl sleep 60
CD> 75148 2 SN+ 0:00.00 sleep 60
CD> ....
CD>
CD> I believe execl in the perl kid launches a shell
CD> which then gets overlaid by sleep().

I don't think so.

You could overlay the shell by prefixing the command with the "exec"
shell builtin, but I didn't do that.

Seems like perl is recognizing 2>&1 as a limited special case, copying
fd1 to fd2 , then exec'ing the binary directly, without using a shell.
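For comparison, the same effect can be had by hand with the forking form of open, which never involves a shell. This is only a minimal sketch, not what Perl does internally: the helper name capture_both is made up for illustration, and it assumes a Unix-like system where fork and exec behave as usual.

```perl
use strict;
use warnings;

# Hypothetical helper: run @cmd with stderr merged into stdout, no shell.
sub capture_both {
    my @cmd = @_;

    # Forking open: returns the child's pid in the parent, 0 in the child,
    # undef on failure. In the child, STDOUT is the write end of the pipe.
    my $pid = open(my $fh, '-|');
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: make fd 2 a copy of fd 1, then exec the program directly.
        open(STDERR, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";
        exec { $cmd[0] } @cmd or die "cannot exec $cmd[0]: $!";
    }

    my @lines = <$fh>;    # parent: both streams arrive on the one pipe
    close $fh;            # waits for the child and sets $?
    return ($? >> 8, @lines);
}

# Demo: a child perl that writes one line to each stream.
my ($status, @lines) =
    capture_both($^X, '-e', 'print STDERR "err\n"; print "out\n"');
print "exit value $status\n", @lines;
```

Because the command is passed as a list, arguments containing spaces or metacharacters reach the program untouched, which the single-string form cannot promise.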
 
C.DeRykus

JK> That's what I'm saying; the kid pid is only 1 greater than
JK> the perl pid, which means there was never a shell pid
JK> launched.
JK> I don't think so.
JK>
JK> You could overlay the shell by prefixing the command
JK> with the "exec" shell builtin, but I didn't do that.
JK>
JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying fd1 to fd2, then exec'ing the binary directly,
JK> without using a shell.

Maybe I missed something, but I don't see anything supporting your
supposition in the source or docs:

perlfaq8:
If the second argument to a piped open() contains shell
metacharacters, perl fork()s, then exec()s a shell to
decode the metacharacters and eventually run the desired
program...

perlopentut:
But if the command contains special shell characters, such
as ">" or "*", called 'metacharacters', Perl does not execute
the command directly. Instead, Perl runs the shell, which then
tries to run the command.
 
John Kelly

JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying fd1 to fd2, then exec'ing the binary directly,
JK> without using a shell.

CD> Maybe I missed something, but I don't see anything supporting your
CD> supposition in the source or docs:
CD>
CD> perlfaq8:
CD> If the second argument to a piped open() contains shell
CD> metacharacters, perl fork()s, then exec()s a shell to
CD> decode the metacharacters and eventually run the desired
CD> program...
CD>
CD> perlopentut:
CD> But if the command contains special shell characters, such
CD> as ">" or "*", called 'metacharacters', Perl does not execute
CD> the command directly. Instead, Perl runs the shell, which then
CD> tries to run the command.

Seems the docs are incomplete.
 
Xho Jingleheimerschmidt

C.DeRykus said:
JK> That's what I'm saying; the kid pid is only 1 greater than
JK> the perl pid, which means there was never a shell pid
JK> launched.
JK> I don't think so.
JK>
JK> You could overlay the shell by prefixing the command
JK> with the "exec" shell builtin, but I didn't do that.
JK>
JK> Seems like perl is recognizing 2>&1 as a limited special
JK> case, copying fd1 to fd2, then exec'ing the binary directly,
JK> without using a shell.

CD> Maybe I missed something, but I don't see anything supporting your
CD> supposition in the source or docs:

I don't see it in the docs either. But I'm quite sure it is in the
source, as the behavior indubitably exists, as verified with strace, and
I doubt perl has a mind of its own.
 
C.DeRykus

XJ> I don't see it in the docs either. But I'm quite sure it is in the
XJ> source, as the behavior indubitably exists, as verified with strace, and
XJ> I doubt perl has a mind of its own.

Yes, IIUC, it looks like there is a dup and the shell's
bypassed:

From doio.c:

/* handle the 2>&1 construct at the end */
if (*s == '>' && s[1] == '&' && s[2] == '1'
...
if (!*t && (PerlLIO_dup2(1,2) != -1)) {
s[-2] = '\0';
break;
}
}
 
John Kelly

CD> Yes, IIUC, it looks like there is a dup and the shell's
CD> bypassed:
CD>
CD> From doio.c:
CD>
CD> /* handle the 2>&1 construct at the end */
CD> if (*s == '>' && s[1] == '&' && s[2] == '1'
CD> ...
CD> if (!*t && (PerlLIO_dup2(1,2) != -1)) {
CD> s[-2] = '\0';
CD> break;
CD> }
CD> }

Interesting it says "at the end"

I wonder if placing it elsewhere in the command will circumvent Perl's
shortcut and invoke a shell as usual.
 
Ilya Zakharevich

CD> /* handle the 2>&1 construct at the end */
CD> if (*s == '>' && s[1] == '&' && s[2] == '1'
CD> ...
CD> if (!*t && (PerlLIO_dup2(1,2) != -1)) {
CD> s[-2] = '\0';
CD> break;
CD> }
CD> }

JK> Interesting it says "at the end"
JK> I wonder if placing it elsewhere in the command will circumvent Perl's
JK> shortcut and invoke a shell as usual.

At least with my initial implementation, only the position at the end
is special-cased.

Although your "as usual" looks suspicious. The *usual* case is a
shell-less execution. Only if shell metachars appear one is forced to
invoke the shell.

And, BTW, what would be your motivation to wish for a shell?

Yours,
Ilya
 
Alan Curry

It may be worth noting that a command matching /^\.\s/, /^exec\s/ or
/^\w*=/ will also be passed to the shell[1].
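Putting the rules mentioned in this thread together, the decision can be approximated in Perl itself. This is a rough sketch, not a translation of doio.c: the function name needs_shell is made up for illustration, the metacharacter set is reproduced from memory of the Perl source and may differ between versions, and details such as leading whitespace are not modelled.

```perl
use strict;
use warnings;

# Hypothetical approximation of "will Perl hand this string to /bin/sh?"
sub needs_shell {
    my ($cmd) = @_;

    # Forms that always go to the shell, as listed above.
    return 1 if $cmd =~ /^\.\s/;      # ". file"
    return 1 if $cmd =~ /^exec\s/;    # "exec prog"
    return 1 if $cmd =~ /^\w*=/;      # "VAR=value prog"

    # A " 2>&1" at the very end is handled with dup2(1,2), no shell needed.
    $cmd =~ s/\s2>&1\s*\z//;
    # A single trailing newline is ignored as well.
    $cmd =~ s/\n\z//;

    # Any remaining metacharacter forces the shell (set is an assumption).
    return $cmd =~ /[\$&*(){}\[\]'";\\|?<>~`\n]/ ? 1 : 0;
}

for my $cmd ('sleep 10 2>&1', '2>&1 sleep 10', 'ls *.c',
             'exec ls', 'FOO=1 ls', 'ls /') {
    printf "%-16s %s\n", $cmd, needs_shell($cmd) ? 'shell' : 'direct';
}
```

Only the trailing position of 2>&1 is treated specially here, so moving the redirection to the front of the command makes the sketch report "shell".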

The mention of exec made me wonder about other shell builtins. All of the
following fail on my system, where they are builtin to /bin/sh and not
available as external commands:

perl -e 'system "eval ls /"'
perl -e 'system "command ls /"'
perl -e 'system "set"'
perl -e 'system "times"'
perl -e 'system "read foo"'

All of the above would produce output (or in the last case, consume input) if
they were executed properly. Luckily, none of them are likely to be used for
any reason except to demonstrate the fact that they don't work.
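The failure can be checked from the return value of system rather than by eye. A small sketch, assuming a Unix-like system with /bin/sh and no external program named eval anywhere in PATH; on such a system the first call should return -1 and the second 0.

```perl
use strict;
use warnings;
no warnings 'exec';    # silence the "Can't exec" warning for the expected failure

# No metacharacters: Perl tries to exec a program named "eval" itself.
my $direct = system("eval true");

# The trailing ";" is a metacharacter, so the string is run via the shell.
my $via_shell = system("eval true;");

print "direct: $direct, via shell: $via_shell\n";
```

A return value of -1 from system means the program could not be started at all, which is different from the program starting and then exiting with a failure status.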
 
John Kelly

AC> It may be worth noting that a command matching /^\.\s/, /^exec\s/ or
AC> /^\w*=/ will also be passed to the shell[1].
AC>
AC> The mention of exec made me wonder about other shell builtins. All of the
AC> following fail on my system, where they are builtin to /bin/sh and not
AC> available as external commands:
AC>
AC> perl -e 'system "eval ls /"'
AC> perl -e 'system "command ls /"'
AC> perl -e 'system "set"'
AC> perl -e 'system "times"'
AC> perl -e 'system "read foo"'
AC>
AC> All of the above would produce output (or in the last case, consume input) if
AC> they were executed properly. Luckily, none of them are likely to be used for
AC> any reason except to demonstrate the fact that they don't work.

It's easy to force a shell with some no-op metacharacters:

perl -e 'system ":; eval ls /"'
 
John Kelly

CD> /* handle the 2>&1 construct at the end */
CD> if (*s == '>' && s[1] == '&' && s[2] == '1'
CD> ...
CD> if (!*t && (PerlLIO_dup2(1,2) != -1)) {
CD> s[-2] = '\0';
CD> break;
CD> }
CD> }

JK> Interesting it says "at the end"
JK> I wonder if placing it elsewhere in the command will circumvent Perl's
JK> shortcut and invoke a shell as usual.

IZ> At least with my initial implementation, only the position at the end
IZ> is special-cased.
IZ>
IZ> Although your "as usual" looks suspicious. The *usual* case is a
IZ> shell-less execution. Only if shell metachars appear one is forced to
IZ> invoke the shell.
IZ>
IZ> And, BTW, what would be your motivation to wish for a shell?

Most of the time, none. I was just wondering how things work. I like
the shell-less shortcut just fine.
 
Randal L. Schwartz

John> It's easy to force a shell with some no-op metacharacters:

John> perl -e 'system ":; eval ls /"'

Even simpler, just append ";". Never changes the meaning, and it forces
/bin/sh in the mix.
 
Randal L. Schwartz

Ben> Um, what exactly is wrong with

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

1) complexity
2) using => as a bizarro-world comma

Please. Don't. A certain individual started that, and like JAPHs, it's
a meme that deserves to die.

Having said that:

print "Just another Perl hacker,"; # the original
 
Ted Zlatanov

On Mon, 02 Aug 2010 04:32:17 -0700 Randal L. Schwartz wrote:

RLS> 2) using => as a bizarro-world comma

RLS> Please. Don't. A certain individual started that, and like JAPHs, it's
RLS> a meme that deserves to die.

I find => much better than commas in shell argument lists and many other
places:

1) it quotes the left side--less typing

2) it's much harder to mistype and misread as a period (I've scratched
my head more than once staring at that kind of bug in the source code)

Ted
 
Peter J. Holzer

Ben> Um, what exactly is wrong with

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

RLS> 1) complexity

It looks more complex than

system "eval ls /;"

but it really isn't and that ";" is really easy to miss. So that would
have to be something like

system "eval ls /;"; # ; forces shell

and then you can write

system "/bin/sh", "-c", "eval ls /"

too. It even saves two characters ;-).
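The three spellings suggested in this thread can be compared side by side. A small sketch, assuming a Unix-like system with a POSIX /bin/sh; the helper exit_value is made up for illustration. It uses "eval exit 3" so that a successful run of the builtin is visible as exit value 3.

```perl
use strict;
use warnings;

# Hypothetical helper: turn system()'s return value into an exit value,
# keeping -1 ("could not start") distinct from a real exit status.
sub exit_value { my $rc = shift; return $rc == -1 ? -1 : $rc >> 8 }

my $leading  = exit_value(system(":; eval exit 3"));     # no-op prefix
my $trailing = exit_value(system("eval exit 3;"));       # appended ";"
my $explicit = exit_value(system("/bin/sh", "-c", "eval exit 3"));  # list form

print "leading ':;'  -> $leading\n";
print "trailing ';'  -> $trailing\n";
print "explicit list -> $explicit\n";
```

The list form does not depend on Perl's metacharacter scan at all, so it keeps working even if the command string is later edited and the forcing character is lost.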

RLS> 2) using => as a bizarro-world comma
RLS>
RLS> Please. Don't. A certain individual started that, and like JAPHs, it's
RLS> a meme that deserves to die.

I agree about the "=>".

hp
 
John Kelly

PJH> It looks more complex than
PJH>
PJH> system "eval ls /;"
PJH>
PJH> but it really isn't and that ";" is really easy to miss.

That's why I suggested:

system ":; eval ls /"

It's easy to miss ";" at the end, but hard to miss ":;" at the beginning.

PJH> So that would have to be something like
PJH>
PJH> system "eval ls /;"; # ; forces shell
PJH>
PJH> and then you can write
PJH>
PJH> system "/bin/sh", "-c", "eval ls /"
PJH>
PJH> too. It even saves two characters ;-).

I like mine best. :)
 
Ilya Zakharevich

Ben> perl -e 'system "/bin/sh", -c => "eval ls /"'

RLS> 1) complexity
RLS> 2) using => as a bizarro-world comma
RLS> Please. Don't. A certain individual started that, and like JAPHs, it's
RLS> a meme that deserves to die.

[I presume (by the proximity of JAPH) that this "certain individual" is
you (no, I would not remember it myself). I think there is nothing to
be ashamed of in this meme.]

When in "random position" in a list, => MAY be confusing. On the
other hand, it may be a tool to attract attention to most important
element(s) of the list.

When used between two "logically connected" parts of the list (as in
$^X, -I => $INC, ...), it is, IMO, very appropriate.

Yours,
Ilya
 
