bunzip2 when exec()-ed from perl script outputs garbage data.

Discussion in 'Perl Misc' started by Thomas Covello, Feb 2, 2004.

  1. Hello,
    When the following perl script is executed:

    #!/usr/bin/env perl
    use strict;
    use diagnostics;
    use warnings;

    # Header bytes for different zip formats
    my $GZIP_HEADER = "\x1f\x8b\x08\x08";
    my $BZIP_HEADER = "BZh9"; # 42 5a 68 39

    my $header_bytes;

    read STDIN, $header_bytes, 4 or die "Trouble reading input: $!";
    if ($header_bytes eq $GZIP_HEADER) {
        exec "gunzip -f";
        die "gunzip doesn't exist or can't be accessed: $!";
    } elsif ($header_bytes eq $BZIP_HEADER) {
        exec "bunzip2 -f";
        die "bunzip2 doesn't exist or can't be accessed: $!";
    } else {
        die "Not a proper zip file or unsupported zip format.";
    }

    I get output like this:
    BZh91AY (blah blah blah.... I can't copy it because it contains NULs)

    I've used print statements to prove that it executed bzip2 and found the
    correct magic number.
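
A side note on the header comparison in the script above: the fourth byte of a gzip stream is a flags field and the fourth byte of a bzip2 stream is a block-size digit from 1 to 9, so an exact four-byte match accepts only some valid files. The following is a minimal sketch of a more tolerant check; the name detect_format is hypothetical and not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a stream by its first four bytes.
# Returns 'gzip', 'bzip2', or undef when neither signature matches.
sub detect_format {
    my ($header) = @_;
    return unless defined $header && length($header) >= 4;

    # gzip: magic bytes 1f 8b, then compression method 08 (deflate).
    # The fourth byte holds flags and varies from file to file.
    return 'gzip' if $header =~ /\A\x1f\x8b\x08/;

    # bzip2: "BZh" followed by a block-size digit 1..9.
    return 'bzip2' if $header =~ /\ABZh[1-9]/;

    return;
}
```

This only identifies the format. The four bytes have still been consumed from STDIN, which is the problem the replies address.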
    Thomas Covello, Feb 2, 2004
    #1

  2. In article <>,
    Thomas Covello <> wrote:
    :When the following perl script is executed:

    :read STDIN, $header_bytes, 4 or die "Trouble reading input: $!";

    : exec "bunzip2 -f";

    :I get output like this:
    :BZh91AY (blah blah blah.... I can't copy it because it contains NULs)

    You read 4 bytes from STDIN and then you exec off bunzip2 without
    having restored those 4 bytes.

    Because you are operating from stdin, you can't be sure that you
    are working with a file instead of a pipe, so you can't just
    seek() back to the beginning. And you can't use IO::Handle::ungetc
    because you can't be sure that will handle more than 1 byte
    of pushback.
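
The point about seek() can be checked with a small sketch. It uses sysseek, which works on the underlying file descriptor, and compares a pipe with a temporary file. The helper name can_rewind is hypothetical, and a Unix-like system is assumed.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);
use File::Temp qw(tempfile);

# Hypothetical helper: true if the descriptor behind $fh can be moved
# back to offset 0. sysseek returns "0 but true" on success at offset 0
# and undef on failure.
sub can_rewind {
    my ($fh) = @_;
    return sysseek($fh, 0, SEEK_SET) ? 1 : 0;
}

# A pipe is not seekable, so rewinding it fails.
pipe(my $reader, my $writer) or die "pipe failed: $!";
my $pipe_ok = can_rewind($reader);

# A regular file is seekable, so rewinding it succeeds.
my ($file) = tempfile(UNLINK => 1);
my $file_ok = can_rewind($file);

print "pipe seekable: ", ($pipe_ok ? "yes" : "no"), "\n";
print "file seekable: ", ($file_ok ? "yes" : "no"), "\n";
```

Even when STDIN is a regular file, Perl's buffered read may have pulled more than four bytes from the descriptor, so the descriptor offset would have to be reset before exec for the child to see the whole file.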

    If you don't want to make a temporary file and have bunzip2 read
    that, then you are going to need a pipe that you can stuff the
    four bytes into and then copy the rest of input to. And you
    want the output of that pipe to be sent to STDOUT. That
    implies you are going to have to use something like IPC::Open2
    (q.v.). There are probably some complications if you are using
    Windows...
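
A minimal sketch of the pipe approach described above, assuming a Unix-like system. Because the child process inherits the script's STDOUT, a plain write-only pipe open is used here instead of IPC::Open2. The name feed_with_header is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: start @cmd with its standard input connected to a
# pipe, write the already-consumed $header bytes first, then copy the
# rest of $in to it. The child writes to our STDOUT. Returns the child's
# exit status.
sub feed_with_header {
    my ($header, $in, @cmd) = @_;

    open(my $out, '|-', @cmd) or die "cannot start $cmd[0]: $!";
    binmode($_) for $in, $out;

    # Restore the bytes that were read while sniffing the format.
    print {$out} $header or die "write to $cmd[0] failed: $!";

    # Copy the remaining input in 4096-byte chunks.
    my $buf;
    while (1) {
        my $n = read($in, $buf, 4096);
        die "read failed: $!" unless defined $n;
        last if $n == 0;
        print {$out} $buf or die "write to $cmd[0] failed: $!";
    }

    close($out);        # waits for the child and sets $?
    return $? >> 8;
}
```

In the script from the question, the exec of bunzip2 could then be replaced by something like: exit feed_with_header($header_bytes, \*STDIN, 'bunzip2');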
    --
    I was very young in those days, but I was also rather dim.
    -- Christopher Priest
    Walter Roberson, Feb 2, 2004
    #2

  3. On Mon, 02 Feb 2004 01:03:38 +0000, Walter Roberson wrote:

    > You read 4 bytes from STDIN and then you exec off bunzip2 without
    > having restored those 4 bytes.


    The -f option of bunzip2, according to the manpage, is supposed to
    permit operation without the magic number.
    Thomas Covello, Feb 2, 2004
    #3
  4. In article <>,
    Thomas Covello <> wrote:
    :On Mon, 02 Feb 2004 01:03:38 +0000, Walter Roberson wrote:

    :> You read 4 bytes from STDIN and then you exec off bunzip2 without
    :> having restored those 4 bytes.

    :The -f option of bunzip2, according to the manpage, is supposed to
    :permit operation without the magic number.

    On my system, bunzip2 -? says

    -f --force overwrite existing output files
    --
    Preposterous!! Where would all the calculators go?!
    Walter Roberson, Feb 2, 2004
    #4
  5. -cnrc.gc.ca (Walter Roberson) wrote:
    > In article <>,
    > Thomas Covello <> wrote:
    > :On Mon, 02 Feb 2004 01:03:38 +0000, Walter Roberson wrote:
    >
    > :The -f option of bunzip2, according to the manpage, is supposed to
    > :permit operation without the magic number.
    >
    > On my system, bunzip2 -? says
    >
    > -f --force overwrite existing output files


    but man bzip2 says

    | -f --force
    |
    | Force overwrite of output files.
    <snip>
    | bzip2 normally declines to decompress files which don't have the
    | correct magic header bytes. If forced (-f), however, it will
    | pass such files through unmodified.
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    OP: it won't attempt to bunzip them without the magic number. You
    probably want to open a pipe to bunzip2 and print the magic number to
    it, and then use IO::SendFile to copy the rest. If you can't use
    IO::SendFile you'll have to do it by hand, something like

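    # $header_bytes holds the four bytes already read from STDIN
    open BZIP2, "| bunzip2" or die "can't start bunzip2: $!";
    binmode BZIP2;
    print BZIP2 $header_bytes;
    $\ = undef;
    $/ = \4096;
    print BZIP2 $_ while <STDIN>;
    # close is false with $! == 0 if bunzip2 merely exited non-zero
    close BZIP2 or $! == 0 or die "close of bunzip2 failed: $!";
    exit ($? >> 8);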

    Ben

    --
    Heracles: Vulture! Here's a titbit for you / A few dried molecules of the gall
    From the liver of a friend of yours. / Excuse the arrow but I have no spoon.
    (Ted Hughes, [ Heracles shoots Vulture with arrow. Vulture bursts into ]
    /Alcestis/) [ flame, and falls out of sight. ]
    Ben Morrow, Feb 2, 2004
    #5
