filehandle to a member of a zip archive

Discussion in 'Perl Misc' started by scottmf, Jun 3, 2006.

  1. scottmf

    scottmf Guest

    I am trying to write a modified file open function that can take an
    ascii text file, a .gz file or a .zip file and create a filehandle for
    reading or writing using the appropriate module. I have a routine
    working for ascii text or .gz files but I cannot figure out how to get
    it working for zip files. The goal is to replace the standard open
    command in several of my scripts with my modified open, but to not have
    to change any other code and be able to read from or write to any file
    type simply based on the filename I use.

    Here is what I have:

    my $test_fh_in;
    my $test_fh_out;
    mod_open($test_fh_in, "test_in.zip");
    mod_open($test_fh_out, ">test_out.zip");
    while(<$test_fh_in>){
    print "$_";
    print $test_fh_out "$_";
    }

    sub mod_open{
        ## Libraries needed for zipped file reading/writing
        use File::Basename;
        use IO::File;
        use IO::Zlib;
        use Archive::Zip;
        my $FH;
        my $file_name = $_[1];
        my $read; ## boolean: 1 = open for reading, 0 = open for writing
        ## Need to look into appending as well.....
        if($file_name =~ /^\>/){ ## a leading ">" means the file is opened for writing
            $read = 0;
            $file_name =~ s/^\>//; ## drop the read/write identifier so the string is a valid filename
        }
        else{
            $read = 1;
        }
        ## define a list of zip suffixes
        my @suffixlist = (
            ".gz",
            ".gzip",
            ".zip"
        );
        ## check for a filename extension - if .gz use Zlib, else do not...
        my ($file_base, $file_path, $file_type) = fileparse($file_name,@suffixlist);

        if($read){ ## if the file is supposed to be opened for reading then do so
            if ( $file_type eq "" ) {
                print "The input file $file_name is standard ASCII - uncompressed\n" ;
                $FH = IO::File->new($file_name, "r") ;
            }
            elsif($file_type =~ /gz(ip)?/) {
                print "The input file $file_name is compressed using gzip\n" ;
                $FH = IO::Zlib->new($file_name, "rb") ;
            }
            elsif($file_type =~ /zip/){
                print "The input file $file_name is compressed using winzip\n" ;
                my $zp = Archive::Zip->new($file_name); ## open the zip file
                my $numMembers = $zp->numberOfMembers(); ## find out what files are in the zip file
                if($numMembers>1){
                    die "This routine only supports zip archives with one file\n";
                }
                ## need to get a filehandle for the compressed file
            }
        }
        elsif(!$read){ ## if the file is supposed to be opened for writing then do so
            if ( $file_type eq "" ) {
                print "The output file $file_name is standard ASCII - uncompressed\n" ;
                $FH = IO::File->new($file_name, "w") ;
            }
            elsif($file_type =~ /gz(ip)?/) {
                print "The output file $file_name is compressed\n" ;
                $FH = IO::Zlib->new($file_name, "wb") ;
            }
            elsif($file_type =~ /zip/){
                print "The output file $file_name is compressed using winzip\n" ;
                my $zp = Archive::Zip->new($file_name); ## create the zip file
                ## need to create a filehandle to a compressed file
            }
        }
        ## make sure the file got opened or created
        if (!defined $FH) {
            die "Cannot open file: $file_name $!\n";
        }
        ## now pass the file handle back to the user for reading or writing
        $_[0] = $FH;
    }
     
    scottmf, Jun 3, 2006
    #1

  2. "scottmf" <> wrote in message
    news:...
    >I am trying to write a modified file open function that can take an
    > ascii text file, a .gz file or a .zip file and create a filehandle for
    > reading or writing using the appropriate module. I have a routine
    > working for ascii text or .gz files but I cannot figure out how to get
    > it working for zip files. The goal is to replace the standard open
    > command in several of my scripts with my modified open, but to not have
    > to change any other code and be able to read from or write to any file
    > type simply based on the filename I use.


    For the reading interface in particular, you might want to have a look at
    IO::Uncompress::AnyUncompress - this will auto-detect a number of
    compression formats, including gzip and zip. It also has a feature to work
    in a passthrough mode if the data isn't compressed. Assuming you have the
    zlib module (IO::Compress::Zlib) installed, this is all you need to open for
    reading all of the file formats you are interested in:

    $FH = IO::Uncompress::AnyUncompress->new($file_name, Transparent => 1);

    If you also have IO::Compress::Bzip2 and/or IO::Compress::Lzop installed,
    you can add bzip2 and lzop compressed files to the list of formats that
    AnyUncompress can handle.
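
A minimal sketch of the transparent-read idea described above, assuming IO::Compress::Gzip is available to build the test input. The helper name `read_lines` and the sample file names are made up for illustration; only the `new(..., Transparent => 1)` call comes from the post.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Hypothetical helper: return every line of $file_name, compressed or not.
sub read_lines {
    my ($file_name) = @_;
    my $fh = IO::Uncompress::AnyUncompress->new($file_name, Transparent => 1)
        or die "Cannot open $file_name: $AnyUncompressError\n";
    my @lines;
    while (my $line = <$fh>) {
        push @lines, $line;
    }
    $fh->close();
    return @lines;
}

# Build one plain file and one gzip file holding the same text.
my $dir   = tempdir(CLEANUP => 1);
my $text  = "first line\nsecond line\n";
my $plain = "$dir/sample.txt";
my $gz    = "$dir/sample.txt.gz";

open(my $out, '>', $plain) or die "Cannot write $plain: $!\n";
print {$out} $text;
close($out);

gzip(\$text => $gz) or die "gzip failed: $GzipError\n";

# The same call reads both files; the caller never checks the extension.
my @from_plain = read_lines($plain);
my @from_gz    = read_lines($gz);
print "plain: ", scalar(@from_plain), " lines, gzip: ", scalar(@from_gz), " lines\n";
```

The point of the sketch is that the format check moves out of the caller's code and into the module, which detects the format from the file contents rather than the file name.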

    If you don't want to go down the auto-detection path, this will create a
    filehandle that will read the first element from a zip file.

    $FH = IO::Uncompress::Unzip->new($file_name);

    For writing to zip files, I can't comment on Archive::Zip because I don't
    know it that well, but I can comment on IO::Compress::Zip, because I wrote
    it. This will create a filehandle to allow writing to a zip file

    $FH = IO::Compress::Zip->new($file_name, Name => "whatever");
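
As a rough round-trip sketch of the two calls above: write one member with IO::Compress::Zip, then read it back with IO::Uncompress::Unzip. The archive path and the member name "data.txt" are placeholders chosen for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::Unzip qw($UnzipError);

my $dir      = tempdir(CLEANUP => 1);
my $zip_file = "$dir/test_out.zip";

# Name sets the member name stored inside the archive ("data.txt" is made up).
my $out = IO::Compress::Zip->new($zip_file, Name => "data.txt")
    or die "Cannot create $zip_file: $ZipError\n";
print {$out} "alpha\n", "beta\n";
$out->close();    # flushes the data and writes the zip central directory

# Unzip opens the first member of the archive by default.
my $in = IO::Uncompress::Unzip->new($zip_file)
    or die "Cannot open $zip_file: $UnzipError\n";
my @lines;
while (my $line = <$in>) {
    push @lines, $line;
}
$in->close();

print @lines;
```

Both objects behave like ordinary filehandles for `print` and `<...>`, which is what the original mod_open needs for its zip branches.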

    Paul
     
    Paul Marquess, Jun 5, 2006
    #2

  3. scottmf

    scottmf Guest

    Thanks, I'll look into that when I get a chance. Also, is there any
    way I can make my subroutine take in a bareword operator for the
    filehandle. I would like to be able to easily implement this in old
    code I have by just replacing my open(FH, "filename") with mod_open(FH,
    "filename") instead of having to use $FH and changing all the
    references throughout the script from FH to $FH

    Thanks for the help,
    ~Scott
     
    scottmf, Jun 7, 2006
    #3
  4. scottmf <> wrote:

    > Thanks, I'll look into that when I get a chance.



    Look into what?

    Please quote some context in followups like everybody else does.


    > Also, is there any
    > way I can make my subroutine take in a bareword operator for the
    > filehandle.



    No.

    There is no such thing as a "bareword operator", so I guess
    you meant "bareword filehandle" instead.


    > I would like to be able to easily implement this in old
    > code I have by just replacing my open(FH, "filename") with mod_open(FH,
    > "filename") instead of having to use $FH


    s/open\(/mod_open(/;

    is a lot easier than

    s/open\(/mod_open(\$/;

    ??


    > and changing all the
    > references throughout the script from FH to $FH



    That could be cumbersome.

    (Seems like:
    s/<FH>/<\$FH>/g;
    s/(printf? )FH/$1\$FH/g;
    would come pretty darn close though.)

    You can pass a type glob, or a reference to a type glob as
    described in the answer to your Frequently Asked Question:

    perldoc -q filehandle

    How can I make a filehandle local to a subroutine? How do I pass file-
    handles between subroutines? How do I make an array of filehandles?
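
For reference, a small sketch of the two passing styles that FAQ entry covers: handing a sub either the typeglob itself or a reference to it. The helper `first_line` and the temporary file are invented for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: read one line from whatever handle it is given.
sub first_line {
    my ($fh) = @_;    # holds either a glob (*FH) or a glob reference (\*FH)
    my $line = <$fh>;
    return $line;
}

# Create a two-line file to read from.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\n";
close($tmp);

open(FH, '<', $path) or die "Cannot open $path: $!\n";
my $via_ref  = first_line(\*FH);    # pass a reference to the typeglob
my $via_glob = first_line(*FH);     # pass the typeglob itself
close(FH);

print $via_ref, $via_glob;
```

Either form works under `use strict`, because the caller writes `*FH` or `\*FH` rather than a naked bareword.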


    --
    Tad McClellan SGML consulting
    Perl programming
    Fort Worth, Texas
     
    Tad McClellan, Jun 7, 2006
    #4
  5. DJ Stunks

    DJ Stunks Guest

    scottmf wrote:
    > Thanks, I'll look into that when I get a chance. Also, is there any
    > way I can make my subroutine take in a bareword operator for the
    > filehandle. I would like to be able to easily implement this in old
    > code I have by just replacing my open(FH, "filename") with mod_open(FH,
    > "filename") instead of having to use $FH and changing all the
    > references throughout the script from FH to $FH
    >
    > Thanks for the help,
    > ~Scott


    perl -pi~ -e " s'FH'$FH' " source.pl

    -jp
     
    DJ Stunks, Jun 7, 2006
    #5
  6. "DJ Stunks" <> wrote in message
    news:...
    >
    > scottmf wrote:
    >> Thanks, I'll look into that when I get a chance. Also, is there any
    >> way I can make my subroutine take in a bareword operator for the
    >> filehandle. I would like to be able to easily implement this in old
    >> code I have by just replacing my open(FH, "filename") with mod_open(FH,
    >> "filename") instead of having to use $FH and changing all the
    >> references throughout the script from FH to $FH
    >>
    >> Thanks for the help,
    >> ~Scott

    >
    > perl -pi~ -e " s'FH'$FH' " source.pl


    You could try something like this

    mod_open(FH, "filename");

    sub mod_open{
    ...
    $FH = IO::File->new($file_name, "r") ;
    ...
    ## now pass the file handle back to the user for reading or writing
    my $href = \*{ $_[0] };
    $$href = $FH;
    }
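
A self-contained sketch of the glob-assignment idea above, written so that it runs under `use strict`: here the caller passes `\*FH` instead of a bareword, which is an assumption of this example and not part of the snippet in the post. The sub name `alias_open` is made up.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::File;

# Hypothetical variant of mod_open: the caller passes \*FH, and the sub
# aliases that glob to the handle object it created.
sub alias_open {
    my ($glob_ref, $file_name) = @_;
    my $fh = IO::File->new($file_name, "r") or return;
    *{$glob_ref} = $fh;    # glob assignment: FH now names the same handle
    return 1;
}

# Create a one-line file to read from.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close($tmp);

alias_open(\*FH, $path) or die "Cannot open $path: $!\n";
my $line = <FH>;    # the old bareword code keeps working unchanged
close(FH);

print $line;
```

Because an IO::File object is itself a reference to a glob, assigning it to `*FH` makes the bareword refer to the object's handle.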
     
    Paul Marquess, Jun 7, 2006
    #6
  7. scottmf

    scottmf Guest

    > s/open\(/mod_open(/;
    >
    > is a lot easier than
    >
    > s/open\(/mod_open(\$/;
    >
    > (Seems like:
    > s/<FH>/<\$FH>/g;
    > s/(printf? )FH/$1\$FH/g;
    > would come pretty darn close though.)


    While I realize I could go through all of the scripts I need to update
    and change every filehandle reference, it seems like there *should* be
    a way to have my modified open function simply take a bareword
    filehandle the same as open() does (and still "use strict;"). After
    reading perldoc -q filehandle I have had some success passing the
    typeglob by reference, but I cannot get this to work with IO::File or
    IO::Zlib; see the code below for an example. I can do exactly what I
    want with the open() function, but I cannot get it to work with the
    IO::File->new or IO::Zlib->new functions.

    Thanks,
    ~Scott

    #!/usr/local/bin/perl
    #
    use strict;
    use warnings;
    use IO::File;
    use IO::Zlib;

    new_open(\*FH_IN, "test.txt") || die "Can't open the file\n"; ## Pass type glob by reference

    while(<FH_IN>){ ## read the file without any modification to the filehandle
        print "$_";
    }
    close(FH_IN);

    sub new_open{
        my $FH = $_[0];
        my $file_name = $_[1];
        #$FH = IO::File->new($file_name, "r"); ## This also does not work
        #$FH = IO::Zlib->new($file_name, "rb") ; ## this does not work if I have a .gz file
        open($FH, "$file_name") || die "Can't open $!\n"; ## This method works
    }
     
    scottmf, Jun 8, 2006
    #7
  8. scottmf

    scottmf Guest

    Okay, I think I finally got things working the way I want (See code
    below). Thanks for all the help. If anyone has any ideas on how to
    improve this function it would still be greatly appreciated.

    Thanks,
    ~Scott


    #!/usr/local/bin/perl
    #
    use strict;
    use warnings;
    use IO::Zlib; ## tie does not load the class, so it must be loaded here

    new_open(\*FH_IN, "test.txt.gz") || die "Can't open the file\n"; ## Pass typeglob by reference

    while(<FH_IN>){ ## No change to bareword filehandle throughout code
        print "$_";
    }
    close(FH_IN);

    sub new_open{
        my $FH = $_[0];
        my $file_name = $_[1];
        if($file_name =~ /\.gz$/){ ## check to see if it is a gzip file or not
            return tie *$FH, 'IO::Zlib', "$file_name", "rb";
        }
        else{
            ## IO::File has no TIEHANDLE method, so open plain files directly
            return open($FH, "<", $file_name);
        }
    }
     
    scottmf, Jun 8, 2006
    #8
  9. Ben Morrow

    Ben Morrow Guest

    Quoth "scottmf" <>:
    > > s/open\(/mod_open(/;
    > >
    > > is a lot easier than
    > >
    > > s/open\(/mod_open(\$/;
    > >
    > > (Seems like:
    > > s/<FH>/<\$FH>/g;
    > > s/(printf? )FH/$1\$FH/g;
    > > would come pretty darn close though.)

    >
    > While I realize I could go through all of the scripts I need to update
    > and change every filehandle reference, it seems like there *should* be
    > a way to have my modified open function simply take a bareword
    > filehandle the same as open() does (and still "use strict;")


    Use prototypes. This is what they are for.

    ~% perl -le'print prototype "CORE::open"'
    *;$@

    The fact the second arg is optional is probably not something you want
    to emulate, so

    use strict;
    use Symbol;

    sub my_open (*$;@) {

    my $FH = defined $_[0] ?
    Symbol::qualify_to_ref $_[0], caller :
    ($_[0] = Symbol::gensym);
    shift;

    my $op = shift;

    return open $FH, $op, @_;
    }

    will emulate open.
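
A reduced, read-only sketch of the same prototype technique, to show a bareword handle being accepted under `use strict`. The sub name `open_for_read` and the temporary file are invented here; the pattern of pairing a `*` prototype with `Symbol::qualify_to_ref` follows the one documented in perlsub.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Symbol qw(qualify_to_ref);

# Hypothetical read-only wrapper. The (*$) prototype must be visible to the
# compiler before the first call, so the sub is defined above its use.
sub open_for_read (*$) {
    my ($name, $file_name) = @_;
    # A bareword arrives as a plain string; resolve it in the caller's package.
    my $fh = qualify_to_ref($name, scalar caller);
    return open($fh, '<', $file_name);
}

# Create a one-line file to read from.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "bareword works\n";
close($tmp);

open_for_read(FH, $path) or die "Cannot open $path: $!\n";
my $line = <FH>;
close(FH);

print $line;
```

The call site looks exactly like `open(FH, ...)`, which is the drop-in replacement the thread was asking for.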

    However, a much better idea if you're writing your own function is to
    write it to return an open FH, rather than opening onto one passed in.
    It makes the implementation much simpler, and, as you can see, makes
    replacing the function with another later much easier.
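
One way the "return an open filehandle" advice could look for the original problem, sketched under assumptions: the name `open_any`, the leading ">" convention borrowed from the first post, and the choice of the IO::Compress modules for writing are all illustrative, not something the posters specified.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Temp qw(tempdir);
use IO::File;
use IO::Compress::Gzip qw($GzipError);
use IO::Compress::Zip qw($ZipError);
use IO::Uncompress::AnyUncompress qw($AnyUncompressError);

# Hypothetical replacement for mod_open: a leading ">" means write, and the
# handle is returned instead of being stored in an argument.
sub open_any {
    my ($file_name) = @_;
    my $writing = ($file_name =~ s/^>//);    # strip the mode marker if present
    my $fh;

    if (!$writing) {
        # Reading: let the module detect gzip, zip, or plain text.
        $fh = IO::Uncompress::AnyUncompress->new($file_name, Transparent => 1)
            or die "Cannot read $file_name: $AnyUncompressError\n";
    }
    elsif ($file_name =~ /\.(?:gz|gzip)$/) {
        $fh = IO::Compress::Gzip->new($file_name)
            or die "Cannot write $file_name: $GzipError\n";
    }
    elsif ($file_name =~ /\.zip$/) {
        # Member name inside the archive: the file name without ".zip".
        my $member = basename($file_name, ".zip");
        $fh = IO::Compress::Zip->new($file_name, Name => $member)
            or die "Cannot write $file_name: $ZipError\n";
    }
    else {
        $fh = IO::File->new($file_name, "w")
            or die "Cannot write $file_name: $!\n";
    }
    return $fh;
}

# Write and read back one line in each supported format.
my $dir = tempdir(CLEANUP => 1);
my %read_back;
for my $name ("plain.txt", "packed.gz", "packed.zip") {
    my $out = open_any(">$dir/$name");
    print {$out} "payload for $name\n";
    $out->close();

    my $in = open_any("$dir/$name");
    $read_back{$name} = <$in>;
    $in->close();
}
print map { "$_: $read_back{$_}" } sort keys %read_back;
```

Callers then write `my $fh = open_any($name)` and use `$fh` everywhere, so no prototype or typeglob handling is needed.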

    Ben

    --
    I touch the fire and it freezes me, []
    I look into it and it's black.
    Why can't I feel? My skin should crack and peel---
    I want the fire back... Buffy, 'Once More With Feeling'
     
    Ben Morrow, Jun 9, 2006
    #9
