Zlib::GzipReader doesn't work as expected

Discussion in 'Ruby' started by Thomas Wolf, Apr 25, 2012.

  1. Thomas Wolf

    Thomas Wolf Guest

    Hi,
    given 2 files:
    cat 5lines.txt
    5 lines
    5 lines
    5 lines
    5 lines
    5 lines

    cat more5lines.txt
    More 5 lines
    More 5 lines
    More 5 lines
    More 5 lines
    More 5 lines

    These files are "gzip"ed as follows:
    gzip < 5lines.txt > foo.gz
    gzip < more5lines.txt >> foo.gz

    zcat foo.gz:
    5 lines
    5 lines
    5 lines
    5 lines
    5 lines
    More 5 lines
    More 5 lines
    More 5 lines
    More 5 lines
    More 5 lines

    This ruby code only reads the first 5 lines:
    #!/usr/bin/ruby
    require "zlib"
    filename = ARGV[0]

    Zlib::GzipReader.open(filename) {|gz|
      print gz.read
    }

    ./test.rb foo.gz
    5 lines
    5 lines
    5 lines
    5 lines
    5 lines

    How do I force Zlib::GzipReader to read the whole file?

    ruby versions: 1.8.7 and 1.9.0

    Thanks and regards,
    Thomas Wolf
     
    Thomas Wolf, Apr 25, 2012
    #1

  2. On 04/25/2012 10:57 AM, Thomas Wolf wrote:
    > How do I force Zlib::GzipReader to read the whole file?


    That's a fairly common limitation of GZip libs (Java's standard lib also
    has this limitation, or at least it did the last time I checked).

    You might get away with wrapping the GzipReader around an open IO object
    and wrapping another GzipReader when the first finishes.

    Kind regards

    robert
     
    Robert Klemme, Apr 25, 2012
    #2

  3. * Thomas Wolf <> (10:57) wrote:

    > These files are "gzip"ed as follows:
    > gzip < 5lines.txt > foo.gz
    > gzip < more5lines.txt >> foo.gz


    So you have two streams of gzipped data in foo.gz.

    And the ruby library reads only the first one.
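
    The situation can be reproduced without the shell. This is only a minimal
    sketch, assuming the two inputs from the original post; the helper name
    gzip_member is made up for illustration and is not part of Zlib:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)   # binary in-memory buffer
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.close                            # flushes the data and writes the gzip footer
  buffer.string
end

# Same layout as `gzip < a > foo.gz; gzip < b >> foo.gz`:
# two gzip members stored back to back.
data = gzip_member("5 lines\n" * 5) + gzip_member("More 5 lines\n" * 5)

# A single GzipReader stops at the end of the first member.
first_only = Zlib::GzipReader.new(StringIO.new(data)).read
print first_only
```

    Only the five "5 lines" rows come back, which matches what the original
    script prints for foo.gz.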

    > How do I force Zlib::GzipReader to read the whole file?


    I don't know, read the source.

    Regards, simon .... l
     
    Simon Krahnke, Apr 25, 2012
    #3
  4. * Robert Klemme <> (21:03) wrote:

    > You might get away with wrapping the GzipReader around an open IO object
    > and wrapping another GzipReader when the first finishes.


    Like this:

    ,----[ gz.rb ]
    | #!/usr/bin/env ruby
    |
    | require 'zlib'
    | require 'pp'
    |
    | filename = ARGV[0]
    |
    | File.open filename do | f |
    |   gz1 = Zlib::GzipReader.new(f)
    |   pp gz1.read
    |   pp Zlib::GzipReader.new(f).read
    | end
    `----

    Doesn't work.
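
    As far as I can tell, the reason is read-ahead: the first GzipReader pulls
    more bytes from the underlying IO than the first member needs, so the
    second reader starts past the beginning of the second member. The sketch
    below assumes small in-memory inputs; gzip_member is a hypothetical helper,
    not a Zlib method:

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper: compress +text+ into one complete gzip member.
def gzip_member(text)
  buffer = StringIO.new(String.new)
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(text)
  gz.close
  buffer.string
end

io = StringIO.new(gzip_member("first\n") + gzip_member("second\n"))

gz1 = Zlib::GzipReader.new(io)
first = gz1.read
leftover = gz1.unused   # bytes taken from io but not decoded by gz1

puts "first member:   #{first.inspect}"
puts "io at EOF?      #{io.eof?}"
puts "leftover bytes: #{leftover ? leftover.bytesize : 0}"
```

    With inputs this small the whole buffer is consumed in one go, so the
    second member ends up in gz1.unused instead of being left in io.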

    Regards, simon .... l
     
    Simon Krahnke, Apr 25, 2012
    #4
  5. Thomas Wolf

    Thomas Wolf Guest

    On 25.04.2012 21:03, Robert Klemme wrote:
    >> How do I force Zlib::GzipReader to read the whole file?

    >
    > That's a fairly common limitation of GZip libs (Java's standard lib also
    > has this limitation, or at least hat last time I checked).
    >
    > You might get away with wrapping the GzipReader around an open IO object
    > and wrapping another GzipReader when the first finishes.


    Thank you.

    I found the following thread:
    http://www.velocityreviews.com/foru...iple-compressed-blobs-in-a-single-stream.html

    The code from that thread works with Ruby 1.9.3p0:

    require 'stringio'
    require 'zlib'

    def inflate(filename)
      File.open(filename) do |file|
        zio = file
        loop do
          io = Zlib::GzipReader.new zio
          puts io.read
          unused = io.unused
          io.finish
          break if unused.nil?
          zio.pos -= unused.length
        end
      end
    end

    inflate "foo.gz"

    Regards,
    Thomas
     
    Thomas Wolf, Apr 26, 2012
    #5
  6. * Thomas Wolf <> (11:54) wrote:

    > require 'stringio'


    This is unneeded.

    >require 'zlib'
    >
    >def inflate(filename)
    > File.open(filename) do |file|
    > zio = file


    You could just use | zio | instead of |file| and get rid of the
    assignment.

    > loop do
    > io = Zlib::GzipReader.new zio
    > puts io.read


    puts appends a "\n" whenever the data does not already end with one, so
    the output can differ from the input; use print instead to pass the data
    through unchanged.

    > unused = io.unused
    > io.finish
    > break if unused.nil?
    > zio.pos -= unused.length
    > end
    > end
    >end
    >
    >inflate "foo.gz"


    Note that, as said in the linked thread, this works only for files and
    other seekable sources.

    So "(seq 1 5 | gzip; seq 6 10 | gzip) | yourscript.rb" won't work.
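
    Putting the remarks above together, here is one possible sketch. The names
    gunzip_all_members and gunzip_all_from_pipe are invented for this example.
    The pipe variant simply buffers the whole input in a StringIO to make it
    seekable, which assumes the data fits in memory:

```ruby
require 'zlib'
require 'stringio'

# Decompress every gzip member in a seekable IO (File, StringIO, ...)
# and return the concatenated result.
def gunzip_all_members(io)
  parts = []
  loop do
    gz = Zlib::GzipReader.new(io)
    parts << gz.read
    unused = gz.unused    # bytes read ahead past the end of this member, or nil
    gz.finish             # release the reader but keep io open
    io.pos -= unused.bytesize if unused   # rewind to the start of the next member
    break if io.eof?
  end
  parts.join
end

# Hypothetical wrapper for non-seekable sources such as a pipe:
# read everything first, then treat the buffer as a seekable IO.
def gunzip_all_from_pipe(pipe)
  pipe.binmode
  gunzip_all_members(StringIO.new(pipe.read))
end
```

    A script would then call something like
    File.open("foo.gz", "rb") { |f| print gunzip_all_members(f) } for files, or
    print gunzip_all_from_pipe($stdin) for the piped case. Trailing garbage
    after the last member is not handled here and would raise a
    Zlib::GzipFile::Error.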

    Regards, simon .... hth
     
    Simon Krahnke, Apr 26, 2012
    #6
