Ruby Newbie Problems with deflate, base64...

Discussion in 'Ruby' started by Pat Patterson, Mar 13, 2007.

  1. I'm implementing a spec that calls for messages to be deflated, then
    base64 encoded, then URL encoded, so they can be passed as parameters to
    an HTTP GET. In PHP, this is achieved with:

    $encodedMsg = urlencode( base64_encode( gzdeflate( $msg ) ) );

    Looking at the docs, the equivalent in Ruby would be something like:

    require "cgi"
    require "base64"
    require "zlib"

    encodedMsg = CGI::escape( Base64.encode64( Zlib::Deflate.deflate( msg ) ) )

    But this gives me completely different output to my (working) PHP model.

    Some googling later, it appears that Zlib::Deflate.deflate prepends and
    appends 'stuff' to the deflated data. Investigating, the PHP

    $deflated = gzdeflate( "Hello world" );
    echo bin2hex( $deflated );

    Gives me

    f348cdc9c95728cf2fca490100

    While the Ruby

    deflated = Zlib::Deflate.deflate( "Hello world" )
    myhex = ""
    1.upto(deflated.length) { |i| myhex << "%02x" % deflated[i] }
    puts myhex

    Shows

    9cf348cdc9c95728cf2fca49010018ab043d00

    (BTW - if anyone knows a more succinct way to hex encode a string in
    Ruby, that would be useful)

    Notice the additional '9c' at the start of the string and '18ab043d00'
    at the end. OK - there seems to be a consistent amount of data prepended
    and appended, so I can slice those off.

    But now Base 64 is acting weird. In PHP:

    $base64Encoded = base64_encode( $deflated );
    echo $base64Encoded;

    Shows

    80jNyclXKM8vykkBAA==

    While, in Ruby

    # Remove extra stuff from deflated string
    deflated = deflated[1,deflated.length-6]
    base64encoded = Base64.encode64( deflated )
    puts base64encoded

    Shows

    nPNIzcnJVyjPL8pJAQ==

    Completely different! Now, if I feed that back through PHP's
    base64_decode and hex encode the result, I get

    9cf348cdc9c95728cf2fca4901

    Notice, again '9c' prepended, but this time the trailing '00' has been
    removed.

    What is going on here? Any ideas??? Am I missing something really
    obvious about string handling in Ruby?

    (BTW - this is on ruby 1.8.4 (2005-12-24) [i486-linux])

    Cheers,

    Pat

    --
    Pat Patterson -
    Federation Architect,
    Sun Microsystems, Inc.
    http://blogs.sun.com/superpat
     
    Pat Patterson, Mar 13, 2007
    #1

  2. On Tue, Mar 13, 2007 at 01:21:52PM +0900, Pat Patterson wrote:
    > While the Ruby
    >
    > deflated = Zlib::Deflate.deflate( "Hello world" )
    > myhex = ""
    > 1.upto(deflated.length) { |i| myhex << "%02x" % deflated[i] }
    > puts myhex
    >
    > Shows
    >
    > 9cf348cdc9c95728cf2fca49010018ab043d00
    >
    > (BTW - if anyone knows a more succinct way to hex encode a string in
    > Ruby, that would be useful)


    str = "\001\377"
    puts str.unpack("H*")
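Applied to the deflated string from the question, the same unpack call might look like the sketch below. The helper name `to_hex` is made up for illustration and is not part of the thread or of Ruby itself.

```ruby
require "zlib"

# Hypothetical helper (name chosen for this example): hex-encode a binary string.
def to_hex(bytes)
  # "H*" unpacks the whole string as hex digits, high nibble first.
  # unpack returns an array, so take its single element.
  bytes.unpack("H*").first
end

deflated = Zlib::Deflate.deflate("Hello world")
puts to_hex(deflated)
```

Because unpack walks every byte, there is no index arithmetic to get wrong, which avoids the kind of off-by-one that a manual upto loop invites.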
     
    Brian Candler, Mar 13, 2007
    #2

  3. OK - so now I know what is happening...

    Zlib::Deflate.deflate implements ZLIB compression according to RFC 1950.
    ZLIB defines a 2 byte header containing a variety of flags and a 4 byte
    trailer containing an Adler-32 checksum. Just out of interest, the
    header that I'm seeing translates as 'compression method = deflate,
    window size = 32K, no preset dictionary, default compression level',
    which makes perfect sense. Deflate compression itself is defined by RFC
    1951.

    So - if you want 'raw' deflated data (which is called for in many
    situations), cutting off the leading 2 and trailing 4 bytes is exactly
    what you need to do.
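The layout described above can be checked with a short sketch. It assumes a Ruby version where String ranges and `Zlib.adler32` behave as in current releases; the variable names are invented for this example.

```ruby
require "zlib"

msg = "Hello world"
zlib_stream = Zlib::Deflate.deflate(msg)

# RFC 1950 layout: 2-byte header (CMF, FLG), deflate data, 4-byte Adler-32.
cmf, flg = zlib_stream.unpack("CC")
method      = cmf & 0x0f              # 8 means deflate
window_size = 1 << ((cmf >> 4) + 8)   # CINFO of 7 gives 32768
preset_dict = (flg & 0x20) != 0       # FDICT bit
trailer     = zlib_stream[-4, 4].unpack("N").first  # big-endian Adler-32

# Drop the 2 header bytes and the 4 trailer bytes to get raw deflate data.
raw = zlib_stream[2...-4]

# A negative window-bits value tells zlib to expect raw deflate input.
inflater = Zlib::Inflate.new(-Zlib::MAX_WBITS)
restored = inflater.inflate(raw)
inflater.close

puts "method=#{method} window=#{window_size} preset_dict=#{preset_dict}"
puts trailer == Zlib.adler32(msg)
puts restored == msg
```

Slicing works because the header and trailer have fixed sizes when no preset dictionary is used; a stream with the FDICT bit set carries an extra 4-byte dictionary ID after the header, so the 2-byte cut would not be enough there.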

    Cheers,

    Pat

    Pat Patterson wrote:
    > Thanks, Brian! That revealed an obvious bug in the code I was using to
    > examine the deflated data (should have been 0.upto(deflated.length-1)).
    >
    > So Base64 /is/ working correctly. Deflate prepends 2 bytes (seems to
    > be constant 0x789c for default deflate level) and appends 4 bytes
    > (rather than 1 and 5 as I thought) to the deflated data. When I cut
    > those off, I can get Ruby to work the same as PHP.
    >
    > Still - it would be nice if deflate worked the same as on Java, PHP, ...
    >
    > Cheers,
    >
    > Pat

     
    Pat Patterson, Mar 14, 2007
    #3
  4. You can prevent deflate from generating the header by passing
    -Zlib::MAX_WBITS as the window-bits argument to Deflate.new. The
    following method emulates gzdeflate from PHP:

    require "zlib"

    def gzdeflate(s)
      # Negative window bits select raw deflate: no zlib header or checksum.
      Zlib::Deflate.new(nil, -Zlib::MAX_WBITS).deflate(s, Zlib::FINISH)
    end

    puts gzdeflate("Hello World").unpack('H*').first
    # => f348cdc9c95708cf2fca490100
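Putting the pieces together for the original goal (deflate, then Base64, then URL encode), a minimal sketch could look like this. The names `encode_for_get` and `decode_from_get` are hypothetical, and the code assumes Ruby 1.9 or later for the "m0" pack directive. One thing to watch: Base64.encode64 inserts line feeds (including a trailing one), which PHP's base64_encode does not, so the sketch uses "m0" to get unbroken Base64.

```ruby
require "cgi"
require "zlib"

# Raw deflate (RFC 1951), equivalent in intent to the gzdeflate method above.
def gzdeflate(s)
  z = Zlib::Deflate.new(Zlib::DEFAULT_COMPRESSION, -Zlib::MAX_WBITS)
  data = z.deflate(s, Zlib::FINISH)
  z.close
  data
end

# Hypothetical name: deflate, Base64 encode, then URL encode a message.
def encode_for_get(msg)
  deflated = gzdeflate(msg)
  b64 = [deflated].pack("m0")  # Base64 without any line feeds
  CGI.escape(b64)              # escapes "+", "/" and "=" for use in a URL
end

# Hypothetical inverse, useful for checking the round trip.
def decode_from_get(param)
  b64 = CGI.unescape(param)
  deflated = b64.unpack("m0").first
  z = Zlib::Inflate.new(-Zlib::MAX_WBITS)
  msg = z.inflate(deflated)
  z.close
  msg
end

encoded = encode_for_get("Hello world")
puts encoded
puts decode_from_get(encoded)
```

With a standard zlib build, the Base64 step for "Hello world" should match the PHP value shown in the first post, but that depends on both sides using the same compression settings.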

     
    eden li, Mar 14, 2007
    #4
  5. Thanks, Eden - that works great and is much cleaner than cutting the
    header and checksum off of the deflated data.

    Cheers,

    Pat


     
    Pat Patterson, Mar 14, 2007
    #5