Gzip each chunk separately

Discussion in 'Java' started by Lior Knaany, Jan 2, 2006.

  1. Lior Knaany

    Lior Knaany Guest

    Hi all,

    I need some help understanding chunked and gzipped data in the
    HTTP/1.1 protocol, and the use of headers like "Content-Encoding"
    vs. "Transfer-Encoding". (I am doing this in order to develop a web
    server filter.)

    I noticed that when the server sends gzipped content in chunks, the
    response headers look like this:

    "Content-Encoding: gzip
    Transfer-Encoding: chunked"

    The browser waits for all the chunks, concatenates them together and
    runs gunzip on them to get the content.


    But why gzip the entire data before sending? Is there a way for the
    server to gzip a chunk and then send it (doing the same for all the
    chunks)?
    Meaning the gzip would not be applied to the entire content all
    together, but to each chunk.
    This way the browser could read one chunk, gunzip it, display the
    result and continue to the next chunk.

    If there is a way, what should the response headers look like?
    Maybe like this: "Transfer-Encoding: gzip,chunked" with no
    Content-Encoding header?

    I have searched "RFC 2616 - Hypertext Transfer Protocol -- HTTP/1.1"
    but could not find any meaningful information on this question.

    Please help,

    Thanks in advance,
    Lior.
     
    Lior Knaany, Jan 2, 2006
    #1

  2. In article <>,
    "Lior Knaany" <> wrote:

    > But why gzip the entire data before sending? Is there a way for the
    > server to gzip a chunk and then send it (doing the same for all the
    > chunks)?
    > Meaning the gzip would not be applied to the entire content all
    > together, but to each chunk.
    > This way the browser could read one chunk, gunzip it, display the
    > result and continue to the next chunk.


    Unless the chunks are really big, you're not going to get very good
    compression that way. Gzip uses an adaptive compression algorithm, so
    it gets better as the amount of data increases.

    But since gzip is also a stream compression algorithm, it can be done on
    the fly as each chunk is sent and received.
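    As a minimal sketch of the receiving side (assuming a Java client and
    a server that answers with "Content-Encoding: gzip" plus
    "Transfer-Encoding: chunked"; the URL is made up), HttpURLConnection
    removes the chunk framing and GZIPInputStream then decompresses the
    body incrementally, so nothing needs to be buffered in full:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;
    import java.util.zip.GZIPInputStream;

    public class StreamingGunzipClient {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://example.com/page"); // hypothetical URL
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestProperty("Accept-Encoding", "gzip");

            // The connection strips the chunked framing; GZIPInputStream
            // decompresses whatever bytes have arrived so far.
            try (BufferedReader in = new BufferedReader(new InputStreamReader(
                    new GZIPInputStream(conn.getInputStream()), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // usable as soon as it is decoded
                }
            }
        }
    }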

    --
    Barry Margolin,
    Arlington, MA
    *** PLEASE post questions in newsgroups, not directly to me ***
    *** PLEASE don't copy me on replies, I'll read them in the group ***
     
    Barry Margolin, Jan 3, 2006
    #2

  3. Lior Knaany

    Lior Knaany Guest

    Thanks Barry,

    I know that gzip will work poorly on smaller content, but can it be
    done (gzip on each chunk separately)?
    And if so, what should the headers look like?
     
    Lior Knaany, Jan 3, 2006
    #3
  4. Chris Smith

    Chris Smith Guest

    Lior Knaany <> wrote:
    > I know that gzip will work poorly on smaller content, but can it be
    > done (gzip on each chunk separately)?
    > And if so, what should the headers look like?


    No, it can't be done. (Or rather, if you do it then general-purpose
    browsers won't understand.)

    --
    www.designacourse.com
    The Easiest Way To Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
     
    Chris Smith, Jan 4, 2006
    #4
  5. Lior Knaany

    Lior Knaany Guest

    Thanks Chris,

    That is exactly what I am experiencing when producing such a page;
    I just thought maybe I was doing something wrong with the headers.

    Well thanks again for the info Chris.
     
    Lior Knaany, Jan 5, 2006
    #5
  6. In article <>, Chris Smith <> writes:
    > Lior Knaany <> wrote:
    > > I know that gzip will work poorly on smaller content, but can it be
    > > done (gzip on each chunk separately)?
    > > And if so, what should the headers look like?

    >
    > No, it can't be done. (Or rather, if you do it then general-purpose
    > browsers won't understand.)


    Though as Barry pointed out, you can achieve essentially the same
    effect; neither the sender nor the receiver need buffer all the data
    and compress or decompress it at once, since gzip is a streaming
    compressor.

    There's nothing to stop the server from reading N bytes of the file
    it's sending, initializing the compressor, compressing those N bytes
    to M bytes, sending an M-byte chunk, reading the next N bytes,
    compressing those without reinitializing the compressor, and so
    forth. The receiver can treat that just as it would a content-body
    that was compressed in its entirety before chunking. The only
    difference, as far as the receiver can tell, is that the chunks will
    probably vary in size if the sender compresses each chunk in turn.

    By the same token, the receiver can initialize the decompressor
    before processing the first chunk, then pass it each chunk as it's
    received. It needn't buffer the entire compressed content-body.
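    As a minimal sketch of that sending side in Java: the OutputStream
    'out' below stands in for whatever writes one HTTP chunk (the
    chunk-size line, the data and the trailing CRLF); that framing is
    assumed to be handled by the container or filter and is not shown.

    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    public class ChunkedGzipSender {
        // Compresses 'source' with a single gzip stream, emitting the
        // compressed bytes as chunks as they accumulate.
        static void send(InputStream source, OutputStream out) throws Exception {
            ByteArrayOutputStream compressed = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(compressed);

            byte[] block = new byte[8192];               // the "N bytes" read per round
            int n;
            while ((n = source.read(block)) != -1) {
                gzip.write(block, 0, n);                 // same deflater for every block
                if (compressed.size() > 0) {
                    out.write(compressed.toByteArray()); // one chunk of gzip output
                    compressed.reset();
                }
            }
            gzip.finish();                               // write the gzip trailer
            if (compressed.size() > 0) {
                out.write(compressed.toByteArray());     // last data chunk
            }
        }
    }

    Because the deflater buffers internally, some rounds may produce no
    output at all; the receiver simply sees chunks of varying size, as
    described above.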

    --
    Michael Wojcik

    I gave my love some irises.
    (She was sick with viruses.) -- Charlie Gibbs
     
    Michael Wojcik, Jan 6, 2006
    #6
  7. Rogan Dawes

    Rogan Dawes Guest

    Chris Smith wrote:
    > Lior Knaany <> wrote:
    >
    >>I know that gzip will work poorly on smaller content, but can it be
    >>done (gzip on each chunk separately)?
    >>And if so, what should the headers look like?

    >
    >
    > No, it can't be done. (Or rather, if you do it then general-purpose
    > browsers won't understand.)
    >


    In fact, the gzip algorithm allows for independently gzipped content to be
    concatenated, and it will still unzip just fine.


    $ echo file 1 > file1
    $ echo file 2 > file2
    $ gzip file1 file2
    $ cat file1.gz file2.gz > file3.gz
    $ gunzip file3.gz
    $ cat file3
    file 1
    file 2
    $

    So, if you created a gzipped stream by concatenating gzipped output, the
    browser SHOULD read it as the concatenation of the uncompressed files.
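    A Java counterpart to that shell demo, as a rough sketch: two complete
    gzip members are written back to back and then decoded with a single
    GZIPInputStream. Note that whether a decoder continues past the first
    member is implementation-dependent (recent JDKs do; older ones and
    some zlib-based decoders stop at the first member boundary).

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.util.zip.GZIPInputStream;
    import java.util.zip.GZIPOutputStream;

    public class ConcatenatedGzipDemo {
        // Compress one string into a complete, self-contained gzip member.
        static byte[] gzipMember(String text) throws Exception {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
                gz.write(text.getBytes("UTF-8"));
            }
            return buf.toByteArray();
        }

        public static void main(String[] args) throws Exception {
            // Two independent members glued together, like "cat file1.gz file2.gz".
            ByteArrayOutputStream joined = new ByteArrayOutputStream();
            joined.write(gzipMember("file 1\n"));
            joined.write(gzipMember("file 2\n"));

            // Recent GZIPInputStream implementations read across member
            // boundaries; older ones stop after "file 1".
            try (GZIPInputStream in = new GZIPInputStream(
                    new ByteArrayInputStream(joined.toByteArray()))) {
                byte[] block = new byte[256];
                int n;
                while ((n = in.read(block)) != -1) {
                    System.out.print(new String(block, 0, n, "UTF-8"));
                }
            }
        }
    }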

    Regards,

    Rogan
     
    Rogan Dawes, Jan 9, 2006
    #7
  8. Chris Smith

    Chris Smith Guest

    Rogan Dawes <> wrote:
    > $ echo file 1 > file1
    > $ echo file 2 > file2
    > $ gzip file1 file2
    > $ cat file1.gz file2.gz > file3.gz
    > $ gunzip file3.gz
    > $ cat file3
    > file 1
    > file 2
    > $


    Interesting...

    --
    www.designacourse.com
    The Easiest Way To Train Anyone... Anywhere.

    Chris Smith - Lead Software Developer/Technical Trainer
    MindIQ Corporation
     
    Chris Smith, Jan 9, 2006
    #8
  9. Chris Uppal

    Chris Uppal Guest

    [irrelevant and/or non-existent x-postings trimmed]

    Rogan Dawes wrote:

    > In fact, the gzip algorithm allows for independently gzipped content to be
    > concatenated, and it will still unzip just fine.


    More accurately, the gzip /program/ will act as you describe. The compressed
    format itself, the GZIP format as specified in RFC 1952, does naturally
    concatenate, but only in the sense that a file in that format consists of a
    number of elements, each of which is an independently compressed "file" (the
    format even includes an embedded file name!).

    It's difficult to state how a browser should interpret a gzip-format stream
    which consists of several compressed elements. If the browser's decompression
    is based on the zlib library, then that library does not automatically hide the
    boundaries between the separate "files" in the stream (and nor should it), so
    it is quite possible -- even probable -- that the browser would stop
    decompressing at the end of the first compressed "file" in the stream.

    OTOH (reverting to the original poster's question), I don't see any reason why
    the server cannot send chunked and compressed data, nor any reason (except,
    perhaps, convenience) why the browser should not decompress such data
    incrementally. The underlying compression format (shared by "GZIP" and
    "DEFLATE") is capable of being flushed and/or reset in mid-stream, so the
    server could flush the compression algorithm at the end of each chunk, and that
    would be transparent to the browser as it was decompressing it (assuming the
    use of a library at least as well-designed as zlib).
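    A rough Java sketch of that "flush at each chunk boundary" variant,
    under the same assumption that 'out' stands in for whatever adds the
    HTTP chunk framing (the two-argument GZIPOutputStream constructor used
    here, which makes flush() perform a SYNC_FLUSH, exists from Java 7 on):

    import java.io.ByteArrayOutputStream;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.zip.GZIPOutputStream;

    public class FlushedChunkSender {
        // One gzip stream for the whole body, but every input block is
        // forced out of the deflater, so each chunk ends on a boundary the
        // receiver can fully decompress.
        static void send(InputStream source, OutputStream out) throws Exception {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            GZIPOutputStream gzip = new GZIPOutputStream(buf, true); // syncFlush = true

            byte[] block = new byte[8192];
            int n;
            while ((n = source.read(block)) != -1) {
                gzip.write(block, 0, n);
                gzip.flush();                    // SYNC_FLUSH: push pending compressed data
                out.write(buf.toByteArray());    // send it as one chunk
                buf.reset();
            }
            gzip.finish();                       // gzip trailer (CRC and length)
            out.write(buf.toByteArray());        // final data chunk
        }
    }

    As the following paragraph notes, the flush is not strictly required;
    it only guarantees that each chunk decompresses completely on its own,
    at a small cost in compression ratio.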

    In point of fact, however, I'm not sure I see any real reason why the server
    should even bother to flush the compression algorithm -- it could just
    accumulate compressed data until it had enough for one chunk (possibly leaving
    some data in the compression code's buffers). Send that as one chunk. The
    client would decompress in the same incremental way.

    -- chris
     
    Chris Uppal, Jan 10, 2006
    #9
  10. Lior Knaany

    Lior Knaany Guest

    Thanks Michael,

    that was very enlightening.
     
    Lior Knaany, Jan 16, 2006
    #10
