How to uncompress HTTP GZIP data

Mark Smith

Hi,

Following is the response header from IIS:

HTTP/1.1 200 OK
Cache-Control: no-cache
Connection: close
Date: Mon, 19 Dec 2005 21:00:17 GMT
Pragma: no-cache
Content-Type: text/html
Expires: Wed, 01 Jan 1997 12:00:00 GMT
Server: Microsoft-IIS/6.0
P3P: CP="BUS CUR CONo FIN IVDo ONL OUR PHY SAMo TELo"
Content-Encoding: gzip
Vary: Accept-Encoding
Transfer-Encoding: chunked

The HTTP data was captured from all of the TCP packets in sequence and
saved to a .Z file. The file will not uncompress with the gunzip utility.
I opened the .Z file in a hex editor and saw that it does not begin
with the magic header for gzip files. I fixed that, but it still
doesn't work.

HOW COULD I UNCOMPRESS THE CAPTURED HTTP DATA?

thanks much for all your help!
 
Gunnar Hjalmarsson

Mark said:
The http data was captured from all of the TCP packets in sequence, and
saved to a .Z file. The file will not uncompress using GUNZIP utility.
I opened the .z file in a hex editor, and was able to see that the
header does not begin with the magic header for gzip files. I fixed
that, but it still doesn't work.

HOW COULD I UNCOMPRESS THE CAPTURED HTTP DATA?

You may find this thread useful:
http://groups.google.com/group/comp.lang.perl.modules/browse_frm/thread/fbcfa9d737888425
 
Paul Marquess

Mark Smith said:
The http data was captured from all of the TCP packets in sequence, and
saved to a .Z file. The file will not uncompress using GUNZIP utility.
I opened the .z file in a hex editor, and was able to see that the
header does not begin with the magic header for gzip files. I fixed
that, but it still doesn't work.

HOW COULD I UNCOMPRESS THE CAPTURED HTTP DATA?

thanks much for all your help!

The HTTP header states that as well as having a content encoding of "gzip"
applied, it also uses chunked transfer encoding. If that is the case then
you will have to un-chunk the data before you can uncompress it.

Post a hex dump of the start of the file you saved so we can see whether
it actually is chunked.
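
For anyone following along, a hex dump of the first few bytes can be produced with a short script along these lines. This is only a sketch: the helper name hex_head, the 32-byte window, and the idea of passing the capture file as the first argument are assumptions made for illustration, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first $count bytes of $bytes as uppercase hex.
sub hex_head {
    my ($bytes, $count) = @_;
    return uc unpack('H*', substr($bytes, 0, $count));
}

# When a file name is given, dump its first 32 bytes.
# Opening with :raw keeps CR/LF bytes intact on Windows.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0]
        or die "cannot open $ARGV[0]: $!\n";
    my $buf = '';
    read $fh, $buf, 32;
    print hex_head($buf, 32), "\n";
}
```

A chunked body should start with a hex chunk size followed by 0D0A, rather than with the gzip magic bytes 1F8B.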

Paul
 
Tad McClellan

Mark Smith said:
The http data was captured from all of the TCP packets in sequence, and
saved to a .Z file. The file will not uncompress using GUNZIP utility.
I opened the .z file in a hex editor, and was able to see that the
header does not begin with the magic header for gzip files. I fixed
that, but it still doesn't work.

HOW COULD I UNCOMPRESS THE CAPTURED HTTP DATA?


It is not necessary to shout at us like that.

thanks much for all your help!


Did you have a Perl question?
 
Mark Smith

Hi Paul

Here are the first few bytes of the captured data. From what I can tell,
it is chunked. You can also see the gzip 'magic header' (1F8B)
starting at hex digit 6 (byte offset 3)....

610D0A1F8B08000000000004000D0A3763300D0AEC5C7973E2C6B6FF7B5CF5BE

thanks very much!

The data was from IIS 6.
 
Mark Smith

Sorry sir, I wasn't shouting. Just a bad typing style to make the
question stand out.
 
Paul Marquess

Mark said:
Hi Paul

Here are first few bytes of captured data. From what I could tell, it
is chunked. And you can also see the 'magic header' of gzip (1F8B)
starting at location 6....

610D0A1F8B08000000000004000D0A3763300D0AEC5C7973E2C6B6FF7B5CF5BE

Yep, that's chunked and gzipped.
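
To see why, the posted bytes can be decoded by hand: 61 0D 0A is the ASCII text "a" plus CRLF, i.e. a chunk size of 0xA = 10 bytes, and the 10 bytes that follow are exactly a gzip header starting with 1F 8B. The next size line, 37 63 30 0D 0A, is "7c0" plus CRLF, i.e. 1984 bytes. The sketch below walks the sample the same way; the helper name chunk_sizes is hypothetical and it only inspects the size lines, it does not dechunk a full file.

```perl
use strict;
use warnings;

# The 32 bytes Mark posted, converted from hex back to raw bytes.
my $sample = pack 'H*',
    '610D0A1F8B08000000000004000D0A3763300D0AEC5C7973E2C6B6FF7B5CF5BE';

# Hypothetical helper: list the chunk sizes visible at the start of a buffer.
sub chunk_sizes {
    my ($buf) = @_;
    my @sizes;
    while ($buf =~ s/^(?:\r\n)?([0-9A-Fa-f]+)\r\n//) {
        my $size = hex $1;
        push @sizes, $size;
        # Stop at the end marker, or when the chunk runs past the sample.
        last if $size == 0 || $size > length $buf;
        substr($buf, 0, $size, '');
    }
    return @sizes;
}

print join(', ', chunk_sizes($sample)), "\n";   # prints "10, 1984"
```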

Try filtering it through something like this to unchunk it, then see if
gunzip can deal with it.

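use strict;
use warnings;

# Treat input and output as raw bytes; on Windows the default
# text mode would translate CR/LF pairs and corrupt the data.
binmode STDIN;
binmode STDOUT;

# Slurp the whole captured body (HTTP headers already removed).
undef $/;
$_ = <STDIN>;

while (length $_)
{
    # A chunk size of zero marks the end of the body.
    last if /^(?:\r\n)?0+\r\n/;

    # Each chunk starts with its size in hex, followed by CRLF.
    die "error dechunking\n"
        unless s/^(?:\r\n)?([0-9A-F]+)\r\n//i ;

    my $chunkSize = hex $1 ;

    # Print the chunk payload and remove it from the buffer.
    print substr($_, 0, $chunkSize, '') ;
}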
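
If you would rather do both steps in Perl, something like the following sketch dechunks the body and then inflates it in memory with IO::Uncompress::Gunzip. The helper names dechunk and decode_body, and passing the capture file as the first argument, are assumptions for illustration. It also assumes the file holds only the response body, with the HTTP headers already stripped.

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical helper: strip chunked transfer encoding from a raw body.
sub dechunk {
    my ($in) = @_;
    my $out = '';
    while (length $in) {
        # Size line: hex digits, optional ";extension", then CRLF.
        $in =~ s/^(?:\r\n)?([0-9A-Fa-f]+)(?:;[^\r\n]*)?\r\n//
            or die "error dechunking\n";
        my $size = hex $1;
        last if $size == 0;              # zero-size chunk ends the body
        $out .= substr($in, 0, $size, '');
    }
    return $out;
}

# Hypothetical helper: dechunk, then gunzip, all in memory.
sub decode_body {
    my ($raw) = @_;
    my $gz = dechunk($raw);
    my $plain;
    gunzip(\$gz => \$plain)
        or die "gunzip failed: $GunzipError\n";
    return $plain;
}

# When a file name is given, decode it and print the result.
if (@ARGV) {
    open my $fh, '<:raw', $ARGV[0]
        or die "cannot open $ARGV[0]: $!\n";
    my $raw = do { local $/; <$fh> };
    binmode STDOUT;
    print decode_body($raw);
}
```

Keeping the two steps in separate subs makes it easy to check the dechunked bytes on their own if gunzip still complains.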


Paul
 
