httplib/socket problems reading 404 Not Found response

Patrick Altman

I am attempting to use a HEAD request against Amazon S3 to check
whether a file exists. If it does, I parse the md5 hash from the ETag
in the response to verify the contents of the file, which saves the
bandwidth of uploading files when it is not necessary.

If the file exists, the HEAD works as expected and I get valid headers
back. I can pull the ETag out of them with getheader('ETag')[1:-1]
(using the slice to trim off the double quotes in the string).
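
The ETag comparison described here can be sketched as follows. This is only a minimal sketch: the helper names `etag_to_md5`, `file_md5` and `needs_upload` are hypothetical, and it assumes (as the post does) that the ETag is the quoted hex md5 of the file contents.

```python
import hashlib


def etag_to_md5(etag):
    """Strip the surrounding double quotes from an ETag header value."""
    return etag.strip('"')


def file_md5(path, block_size=65536):
    """Return the hex md5 digest of a local file, read block by block."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def needs_upload(path, etag):
    """True when there is no remote ETag or it differs from the local md5."""
    if etag is None:
        return True
    return etag_to_md5(etag) != file_md5(path)
```

Here `etag` would be the raw value returned by getheader('ETag'), or None when the file does not exist remotely.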

The problem arises when I send a HEAD request for a file that does not
exist. As expected, Amazon sends back a 404 Not Found response;
however, my test scripts seem to hang. When I run python with trace.py,
it hangs here:

--- modulename: httplib, funcname: _read_chunked
httplib.py(536): assert self.chunked != _UNKNOWN
httplib.py(537): chunk_left = self.chunk_left
httplib.py(538): value = ''
httplib.py(542): while True:
httplib.py(543): if chunk_left is None:
httplib.py(544): line = self.fp.readline()
--- modulename: socket, funcname: readline
socket.py(321): data = self._rbuf
socket.py(322): if size < 0:
socket.py(324): if self._rbufsize <= 1:
socket.py(326): assert data == ""
socket.py(327): buffers = []
socket.py(328): recv = self._sock.recv
socket.py(329): while data != "\n":
socket.py(330): data = recv(1)

It eventually completes with an exception here:

  File "C:\Python25\lib\httplib.py", line 509, in read
    return self._read_chunked(amt)
  File "C:\Python25\lib\httplib.py", line 548, in _read_chunked
    chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: ''
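
The ValueError at the end of the traceback is what int() raises when it is handed an empty string. A minimal sketch of the chunk-size step, written for Python 3: `read_chunk_size` is a hypothetical helper, not httplib's own code, and the empty stream stands in for a response that announced chunked encoding but sent no body.

```python
import io


def read_chunk_size(fp):
    """Read one chunk-size line and parse it as a hexadecimal number."""
    line = fp.readline().decode("ascii")
    # Drop any chunk extension after ';' and the trailing CRLF.
    size_field = line.split(";", 1)[0].strip()
    return int(size_field, 16)


# A well-formed chunk header parses normally (0x1a is 26).
print(read_chunk_size(io.BytesIO(b"1a\r\n")))

# With nothing on the wire, readline() returns an empty line and int() fails.
try:
    read_chunk_size(io.BytesIO(b""))
except ValueError as exc:
    print("ValueError:", exc)
```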

For reference, Ethereal captured the following request and response:

HEAD <REMOVED> HTTP/1.1
Host: s3.amazonaws.com
Accept-Encoding: identity
Date: Tue, 13 Mar 2007 02:54:12 GMT
Authorization: AWS <REMOVED>

HTTP/1.1 404 Not Found
x-amz-request-id: E20B4C0D0C48B2EF
x-amz-id-2: <REMOVED>
Content-Type: application/xml
Transfer-Encoding: chunked
Date: Tue, 13 Mar 2007 02:54:16 GMT
Server: AmazonS3
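
The capture shows the mismatch: the response announces Transfer-Encoding: chunked, yet an HTTP/1.1 response to a HEAD request never carries a body. A minimal sketch of that rule for Python 3, using a hypothetical helper `response_has_body`; the header text is copied from the capture, with the redacted header left out.

```python
from email import message_from_string

CAPTURED_HEADERS = (
    "x-amz-request-id: E20B4C0D0C48B2EF\r\n"
    "Content-Type: application/xml\r\n"
    "Transfer-Encoding: chunked\r\n"
    "Date: Tue, 13 Mar 2007 02:54:16 GMT\r\n"
    "Server: AmazonS3\r\n"
    "\r\n"
)


def response_has_body(method, status):
    """Apply the HTTP/1.1 rule for responses that never include a body."""
    if method.upper() == "HEAD":
        return False
    if 100 <= status < 200 or status in (204, 304):
        return False
    return True


headers = message_from_string(CAPTURED_HEADERS)
announces_chunked = headers.get("Transfer-Encoding", "").lower() == "chunked"

# The headers promise a chunked body, but the request method rules one out.
print(announces_chunked, response_has_body("HEAD", 404))
```

A client that checks the chunked flag before checking the request method will wait for a chunk-size line that never arrives.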

Am I doing something wrong? Is this a known issue? I am an
experienced developer, but pretty new to Python and dynamic languages
in general.

Thanks,
Patrick
 
Gabriel Genellina

> I am attempting to use a HEAD request against Amazon S3 to check
> whether a file exists or not and if it does parse the md5 hash from
> the ETag in the response to verify the contents of the file so as to
> save on bandwidth of uploading files when it is not necessary.
>
> The problem lies when I attempt to send a HEAD request when no file
> exists. As expected, a 404 Not Found response is sent back from
> Amazon however, my test scripts seem to hang. I run python with
> trace.py and it hangs here:

Yes, it's a known problem. See this message with a self-response:
http://mail.python.org/pipermail/python-list/2006-March/375087.html
 
Patrick Altman

> Submit a bug report, if not already done.
> http://sourceforge.net/tracker/?group_id=5470

Bug already exists at:
https://sourceforge.net/tracker/index.php?func=detail&aid=1486335&group_id=5470&atid=105470

In the meantime, I implemented a workaround for my specific case in
the Amazon S3 library: I added a head() method that actually issues a
GET request with a very small byte range. This yields all the header
data that I need (the md5 hash in the ETag if the file exists, 404 Not
Found if it doesn't).
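
The ranged GET workaround might look like the sketch below. It is written for Python 3's http.client (the renamed successor of httplib), not for the Python 2.5 code in this thread. The function name `head_via_ranged_get` and the host, path and `auth_headers` in the usage comment are hypothetical, and building the S3 Authorization header is left to the caller.

```python
import http.client


def head_via_ranged_get(conn, path, extra_headers=None):
    """Fetch an object's headers with a one-byte ranged GET.

    Returns (status, etag). The etag has its quotes stripped, or is None
    when the response carries no ETag header.
    """
    headers = {"Range": "bytes=0-0"}
    headers.update(extra_headers or {})
    conn.request("GET", path, headers=headers)
    response = conn.getresponse()
    response.read()  # drain the small body so the connection can be reused
    etag = response.getheader("ETag")
    if etag is not None:
        etag = etag.strip('"')
    return response.status, etag


# Usage (hypothetical host, key and auth headers):
# conn = http.client.HTTPSConnection("s3.amazonaws.com")
# status, etag = head_via_ranged_get(conn, "/bucket/key", auth_headers)
# exists = status in (200, 206)
```

Because this is a GET, the 404 response has a real body for the client to read, so the chunked parsing path has data to consume.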
 
