Decompressing a file retrieved by URL seems too complex

Discussion in 'Python' started by John Nagle, Aug 12, 2010.

  1. John Nagle

    John Nagle Guest

    (Repost with better indentation)
    I'm reading a URL which is a .gz file, and decompressing
    it. This works, but it seems far too complex. Yet
    none of the "wrapping" you might expect to work
    actually does. You can't wrap a GzipFile around
    an HTTP connection, because GzipFile, reasonably enough,
    needs random access, and tries to do "seek" and "tell".
    Nor is the output descriptor from gzip general; it fails
    on "readline", but accepts "read". (No good reason
    for that.) So I had to make a second copy.

    John Nagle

    import gzip
    import tempfile
    import urllib2

    def readurl(url):
        if url.endswith(".gz"):
            nd = urllib2.urlopen(url, timeout=TIMEOUTSECS)
            td1 = tempfile.TemporaryFile()  # compressed file
            td1.write(nd.read())  # fetch and copy file
            nd.close()  # done with network
            td2 = tempfile.TemporaryFile()  # decompressed file
            td1.seek(0)  # rewind
            gd = gzip.GzipFile(fileobj=td1, mode="rb")  # wrap unzip
            td2.write(gd.read())  # decompress file
            td1.close()  # done with compressed copy
            td2.seek(0)  # rewind
            return td2  # return file object for decompressed data
        else:
            # assumed: the body of this branch is missing from the post
            return urllib2.urlopen(url, timeout=TIMEOUTSECS)
    John Nagle, Aug 12, 2010

  2. Thomas Jollans

    Good, good.
    The file name could be anything. You should be checking the response
    Content-Type header -- that's what it's for.
    You can keep the whole thing in memory by using StringIO.
    You're reading the entire file into memory anyway ;-)
    Okay, maybe there is something missing from GzipFile -- but still you could use
    StringIO again, I expect.
    What exactly is it that's failing, and how?
    Thomas Jollans, Aug 12, 2010
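
The two suggestions in the reply above (check Content-Type rather than the file name, and keep the data in memory) could be sketched roughly as follows. This is only an illustration, written with Python 3 names: io.BytesIO stands in for StringIO, and the helper open_body and the GZIP_TYPES set are hypothetical, not part of the thread. It assumes the caller has already fetched the response body as bytes and read its Content-Type header, for example from urllib.request.urlopen.

```python
import gzip
import io

# Assumption: media types treated as gzip data; adjust for your server.
GZIP_TYPES = {"application/gzip", "application/x-gzip"}


def open_body(body, content_type):
    """Return a readable file-like object over the (decompressed) body.

    body is the raw response as bytes; content_type is the value of the
    Content-Type header, or None if the server sent none.
    """
    # Drop parameters such as "; charset=utf-8" and normalise case.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    buf = io.BytesIO(body)  # in-memory and seekable, so no temp file
    if media_type in GZIP_TYPES:
        return gzip.GzipFile(fileobj=buf, mode="rb")
    return buf


# Usage with data built in memory instead of fetched from a URL.
raw = b"line one\nline two\n"
f = open_body(gzip.compress(raw), "application/x-gzip")
first_line = f.readline()
rest = f.read()
```

Because the whole body already sits in memory, the BytesIO buffer gives GzipFile the seek and tell it asks for without a temporary file.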

  3. Aahz

    Aahz Guest

    Also consider using zlib directly.
    Aahz, Aug 13, 2010
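
One way to read the zlib suggestion is to decompress the gzip stream incrementally as chunks arrive, which needs neither seek nor tell and no second copy. The sketch below is an assumption about what was meant, in Python 3; gunzip_chunks is a hypothetical helper name. Passing 16 + zlib.MAX_WBITS as wbits tells zlib to expect a gzip header and trailer.

```python
import gzip  # only used to build sample data for the usage lines
import zlib


def gunzip_chunks(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS selects gzip framing rather than a raw zlib stream.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decomp.decompress(chunk)
        if data:
            yield data
    tail = decomp.flush()  # whatever is still buffered at end of input
    if tail:
        yield tail


# Usage: compress a sample in memory and feed it back in 64-byte pieces,
# the way successive read(64) calls on a network response would.
sample = b"hello gzip\n" * 1000
compressed = gzip.compress(sample)
pieces = [compressed[i:i + 64] for i in range(0, len(compressed), 64)]
restored = b"".join(gunzip_chunks(pieces))
```

With a real connection, the pieces would come from repeated read calls on the response object, so only one chunk needs to be held at a time.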