Decompressing a file retrieved by URL seems too complex

Discussion in 'Python' started by John Nagle, Aug 12, 2010.

  1. John Nagle

    John Nagle Guest

    (Repost with better indentation)
    I'm reading a URL which is a .gz file, and decompressing
    it. This works, but it seems far too complex. Yet
    none of the "wrapping" you might expect to work
    actually does. You can't wrap a GzipFile around
    an HTTP connection, because GzipFile, reasonably enough,
    needs random access, and tries to do "seek" and "tell".
    Nor is the output descriptor from gzip general; it fails
    on "readline", but accepts "read". (No good reason
    for that.) So I had to make a second copy.

    John Nagle

    import gzip
    import tempfile
    import urllib2

    # TIMEOUTSECS is a constant defined elsewhere in the poster's program.

    def readurl(url):
        """Return a readable file object for url, decompressing .gz content."""
        if url.endswith(".gz"):
            nd = urllib2.urlopen(url, timeout=TIMEOUTSECS)
            td1 = tempfile.TemporaryFile()      # compressed file
            td1.write(nd.read())                # fetch and copy file
            nd.close()                          # done with network
            td2 = tempfile.TemporaryFile()      # decompressed file
            td1.seek(0)                         # rewind
            gd = gzip.GzipFile(fileobj=td1, mode="rb")  # wrap unzip
            td2.write(gd.read())                # decompress file
            td1.close()                         # done with compressed copy
            td2.seek(0)                         # rewind
            return td2                          # file object for decompressed data
        else:
            return urllib2.urlopen(url, timeout=TIMEOUTSECS)
     
    John Nagle, Aug 12, 2010
    #1

  2. Good, good.
    The file name could be anything. You should be checking the response
    Content-Type header -- that's what it's for.
    You can keep the whole thing in memory by using StringIO.
    You're reading the entire file into memory anyway ;-)
    Okay, maybe there is something missing from GzipFile -- but still you could use
    StringIO again, I expect.
    What exactly is it that's failing, and how?
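
    A minimal sketch of the in-memory approach suggested above, written for Python 3, where io.BytesIO takes the place of StringIO for binary data. The helper name open_payload and the GZIP_TYPES tuple are hypothetical, and the list of MIME types is an assumption; servers vary in what they send for .gz files.

```python
import gzip
import io

# Assumed set of MIME types that indicate a gzip payload; adjust as needed.
GZIP_TYPES = ("application/gzip", "application/x-gzip")

def open_payload(raw_bytes, content_type):
    """Return a seekable file object holding the decoded payload.

    raw_bytes is the full response body; content_type is the value of the
    response's Content-Type header.
    """
    buf = io.BytesIO(raw_bytes)
    # Drop parameters such as "; charset=utf-8" before comparing.
    mime = content_type.split(";")[0].strip().lower()
    if mime in GZIP_TYPES:
        # BytesIO supports seek and tell, so GzipFile can wrap it directly.
        decompressed = gzip.GzipFile(fileobj=buf, mode="rb").read()
        # A second in-memory buffer replaces the second temporary file.
        return io.BytesIO(decompressed)
    return buf
```

    With a real response, the two arguments would presumably come from resp.read() and resp.headers.get("Content-Type", "") on the object returned by urlopen. The returned object supports both read and readline.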
     
    Thomas Jollans, Aug 12, 2010
    #2

  3. Aahz

    Aahz Guest

    Also consider using zlib directly.
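
    One way this could look, as a sketch rather than a definitive implementation: zlib's decompression object works incrementally, so it needs neither seek nor tell on the input. The function name gunzip_chunks is hypothetical. The code is written for Python 3 and assumes a single-member gzip stream.

```python
import zlib

def gunzip_chunks(chunks):
    """Yield decompressed bytes for an iterable of gzip-compressed chunks."""
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
    # instead of a raw zlib stream.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    # Emit anything still buffered once the input is exhausted.
    tail = decomp.flush()
    if tail:
        yield tail
```

    The chunks could come from repeated fixed-size read calls on the HTTP response, so the compressed data never has to be copied to a temporary file first.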
     
    Aahz, Aug 13, 2010
    #3