urllib2, https and gzipped files

Barry · Sep 19, 2009

I'm trying to use urllib2 to download some gzipped files from an https
server, but I cannot correctly open the file. It happens to be an mbox
file -- a mailing list archive to be exact.

When I call open, the response already appears to be unzipped. The
Content-Length reported equals the length of the first post in the
archive, and exactly that much text is downloaded before the transfer
stops.

I can download the file manually in a browser, but not any other way.
I couldn't find a solution searching the web, so I tested wget and
curl -- and both of them fail in much the same way as my Python code.
curl behaves exactly the same: it gets the first few thousand bytes as
text and stops. wget tries a second time and downloads the remaining
number of bytes to match the actual compressed file size, but the
second part just looks like random bytes.

The same code works on other sites hosting the same archive; the
difference is that those are http connections, not https.
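
For reference, here is a minimal sketch of one way to check what the
server actually sends. It is not the original code. The helper names
fetch_raw and maybe_gunzip are made up for illustration, and sending
"Accept-Encoding: identity" is only an assumption about what might
influence the server; it may make no difference here. The idea is to
read the raw body, compare its length with the reported Content-Length,
and gunzip locally only if the bytes start with the gzip magic number.

```python
import gzip
import io

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 equivalent

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def fetch_raw(url):
    """Return (headers, body) for url without any local decompression.

    Hypothetical helper: the Accept-Encoding header is an assumption,
    sent to ask the server not to apply a content encoding of its own.
    """
    request = urllib2.Request(url, headers={"Accept-Encoding": "identity"})
    response = urllib2.urlopen(request)
    try:
        return response.info(), response.read()
    finally:
        response.close()


def maybe_gunzip(data):
    """Decompress data if it looks like gzip, otherwise return it as-is."""
    if data[:2] == GZIP_MAGIC:
        return gzip.GzipFile(fileobj=io.BytesIO(data)).read()
    return data
```

Calling fetch_raw on the archive URL and comparing len(body) with the
Content-Length and Content-Encoding headers should show whether the
body arrives compressed, already decompressed, or truncated.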

Any ideas?

Barry
