HTTP POST uploading large files

Wolfgang Draxinger

I'm thinking about writing a script to upload videos to sites
like YouTube or Google Video, which is usually done with an HTTP
POST.

The problem is that videos are, by nature, rather big files,
but urllib2 wants its Request objects prepared beforehand,
which would mean first loading the whole file into memory.

I looked into pycURL, knowing that cURL can POST files
directly from the file system, but pycURL doesn't expose
the necessary functions yet.

Am I just blind to some urllib2/httplib feature, or some other
library? Or do I really have to fiddle around with sockets
myself? (I hope not...)

Thanks in advance

Wolfgang Draxinger
 
Gabriel Genellina

On Sat, 19 Jan 2008 21:19:24 -0200, Wolfgang Draxinger wrote:
I'm thinking about writing a script to upload videos to sites
like YouTube or Google Video, which is usually done with an HTTP
POST.

The problem is that videos are, by nature, rather big files,
but urllib2 wants its Request objects prepared beforehand,
which would mean first loading the whole file into memory.

I looked into pycURL, knowing that cURL can POST files
directly from the file system, but pycURL doesn't expose
the necessary functions yet.

Am I just blind to some urllib2/httplib feature, or some other
library? Or do I really have to fiddle around with sockets
myself? (I hope not...)

I'm afraid urllib2 currently doesn't handle this, and neither does the
lower layer, httplib. HTTPConnection should be upgraded to handle
'Transfer-Encoding: chunked', for example. (Chunked responses are handled
correctly, but a request cannot be chunked.)
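
For reference, the framing that 'Transfer-Encoding: chunked' puts on a body is simple: each chunk is its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. A minimal sketch in Python 3 follows; the function name chunked is only illustrative and is not part of httplib.

```python
def chunked(blocks):
    """Wrap an iterable of byte blocks in HTTP/1.1 chunked framing."""
    for block in blocks:
        if block:  # an empty chunk would end the body early, so skip it
            yield b"%x\r\n" % len(block) + block + b"\r\n"
    # A zero-length chunk marks the end of the body.
    yield b"0\r\n\r\n"


if __name__ == "__main__":
    print(b"".join(chunked([b"hello", b"world!"])))
```

With this framing the sender does not need to know the total length in advance, which is why it suits streamed uploads.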

A quick-and-dirty approach would be to patch httplib.HTTPConnection.send
to accept a file or file-like argument. Around line 707, instead of
self.sock.sendall(str):

    if hasattr(str, 'read'):
        # file-like object: send it block by block
        BUFSIZE = 4*1024
        while True:
            block = str.read(BUFSIZE)
            if not block: break
            self.sock.sendall(block)
    else:
        self.sock.sendall(str)

and ensure the Content-Length header is already set, so that no attempt is
made to compute len(str).
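
The same block-by-block idea can be tried outside httplib. The following is only a sketch, written for Python 3: send_body and RecordingSocket are made-up names, and the fake socket just records what would be sent so the loop can be run without a network.

```python
import io

BUFSIZE = 4 * 1024  # same block size as in the patch above


def send_body(sock, body):
    """Send body through sock.sendall, streaming it if it is file-like.

    Returns the number of bytes handed to the socket.
    """
    if hasattr(body, "read"):
        total = 0
        while True:
            block = body.read(BUFSIZE)
            if not block:
                break
            sock.sendall(block)
            total += len(block)
        return total
    sock.sendall(body)
    return len(body)


class RecordingSocket:
    """Hypothetical stand-in for a real socket; it only records the blocks."""

    def __init__(self):
        self.blocks = []

    def sendall(self, data):
        self.blocks.append(data)


if __name__ == "__main__":
    sock = RecordingSocket()
    sent = send_body(sock, io.BytesIO(b"x" * 10000))
    print(sent, [len(b) for b in sock.blocks])
```

Only one block is held in memory at a time, so the size of the file no longer matters. In current Python 3, http.client.HTTPConnection.request also accepts an open file object as the body.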
 
Brian Smith

Wolfgang said:
The problem is that videos are, by nature, rather big files,
but urllib2 wants its Request objects prepared beforehand,
which would mean first loading the whole file into memory.

Try using mmap. Here is some untested code:

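    from mmap import mmap, ACCESS_READ
    from urllib2 import Request

    # file is an open file object; a length of 0 maps the whole file
    map = mmap(file.fileno(), 0, access=ACCESS_READ)
    try:
        data = map.read(len(map))
        request = Request(url, data, headers)
        ...
    finally:
        map.close()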
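
A memory map can also be consumed in slices, so that only one block at a time is copied out of the mapping. This is a rough sketch for Python 3; iter_blocks and its blocksize default are assumptions made for illustration.

```python
import mmap


def iter_blocks(path, blocksize=64 * 1024):
    """Yield the contents of a file block by block through a read-only mmap."""
    with open(path, "rb") as f:
        # A length of 0 maps the whole file; mmap refuses to map an empty file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for offset in range(0, len(m), blocksize):
                # Slicing copies just this block into a bytes object.
                yield m[offset:offset + blocksize]
```

Each yielded block could then be passed to something like sock.sendall, instead of building one large string first.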


- Brian
 
Paul Rubin

Wolfgang Draxinger said:
Am I just blind to some urllib2/httplib feature, or some other
library? Or do I really have to fiddle around with sockets
myself? (I hope not...)

I did something like that by just opening a socket and writing the
stuff with socket.sendall. It's only about 5 lines of code and it's
pretty straightforward.
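
What such a socket-based upload might look like is sketched below for Python 3. It is not the original code: build_header and post_file are made-up names, the content type is an assumption, and cookies, redirects and multipart encoding are not handled at all.

```python
import os
import socket


def build_header(host, path, length, content_type="application/octet-stream"):
    """Build the request line and headers for a plain HTTP/1.1 POST."""
    lines = [
        "POST %s HTTP/1.1" % path,
        "Host: %s" % host,
        "Content-Type: %s" % content_type,
        "Content-Length: %d" % length,
        "Connection: close",
        "",
        "",  # the two empty items produce the blank line that ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")


def post_file(host, port, path, filename, bufsize=64 * 1024):
    """Stream a file as the POST body and return the raw response bytes."""
    size = os.path.getsize(filename)
    with socket.create_connection((host, port)) as sock, open(filename, "rb") as f:
        sock.sendall(build_header(host, path, size))
        while True:
            block = f.read(bufsize)
            if not block:
                break
            sock.sendall(block)
        # Read until the server closes the connection.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

The Content-Length comes from the file size on disk, so the body never has to be held in memory as a whole.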
 
Wolfgang Draxinger

Paul said:
I did something like that by just opening a socket and writing the
stuff with socket.sendall. It's only about 5 lines of code and it's
pretty straightforward.

Well, for YouTube you have to fiddle around with cookies,
multipart/form-data and stuff like that. It's a bit more than
just opening a socket; there's some serious HTTP going on.

However, I found a solution: the curl program, which comes with
libcurl and can be found on most *nix systems, lets you make
pretty sophisticated HTTP requests, among them sending files
by POST. So instead of using urllib2 or sockets from Python,
my program now just generates the appropriate calls to curl,
provides the in-memory storage of cookies and does the
necessary HTML parsing*.
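
Generating such a curl call from Python might look roughly like the sketch below (Python 3). The function names, field names and URL are placeholders, and unlike the program described above it lets curl keep the cookies in a cookie jar file instead of storing them in memory.

```python
import subprocess


def curl_upload_args(url, fields, file_field, filename, cookie_jar):
    """Build the argument list for a multipart POST with curl."""
    args = ["curl", "--silent",
            "--cookie", cookie_jar,      # send cookies stored in this file
            "--cookie-jar", cookie_jar]  # and write received cookies back to it
    for name, value in fields.items():
        # --form-string sends the value literally, without @ or < handling
        args += ["--form-string", "%s=%s" % (name, value)]
    # --form with @ makes curl read the file from disk and stream it
    args += ["--form", "%s=@%s" % (file_field, filename), url]
    return args


def upload(url, fields, file_field, filename, cookie_jar="cookies.txt"):
    """Run curl and return the response body as bytes."""
    args = curl_upload_args(url, fields, file_field, filename, cookie_jar)
    result = subprocess.run(args, stdout=subprocess.PIPE, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder values; nothing is sent, the command is only printed.
    print(curl_upload_args("http://example.invalid/upload",
                           {"token": "abc"}, "file", "video.avi", "cookies.txt"))
```

Passing the arguments as a list avoids shell quoting problems with file names and form values.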

Wolfgang Draxinger

*) YouTube uploads videos in a two-part process: first you set
the various video options, and in return you get a form with some
hidden input fields, some of which provide a handle to the
video information already sent. That data has to be extracted
from the form and put into the POST that also transfers the
video file.
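
Pulling the hidden fields out of the returned form could be done along these lines. This is a sketch using Python 3's html.parser; the class name, function name and sample markup are invented for illustration and do not reflect YouTube's actual form.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of all <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attrs = dict(attrs)
        if (attrs.get("type") or "").lower() == "hidden" and attrs.get("name"):
            self.fields[attrs["name"]] = attrs.get("value") or ""


def hidden_fields(html):
    """Return a dict of the hidden form fields found in an HTML string."""
    parser = HiddenInputParser()
    parser.feed(html)
    parser.close()
    return parser.fields


if __name__ == "__main__":
    sample = ('<form><input type="hidden" name="session" value="abc123">'
              '<input type="text" name="title" value="My video"></form>')
    print(hidden_fields(sample))
```

The resulting dict can then be added to the fields of the second POST, the one that carries the video file.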
 
