cgi.FieldStorage()


gert

This is a non-standard way to store multipart POST data on disk:

def application(environ, response):
    # stream the raw POST body to disk, 8 KB at a time
    with open('/usr/httpd/var/wsgiTemp', 'w') as f:
        while True:
            chunk = environ['wsgi.input'].read(8192).decode('latin1')
            if not chunk:
                break
            # latin1 maps every byte 1:1, so no data is lost in the decode
            f.write(chunk)
    response('200 OK', [])
    return [b'complete']  # WSGI response bodies must be byte strings

My question is: how do I handle the file so I can shuffle it into a DB using small chunks of memorie?
 

Diez B. Roggisch

gert said:
This is a non-standard way to store multipart POST data on disk: [code snipped]
My question is: how do I handle the file so I can shuffle it into a DB using small chunks of memorie?

I don't think that's possible with the current DB-API. There is no
stream-based BLOB interface (such as JDBC offers).

So the answer certainly depends on the RDBMS you use. With Oracle, you
would be in luck:

http://cx-oracle.sourceforge.net/html/lob.html

I don't know about other adapters.
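
For illustration, streaming a file into an Oracle BLOB through cx_Oracle's LOB interface might look roughly like the sketch below (the table and column names are invented for the example):

import cx_Oracle

def store_file_in_blob(conn, path):
    cur = conn.cursor()
    # insert an empty BLOB and get a LOB locator back
    blob_var = cur.var(cx_Oracle.BLOB)
    cur.execute(
        "insert into uploads (id, data) values (:1, empty_blob()) "
        "returning data into :2", (1, blob_var))
    lob = blob_var.getvalue()
    # write the file into the LOB piecewise; LOB offsets are 1-based
    offset = 1
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(8192)
            if not chunk:
                break
            lob.write(chunk, offset)
            offset += len(chunk)
    conn.commit()

Only one 8 KB chunk is ever in memory at a time; the database accumulates the BLOB server-side.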

Diez
 

gert

Diez B. Roggisch said:
gert said: [original post snipped]
I don't think that's possible with the current DB-API. There is no stream-based BLOB interface (such as JDBC offers). So the answer certainly depends on the RDBMS you use. With Oracle, you would be in luck: http://cx-oracle.sourceforge.net/html/lob.html
I don't know about other adapters.

sqlite :) OK, let's say for now it would be impossible at the DB level. But before I reach the impossible, I still need to parse the file to prepare the chunks. How do I do that? How do I get the chunks without loading the whole file into memorie?

from re import search, DOTALL

b = environ['CONTENT_TYPE'].split('boundary=')[1]
# t holds the temp file's entire contents as one string
data = search(b + r'.*?Content-Type: application/octet-stream\r\n\r\n(.*?)\r\n--' + b,
              t, DOTALL).group(1)
data = data.encode('latin1')
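
That regex needs the whole file in one string, though, which defeats the purpose. One way around it is to scan the temp file incrementally. A rough sketch of a hypothetical helper, assuming the file holds a single application/octet-stream part:

def iter_part_body(path, boundary, chunk_size=8192):
    """Yield the octet-stream part's body in small chunks."""
    delim = b'\r\n--' + boundary.encode('latin1')
    marker = b'Content-Type: application/octet-stream\r\n\r\n'
    with open(path, 'rb') as f:
        # read until the part headers end and the body begins
        buf = b''
        while marker not in buf:
            more = f.read(chunk_size)
            if not more:
                return  # marker never found
            buf += more
        buf = buf.split(marker, 1)[1]
        # stream the body until the closing boundary turns up
        keep = len(delim) - 1  # tail that might hold a split boundary
        while True:
            idx = buf.find(delim)
            if idx != -1:
                yield buf[:idx]
                return
            if len(buf) > keep:
                yield buf[:-keep]
                buf = buf[-keep:]
            more = f.read(chunk_size)
            if not more:
                yield buf  # no closing boundary; emit what is left
                return
            buf += more

Each yielded chunk can then be written to the database one at a time, so at most one buffer's worth of the file is ever in memory.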
 

Diez B. Roggisch

gert said:
[earlier quotes snipped]
sqlite :) OK, let's say for now it would be impossible at the DB level. But before I reach the impossible, I still need to parse the file to prepare the chunks. How do I do that? How do I get the chunks without loading the whole file into memorie?

It's "memory" - "memorie" might be some nice Dutch girl you know :)

Apart from that, your code above does exactly that - it reads the data
chunkwise. If the WSGI implementation works properly, this will be the
socket's input stream, so there is no memory overhead involved.

Now of course, if you want multi-pass processing of the file without
keeping it all in memory, then you need to save it to the hard disk
first.

But honestly - we are talking about a web application here. My DSL
connection has 1 MBit of upstream, and the average server has at least
2 GB of memory available - so we are talking about 20000 seconds of
uploading time to fill that memory, which is about five hours. And you
are decoding the data to a certain encoding, so we are not talking
about binary data here - are you really sure memory is an issue?
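
If a single pass over the data is enough, the chunks could even go straight from wsgi.input into SQLite as numbered blob rows, skipping the temp file entirely. A sketch (the table and paths are made up, and this stores the raw POST body, multipart framing included):

import sqlite3

def application(environ, response):
    conn = sqlite3.connect('/usr/httpd/var/uploads.db')  # hypothetical path
    conn.execute('create table if not exists chunks '
                 '(upload_id integer, seq integer, data blob)')
    seq = 0
    while True:
        chunk = environ['wsgi.input'].read(8192)
        if not chunk:
            break
        # store the raw bytes; no decoding needed for a blob column
        conn.execute('insert into chunks values (?, ?, ?)', (1, seq, chunk))
        seq += 1
    conn.commit()
    conn.close()
    response('200 OK', [])
    return [b'complete']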

Diez
 

gert

Diez B. Roggisch said:
[earlier quotes snipped]
And you are decoding the data to a certain encoding, so we are not talking about binary data here - are you really sure memory is an issue?

What about HTTP upload resume features?
 

Diez B. Roggisch

gert said:
[earlier quotes snipped]
What about HTTP upload resume features?

What has that to do with reading data into memory or not? Of course you
can create continuable uploads with HTTP, but a single HTTP request is a
single HTTP request. You can store the requests in individual blobs and
then, when the overall file upload is finished, concatenate them. If you
like. Or not, if you don't care, since you can just as easily serve them
as one stream from those various chunks.

But all of this has nothing to do with avoiding loading POST data into
memory. That only becomes a problem if you are expecting uploads in the
hundreds of megabytes.
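
Serving the stored pieces back as one stream is straightforward in WSGI, since a response body may be any iterable of byte strings. A sketch against the hypothetical chunks table from the earlier example:

def serve_upload(environ, response, conn, upload_id):
    # hand the chunks back in order; the server pulls them lazily,
    # so the whole file is never assembled in memory
    response('200 OK', [('Content-Type', 'application/octet-stream')])
    rows = conn.execute(
        'select data from chunks where upload_id = ? order by seq',
        (upload_id,))
    return (row[0] for row in rows)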

Diez
 
