client to upload big files via https and get progress info

Discussion in 'Python' started by News123, May 12, 2010.

  1. News123

    News123 Guest

    Hi,

    I'd like to perform huge file uploads via https.
    I'd like to make sure:
    - that I can obtain upload progress info (sometimes the network is very slow)
    - that (if the file exceeds a certain size) I don't have to
    read the entire file into RAM.

    I found ActiveState recipe 146306, which constructs the whole
    multipart message in RAM first and then sends it in one chunk.


    I found a server-side solution that writes out the data file chunk-wise
    ( http://webpython.codepoint.net/mod_python_publisher_big_file_upload ).



    If I just wanted to have progress info, then I could probably
    just split line 16 of the ActiveState recipe ( h.send(body) )
    into multiple sends, right?

    chunksize = 1024
    for i in range(0, len(body), chunksize):
        h.send(body[i:i+chunksize])
        show_progressinfo()


    But how could I create the body step by step?
    I wouldn't know the Content-Length up front.
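    (One possible answer, sketched below: the Content-Length can in fact be
    computed up front from the file size plus the fixed multipart framing, so
    the body never has to be built in RAM. This is only a rough, untested
    sketch using Python 2-era httplib; the field name, boundary and progress
    callback are invented for illustration.)

    import os
    import httplib

    def upload_with_progress(host, url, path, progress_cb, chunksize=8192):
        field = 'file'                      # hypothetical form field name
        boundary = '----PyUploadBoundary'   # must not occur inside the file
        head = ('--%s\r\n'
                'Content-Disposition: form-data; name="%s"; filename="%s"\r\n'
                'Content-Type: application/octet-stream\r\n'
                '\r\n' % (boundary, field, os.path.basename(path)))
        tail = '\r\n--%s--\r\n' % boundary

        # Content-Length is known before reading the file:
        # multipart framing plus the size reported by the filesystem.
        total = len(head) + os.path.getsize(path) + len(tail)

        conn = httplib.HTTPSConnection(host)
        conn.putrequest('POST', url)
        conn.putheader('Content-Type',
                       'multipart/form-data; boundary=%s' % boundary)
        conn.putheader('Content-Length', str(total))
        conn.endheaders()

        conn.send(head)
        sent = len(head)
        f = open(path, 'rb')
        try:
            while True:
                chunk = f.read(chunksize)
                if not chunk:
                    break
                conn.send(chunk)
                sent += len(chunk)
                progress_cb(sent, total)    # e.g. print 100.0 * sent / total
        finally:
            f.close()
        conn.send(tail)
        return conn.getresponse()

    (Since the total size is known before the upload starts, the same loop
    gives the progress information without ever holding the whole body in
    memory.)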

    thanks in advance



    N
     
    News123, May 12, 2010
    #1

  2. Aahz

    Aahz Guest

    In article <4bea6b50$0$8925$>,
    News123 <> wrote:
    >
    >I'd like to perform huge file uploads via https.
    >I'd like to make sure,
    >- that I can obtain upload progress info (sometimes the nw is very slow)
    >- that (if the file exceeds a certain size) I don't have to
    > read the entire file into RAM.


    Based on my experience with this, you really need to send multiple
    requests (i.e. "chunking"). There are ways around this (you can look
    into curl's resumable uploads), but you will need to maintain state no
    matter what, and I think that chunking is the best/simplest.
    --
    Aahz () <*> http://www.pythoncraft.com/

    f u cn rd ths, u cn gt a gd jb n nx prgrmmng.
     
    Aahz, May 13, 2010
    #2

  3. James Mills

    James Mills Guest

    On Wed, May 12, 2010 at 6:48 PM, News123 <> wrote:
    > Hi,
    >
    > I'd like to perform huge file uploads via https.
    > I'd like to make sure,
    > - that I can obtain upload progress info (sometimes the nw is very slow)
    > - that (if the file exceeds a certain size) I don't have to
    >  read the entire file into RAM.
    >
    > I found Active states recipe 146306, which constructs the whole
    > multipart message first in RAM and sends it then in one chunk.
    >
    >
    > I found a server side solutions, that will write out the data file chunk
    > wise ( http://webpython.codepoint.net/mod_python_publisher_big_file_upload
    > )
    >
    >
    >
    > If I just wanted to have progress info, then I could probably
    > just split line 16 of Active State's recipe ( h.send(body) )
    > into multiple send, right?
    >
    > chunksize = 1024
    > for i in range(0,len(body),chunksize):
    >    h.send(body[i:i+chunksize])
    >    show_progressinfo()
    >
    >
    > But how could I create body step by step?
    > I wouldn't know the content-length up front?
    >
    > thanks in advance


    My suggestion is to find some tools that can
    send multiple chunks of data. A non-blocking
    I/O library/tool might be useful here (e.g. Twisted or similar).

    cheers
    James
     
    James Mills, May 13, 2010
    #3
  4. News123

    News123 Guest

    Hi Aahz,

    Aahz wrote:
    > In article <4bea6b50$0$8925$>,
    > News123 <> wrote:
    >> I'd like to perform huge file uploads via https.
    >> I'd like to make sure,
    >> - that I can obtain upload progress info (sometimes the nw is very slow)
    >> - that (if the file exceeds a certain size) I don't have to
    >> read the entire file into RAM.

    >
    > Based on my experience with this, you really need to send multiple
    > requests (i.e. "chunking"). There are ways around this (you can look
    > into curl's resumable uploads), but you will need to maintain state no
    > matter what, and I think that chunking is the best/simplest.

    I agree I need chunking. (The question is just at which level of the
    protocol.)

    I just don't know how to make a chunkwise file upload or what library is
    best.

    Can you recommend any libraries or do you have a link to an example?


    I'd like to avoid making separate https POST requests for the chunks
    (at least if the underlying module does NOT support keep-alive connections).


    I made some tests with high-level chunking (separate sequential https
    POST requests).
    What I noticed is a rather high penalty in data throughput.
    The reason is probably that each request makes its own https connection,
    and that either the network driver or the TCP/IP stack doesn't allocate
    enough bandwidth to my request.

    Therefore I'd like to do the chunking on a 'lower' level.
    One option would be an https module which supports keep-alive;
    the other would be a library which creates the http POST body
    chunk by chunk.
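    (An illustrative sketch of the keep-alive option, not a tested solution:
    httplib reuses a single HTTPS connection for sequential requests as long
    as each response is read completely and the server keeps the connection
    open, so only one TLS handshake is paid for all chunks. The /chunk URL
    and the X-Chunk-Offset header are invented here; the server side would
    have to reassemble the pieces.)

    import os
    import httplib

    def upload_in_chunks(host, path, url='/chunk', chunksize=512 * 1024):
        conn = httplib.HTTPSConnection(host)   # one connection for all chunks
        size = os.path.getsize(path)
        f = open(path, 'rb')
        try:
            offset = 0
            while offset < size:
                chunk = f.read(chunksize)
                conn.request('POST', url, chunk, {
                    'Content-Type': 'application/octet-stream',
                    'X-Chunk-Offset': str(offset),   # invented header
                })
                resp = conn.getresponse()
                resp.read()        # drain the response so the socket is reused
                offset += len(chunk)
                print '%d of %d bytes sent' % (offset, size)
        finally:
            f.close()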


    What do others do for huge file uploads?
    (The uploader might be connected via Ethernet, WLAN, UMTS, EDGE or GPRS.)

    N
     
    News123, May 13, 2010
    #4
  5. Sean DiZazzo

    Sean DiZazzo Guest

    On May 13, 9:39 am, News123 <> wrote:
    > Hi Aaaz,
    >
    > Aahz wrote:
    > > In article <4bea6b50$0$8925$>,
    > > News123  <> wrote:
    > >> I'd like to perform huge file uploads via https.
    > >> I'd like to make sure,
    > >> - that I can obtain upload progress info (sometimes the nw is very slow)
    > >> - that (if the file exceeds a certain size) I don't have to
    > >>  read the entire file into RAM.

    >
    > > Based on my experience with this, you really need to send multiple
    > > requests (i.e. "chunking").  There are ways around this (you can look
    > > into curl's resumable uploads), but you will need to maintain state no
    > > matter what, and I think that chunking is the best/simplest.

    >
    > I agree I need  chunking. (the question is just on which level of the
    > protocol)
    >
    > I just don't know how to make a chunkwise file upload or what library is
    > best.
    >
    > Can you recommend any libraries or do you have a link to an example?
    >
    > I'd like to avoid to make separate https post requests for the chunks
    > (at least if the underlying module does NOT support keep-alive connections)
    >
    > I made some tests with high level chunking (separate sequential https
    > post requests).
    > What I noticed is a rather high penalty in data throughput.
    > The reason is probably, that each request makes its own https connection
    > and that either the NW driver or the TCP/IP stack doesn't allocate
    > enough band width to my request.
    >
    > Therefore I'd like to do the chunking on a 'lower' level.
    > One option would be to have a https module, which supports keep-alive,
    >
    > the other would be  to have a library, which creates a http post body
    > chunk by chunk.
    >
    > What do others do for huge file uploads
    > The uploader might be connected via ethernet, WLAN, UMTS, EDGE, GPRS. )
    >
    > N


    You could also just send the file in one big chunk and give yourself
    another avenue to read the size of the file on the server. Maybe a
    webservice that you call with the name of the file that returns its
    percent complete, or it could just return bytes on disk and you do the
    math on the client side. Then you just forget about the transfer and
    query the file size whenever you want to know...or on a schedule.
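    (A rough sketch of that polling idea, assuming a hypothetical status URL
    that simply returns the number of bytes received so far as plain text;
    the upload itself would run in a separate thread or process.)

    import time
    import urllib2

    def poll_progress(status_url, total_bytes, interval=5.0):
        # Periodically ask the server how much of the file has arrived.
        while True:
            received = int(urllib2.urlopen(status_url).read().strip())
            print '%.1f%% uploaded' % (100.0 * received / total_bytes)
            if received >= total_bytes:
                break
            time.sleep(interval)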

    ~Sean
     
    Sean DiZazzo, May 14, 2010
    #5
  6. Sean DiZazzo

    Sean DiZazzo Guest

    On May 13, 9:54 pm, Sean DiZazzo <> wrote:
    > [quoted thread snipped]
    >
    > You could also just send the file in one big chunk and give yourself
    > another avenue to read the size of the file on the server. [...]
    > Then you just forget about the transfer and
    > query the file size whenever you want to know...or on a schedule.
    >
    > ~Sean


    Oops... that doesn't help with the other requirements. My suggestion
    is to not use https; I don't think it was created to move around
    large pieces of data, rather lots of small pieces. SFTP?
     
    Sean DiZazzo, May 14, 2010
    #6
  7. J.O. Aho

    J.O. Aho Guest

    News123 <> wrote:

    > What do others do for huge file uploads
    > The uploader might be connected via ethernet, WLAN, UMTS, EDGE, GPRS. )


    In those cases where I have had to move big files it's been scp when
    you just have to push a new file; in cases where it's a question of
    keeping two directories synced, it's rsync over ssh.
    The latter I have never done in Python.


    --

    //Aho
     
    J.O. Aho, May 14, 2010
    #7
  8. News123

    News123 Guest

    Hi Sean,




    Sean DiZazzo wrote:
    > On May 13, 9:54 pm, Sean DiZazzo <> wrote:
    >> On May 13, 9:39 am, News123 <> wrote:
    >>
    >>
    >>
    >>> Hi Aaaz,
    >>> Aahz wrote:
    >>>> In article <4bea6b50$0$8925$>,
    >>>> News123 <> wrote:
    >>>>> I'd like to perform huge file uploads via https.
    >>>>> I'd like to make sure,


    >
    > oops...that doesn't help with the other requirements. My suggestion
    > is to not use https. I don't think it was created to move around
    > large pieces of data. Lots of small pieces rather. SFTP?



    I had to check, but I guess sftp is not exactly suitable for my use case.

    My problem:
    - the whole communication is intended to work like a drop box
    - one can upload files
    - one cannot see what one has uploaded before
    - there is no way to accidentally overwrite a previous upload, etc.
    - I don't know enough about sftp servers to know how I could configure
    one to act as a drop box.


    That's much easier to hide behind an https server than behind an
    out-of-the-box sftp server.



    N
     
    News123, May 14, 2010
    #8
  9. News123

    News123 Guest

    Hi James,

    James Mills wrote:
    > On Wed, May 12, 2010 at 6:48 PM, News123 <> wrote:
    >> Hi,
    >>
    >> I'd like to perform huge file uploads via https.
    >> I'd like to make sure,
    >> - that I can obtain upload progress info (sometimes the nw is very slow)
    >> - that (if the file exceeds a certain size) I don't have to
    >> read the entire file into RAM.
    >>

    >
    > My suggestion is to find some tools that can
    > send multiple chucks of data. A non-blocking
    > i/o library/tool might be useful here (eg: twisted or similar).
    >


    I have never used Twisted so far.
    Perhaps it's time to look at it.


    bye


    N
     
    News123, May 14, 2010
    #9
  10. News123

    News123 Guest

    Hi J,


    J.O. Aho wrote:
    > News123 <> wrote:
    >
    >> What do others do for huge file uploads
    >> The uploader might be connected via ethernet, WLAN, UMTS, EDGE, GPRS. )

    >
    > Those cases where I have had to move big files it's been scp on those cases
    > where you just have to push a new file, in cases where it's a question of
    > keeping two directories synced, then it's rsync over ssh.
    > The later one I have never done in python.



    I agree. From home this is also what I do.
    scp / rsync.


    However I'd like to use https, as http/https are the two ports that are
    accessible almost everywhere (even with proxies / firewalls, etc.).



    N
     
    News123, May 14, 2010
    #10
