Proxying downloads

Discussion in 'Python' started by Martin Marcher, Oct 30, 2007.

  1. Hello,

    This is more of a recipe question. I'm working on a proxy that
    downloads a file on behalf of a client. The case that causes no
    problems is:

    Alice (Client)
    Bob (Client)
    Sam (Server)

    1 Alice asks Sam for "foobar.iso"
    2 Sam can't find "foobar.iso" in "cachedir"
    3 Sam requests "foobar.iso" from the uplink
    4 Sam now saves each chunk received to "cachedir/foobar.iso"
    5 At the same time Sam forwards each chunk to Alice.

    But I can't figure out how I would solve the following:

    1 Alice asks Sam for "foobar.iso"
    2 Sam can't find "foobar.iso" in "cachedir"
    3 Sam requests "foobar.iso" from uplink
    4 Sam saves and forwards to Alice
    5 At about 30 % of the download Bob asks Sam for "foobar.iso"
    6 How do I serve Bob now?

    Because the internal link is _a lot_ faster than the uplink, Bob
    will probably reach the end of the (local, partial) "foobar.iso"
    before Sam has received the whole file from the uplink. So Bob will
    end up with an incomplete file...

    How do I solve that? The data that has already been downloaded should
    of course be served from the local copy.

    The solutions I can think of are:
    * Some kind of subscriber list for the file in question.
      * That is, serve from the local copy and, if "foobar.iso" is still
        in progress, switch to forwarding each chunk as Sam receives it
        from the uplink.
      * How would I implement this switch from local serving to
        pass-through of chunks?

    * Send an acknowledgement (lie to the client that we have this file
      in the cache), wait until the download is finished, and then serve
      the file from the internal cache.
      * This could lead to timeouts for very large files, at least I
        think so.

    * Forget about all of it and just pass through from the uplink, with
      a new request, as long as the file is in progress. In the worst
      case this would download the file n times, where n is the number
      of clients.
      * I guess that's the easiest one, but also the least desirable
        solution.

    I hope I have explained my problem clearly.

    Any hints are welcome.
    thanks
    martin

    --
    http://noneisyours.marcher.name
    http://feeds.feedburner.com/NoneIsYours
    Martin Marcher, Oct 30, 2007
    #1

  2. Use a temp directory to store the file while it is downloading, then
    move it into the cache once it is complete, so that the complete
    file appears in the cache atomically. Check the name of the temp
    file to make sure that you don't overwrite another process's
    download.
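
    A minimal sketch of the temp-then-move idea, not a definitive
    implementation. The function name `save_atomically` and its
    parameters are made up for illustration. It assumes the temp
    directory and the cache are on the same filesystem, because only
    then is the final rename atomic.

```python
import os
import tempfile


def save_atomically(chunks, tmp_dir, cache_dir, name):
    """Write chunks to a unique temp file, then move it into the cache.

    Hypothetical helper; assumes tmp_dir and cache_dir share a filesystem.
    """
    # mkstemp picks a name that does not exist yet, so two downloads
    # never write to the same temp file.
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir, prefix=name + ".", suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        final_path = os.path.join(cache_dir, name)
        # The complete file appears under its final name in one step.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Do not leave a half-written temp file behind.
        os.unlink(tmp_path)
        raise
    return final_path
```

    Readers of the cache then only ever see complete files; anything
    left in the temp directory after a crash can simply be deleted.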

    URLs that are currently being downloaded should be registered with
    the server process (a simple list or set would work). Check each new
    request against that registry; if a matching URL is in there, the
    request must wait until that download has finished, and the same
    file is then delivered to both Alice and Bob.
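
    One way to sketch such a registry, assuming a threaded server. The
    class name `DownloadRegistry` and the `download` callback are
    hypothetical; a dict of events is used here instead of a plain set
    so that later requests have something to wait on.

```python
import threading


class DownloadRegistry:
    """Tracks URLs that are currently being downloaded (illustrative only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_progress = {}  # url -> threading.Event set when finished

    def fetch(self, url, download):
        """Run download(url) once per URL; concurrent callers wait for it.

        Returns True for the caller that did the download, False for
        callers that waited. Waiters should still check the cache
        afterwards, since the download may have failed.
        """
        with self._lock:
            done = self._in_progress.get(url)
            owner = done is None
            if owner:
                done = self._in_progress[url] = threading.Event()
        if not owner:
            done.wait()
            return False
        try:
            download(url)
        finally:
            with self._lock:
                del self._in_progress[url]
            done.set()
        return True
```

    With this, Alice's request performs the download and Bob's request
    blocks in `fetch` until it is finished, after which both are served
    from the cache. Bob still waits for the whole file, so very large
    downloads may run into client timeouts.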

    You need to store both the local file path and the URL it was
    downloaded from, and check against both when a request is made;
    there might be two foobar.iso files on the Internet or the network,
    and they may be different (for example, in differently versioned
    directories).
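
    A simple variation on this, as a sketch: instead of storing a
    separate URL-to-path table, derive the local path from the full URL,
    so that two different URLs can never collide on the same cache file.
    The function name `cache_path` and the directory layout are
    assumptions for illustration.

```python
import hashlib
import os
from urllib.parse import urlsplit


def cache_path(cache_dir, url):
    """Map a URL to a local path that is unique per URL (illustrative)."""
    # A hash of the whole URL keeps same-named files from different
    # locations apart.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    # Keep the original file name so the cache stays human-readable.
    basename = os.path.basename(urlsplit(url).path) or "index"
    return os.path.join(cache_dir, digest, basename)
```

    Here "v1/foobar.iso" and "v2/foobar.iso" on the same host end up in
    different subdirectories of the cache, while repeated requests for
    the same URL always map to the same path.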
    Jeff, Oct 30, 2007
    #2

  3. > But I can't figure out how I would solve the following:
    >
    > 1 Alice asks Sam for "foobar.iso"
    > 2 Sam can't find "foobar.iso" in "cachedir"
    > 3 Sam requests "foobar.iso" from uplink
    > 4 Sam saves and forwards to Alice
    > 5 At about 30 % of the download Bob asks Sam for "foobar.iso"
    > 6 How do I serve Bob now?


    Let every file in your download cache be represented by a Python object.
    Instead of streaming the file directly to the clients, you can stream
    the objects. The object will know if the file it represents has finished
    downloading or not, where the file is located etc. This way you can
    also, for the sake of persistence, keep partially downloaded files
    separate from the completely downloaded files, as per a previous
    suggestion, so that you won't start serving half files after a crash,
    and it'll be completely transparent in all code except for your proxy
    file objects.
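
    A rough sketch of what such a proxy file object might look like,
    assuming a threaded server where one thread writes the download and
    each client is served by its own thread. The class name `CachedFile`
    and its methods are invented for this example. A reader that catches
    up with the download simply blocks until more data arrives, which is
    the "switch" from local serving to pass-through.

```python
import threading


class CachedFile:
    """One in-progress or finished download that many clients can read."""

    def __init__(self, path):
        self.path = path
        self.size = 0          # bytes safely written so far
        self.complete = False
        self._cond = threading.Condition()
        self._writer = open(path, "wb")

    def append(self, chunk):
        """Called by the downloader for each chunk from the uplink."""
        self._writer.write(chunk)
        self._writer.flush()   # make the bytes visible to readers
        with self._cond:
            self.size += len(chunk)
            self._cond.notify_all()

    def finish(self):
        """Called by the downloader once the uplink transfer is done."""
        self._writer.close()
        with self._cond:
            self.complete = True
            self._cond.notify_all()

    def stream(self, chunk_size=64 * 1024):
        """Yield the file from the start, waiting when the reader catches up."""
        offset = 0
        with open(self.path, "rb") as f:
            while True:
                with self._cond:
                    while offset == self.size and not self.complete:
                        self._cond.wait()
                    available = self.size - offset
                if available == 0:
                    return  # download complete and fully delivered
                data = f.read(min(chunk_size, available))
                offset += len(data)
                yield data
```

    Alice and Bob would each iterate over their own `stream()` of the
    same object: Bob first receives the 30 % already on disk at local
    speed, then continues at uplink speed until `finish()` is called.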

    Martin
    Martin Sand Christensen, Oct 30, 2007
    #3