Proxying downloads

Martin Marcher · Oct 30, 2007

Hello,

This is more of a recipe question. I'm working on a proxy that
downloads files on behalf of clients. The case that causes no problems is:

Alice (Client)
Bob (Client)
Sam (Server)

1 Alice asks Sam for "foobar.iso"
2 Sam can't find "foobar.iso" in "cachedir"
3 Sam requests "foobar.iso" from the uplink
4 Sam now saves each chunk received to "cachedir/foobar.iso"
5 At the same time Sam forwards each chunk to Alice.
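
A minimal sketch of steps 4 and 5 above, assuming the uplink response, the cache file, and the client connection are all file-like objects. The name `tee_chunks` and the chunk size are illustrative choices, not anything from the original post.

```python
import io


def tee_chunks(upstream, cache_file, client, chunk_size=64 * 1024):
    """Copy upstream to both cache_file and client; return the bytes copied."""
    total = 0
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:  # an empty read means the uplink is exhausted
            break
        cache_file.write(chunk)  # step 4: save the chunk to the cache
        client.write(chunk)      # step 5: forward the same chunk to the client
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory stand-ins for the uplink response, the cache file and Alice.
    uplink = io.BytesIO(b"iso-data" * 1000)
    cache, alice = io.BytesIO(), io.BytesIO()
    copied = tee_chunks(uplink, cache, alice)
    print(copied, cache.getvalue() == alice.getvalue())
```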

But I can't figure out how I would solve the following:

1 Alice asks Sam for "foobar.iso"
2 Sam can't find "foobar.iso" in "cachedir"
3 Sam requests "foobar.iso" from uplink
4 Sam saves and forwards to Alice
5 At about 30 % of the download Bob asks Sam for "foobar.iso"
6 How do I serve Bob now?

Because the internal link is _a lot_ faster than the uplink, Bob
will probably reach the end of the (local) "foobar.iso" before Sam has
received all of "foobar.iso" from the uplink. So Bob will end up with
an incomplete file...

How do I solve that? The data that has already been downloaded should
of course be served internally.

The solutions I can think of are:
* Some kind of subscriber list for the file in question
  * That is, serve internally and, if "foobar.iso" is still in
    progress, switch to receiving chunks directly from Sam as they come
    down the link
  * How would I implement this switch from internal serving to
    pass-through of chunks?

* Send an acknowledgement (lie to the client that we have this file in
  the cache), wait until the download is finished, and then serve the
  file from the internal cache
  * This could lead to timeouts for very large files, at least I think so

* Forget about all of it and just pass through from the uplink, with a
  new request, as long as files are in progress. In the worst case this
  would download the file n times, where n is the number of clients.
  * I guess that's the easiest one but also the least desirable solution.

I hope I have explained my problem clearly.

any hints are welcome
thanks
martin

Jeff · Oct 30, 2007

You use a temp directory to store the file while downloading, then
move it to the cache so the addition of the complete file is atomic.
The file name of the temp file should be checked to validate that you
don't overwrite another process' download.
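
One way the temp-then-move step might look, as a sketch only. The function name `save_atomically` and its parameters are made up for illustration. It assumes the temp directory and the cache directory are on the same filesystem, since `os.replace` is only atomic in that case.

```python
import os
import tempfile


def save_atomically(chunks, tmp_dir, cache_dir, name):
    """Write chunks to a unique temp file, then move it into cache_dir."""
    os.makedirs(tmp_dir, exist_ok=True)
    os.makedirs(cache_dir, exist_ok=True)
    # mkstemp picks a name that does not exist yet, so two concurrent
    # downloads can never write to the same temp file.
    fd, tmp_path = tempfile.mkstemp(dir=tmp_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as tmp:
            for chunk in chunks:
                tmp.write(chunk)
        final_path = os.path.join(cache_dir, name)
        # The complete file appears in the cache in a single step.
        os.replace(tmp_path, final_path)
        return final_path
    except BaseException:
        # Do not leave a half-written temp file behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this layout, anything found in the cache directory is a complete file, and anything left in the temp directory after a crash can simply be deleted.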

URLs that are currently being downloaded should be registered with the
server process (a simple list or set would work). New requests should
be checked against that; if there is a matching URL in there, the new
request must wait until that download has finished, and the resulting
file should then be delivered to both Alice and Bob.
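
A sketch of such a registry, assuming a threaded server. It uses a dict of `threading.Event` objects rather than a plain set, so that waiting requests have something to block on; the class and method names are hypothetical.

```python
import threading


class DownloadRegistry:
    """Tracks URLs that are currently being downloaded."""

    def __init__(self):
        self._lock = threading.Lock()
        self._in_progress = {}  # url -> Event that is set when the download ends

    def claim(self, url):
        """Return (is_owner, event); only the first caller for a URL is the owner."""
        with self._lock:
            event = self._in_progress.get(url)
            if event is not None:
                return False, event  # someone else is already downloading it
            event = threading.Event()
            self._in_progress[url] = event
            return True, event

    def finish(self, url):
        """Called by the owner when its download has ended; wakes all waiters."""
        with self._lock:
            event = self._in_progress.pop(url)
        event.set()
```

A request handler would call `claim(url)`. The owner downloads the file and calls `finish(url)` in a `finally` block; everyone else calls `event.wait()` and then serves the file from the cache. As noted in the original question, waiting clients may time out on very large files.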

You need to store the local file path and the URL it was downloaded
from, and check against that when a request is made; there might be
two foobar.iso files on the Internet or the network, and they may be
different (such as in differently versioned directories).
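
One possible way to keep the two apart is to derive the cache path from the full URL instead of the bare file name. This is only a sketch; `cache_path_for` and the directory layout are assumptions, not something suggested in the thread.

```python
import hashlib
import os
import posixpath
from urllib.parse import urlsplit


def cache_path_for(url, cache_dir):
    """Map a full URL to a cache path so equal file names do not collide."""
    # A hash of the whole URL keeps /v1/foobar.iso and /v2/foobar.iso apart.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    # Keep the original file name as the last component for readability.
    basename = posixpath.basename(urlsplit(url).path) or "index"
    return os.path.join(cache_dir, digest, basename)
```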

Martin Sand Christensen · Oct 30, 2007

But I can't figure out how I would solve the following:

1 Alice asks Sam for "foobar.iso"
2 Sam can't find "foobar.iso" in "cachedir"
3 Sam requests "foobar.iso" from uplink
4 Sam saves and forwards to Alice
5 At about 30 % of the download Bob asks Sam for "foobar.iso"
6 How do I serve Bob now?

Let every file in your download cache be represented by a Python object.
Instead of streaming the file directly to the clients, you can stream
the objects. The object will know if the file it represents has finished
downloading or not, where the file is located etc. This way you can
also, for the sake of persistence, keep partially downloaded files
separate from the completely downloaded files, as per a previous
suggestion, so that you won't start serving half files after a crash,
and it'll be completely transparent in all code except for your proxy
file objects.
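
A rough sketch of what such an object could look like in a threaded server. All names here (`CachedFile`, `append`, `finish`, `stream`) are invented for illustration. The downloader appends chunks; each client iterates over `stream()`, which reads what is already on disk and blocks when it catches up with the download, so Bob never sees a premature end of file.

```python
import threading


class CachedFile:
    """Represents one cache entry that may still be downloading."""

    def __init__(self, path):
        self.path = path
        self._cond = threading.Condition()
        self._size = 0          # bytes written and flushed so far
        self._complete = False
        self._out = open(path, "wb")

    def append(self, chunk):
        """Called by the downloader for each chunk received from the uplink."""
        self._out.write(chunk)
        self._out.flush()  # make the bytes visible to readers of the same path
        with self._cond:
            self._size += len(chunk)
            self._cond.notify_all()

    def finish(self):
        """Called by the downloader once the uplink transfer has ended."""
        self._out.close()
        with self._cond:
            self._complete = True
            self._cond.notify_all()

    def stream(self, chunk_size=64 * 1024):
        """Yield the whole file, waiting whenever the reader catches up."""
        offset = 0
        with open(self.path, "rb") as src:
            while True:
                with self._cond:
                    # Sleep until there is new data or the download is done.
                    while offset >= self._size and not self._complete:
                        self._cond.wait()
                    available = self._size - offset
                if available == 0:
                    return  # download complete and everything has been sent
                data = src.read(min(chunk_size, available))
                offset += len(data)
                yield data
```

This sketch does not handle a failed download; a real version would need some way to tell waiting readers that the file will never be completed.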

Martin
