How to read from URL line-wise?

kj

First, profuse apologies for the original posting of this query,
which had gibberish ("sdfsdfsf") in the Subject: line. I goofed.

---

I'm looking for the "moral equivalent" of the (fictitious) `openremote`
function below:

    my $handle = openremote( 'http://some.domain.org/huge.tsv' ) or die $!;
    while ( <$handle> ) {
        chomp;
        # etc.
        # do stuff with $_
    }
    close $handle;

IOW, I'm looking for a way to open a read handle to a remote file
so that I can read from it *line-by-line*. (Typically this file
will be larger than I want to read all at once into memory. IOW,
I want to avoid solutions based on stuffing the value returned by
LWP::Simple::get into an IO::String.)

I'm sure this is really basic stuff, but I have not been able to
find it after a lot of searching.

TIA!

kj
 
Peter J. Holzer

kj said:
> I'm looking for the "moral equivalent" of the (fictitious) `openremote`
> function below:
>
>     my $handle = openremote( 'http://some.domain.org/huge.tsv' ) or die $!;
>     [...]
>
> IOW, I'm looking for a way to open a read handle to a remote file
> so that I can read from it *line-by-line*. (Typically this file
> will be larger than I want to read all at once into memory. IOW,
> I want to avoid solutions based on stuffing the value returned by
> LWP::Simple::get into an IO::String.)
>
> I'm sure this is really basic stuff, but I have not been able to
> find it after a lot of searching.

Another poster replied:
> I very much doubt that HTTP supports such a line-by-line retrieval.

Not line-by-line (files don't support that either on most platforms),
but byte-ranges are supported by HTTP/1.1. Whether the server supports
it for the file is another question, but most servers do for files
stored in the file system (but not dynamically created content).
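
A byte-range request of that kind could look roughly like the following. This is only a minimal sketch, assuming LWP::UserAgent is installed; the helper names `range_header_value` and `fetch_byte_range` are made up for illustration. A server that honors the range answers 206 Partial Content, while one that ignores it typically answers 200 with the whole body.

```perl
use strict;
use warnings;

# Build the value of an HTTP/1.1 Range header for an inclusive byte span.
# (hypothetical helper, not part of LWP)
sub range_header_value {
    my ( $first, $last ) = @_;
    die "invalid range\n" if $first < 0 || $last < $first;
    return "bytes=$first-$last";
}

# Fetch bytes $first..$last of $url and return them; dies on HTTP failure.
# (hypothetical helper; LWP is loaded lazily, only when a request is made)
sub fetch_byte_range {
    my ( $url, $first, $last ) = @_;
    require LWP::UserAgent;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get( $url, Range => range_header_value( $first, $last ) );
    die $res->status_line, "\n" unless $res->is_success;

    # 206 means the range was honored; 200 means the full body was sent.
    warn "server ignored the Range header\n" unless $res->code == 206;
    return $res->content;
}
```

For example, `fetch_byte_range( 'http://some.domain.org/huge.tsv', 0, 1023 )` would ask for the first 1024 bytes, since both ends of the range are inclusive.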

But I associate "line-by-line" with sequential access, not random
access, and you are of course always free to process the response in
little chunks as you receive it (see "Handlers" in LWP::UserAgent for a
standard way of doing this).
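
Processing the response chunk by chunk and cutting it into lines might be sketched as below. This is a rough illustration, not a tested library: it uses the `:content_cb` option of LWP::UserAgent's `get` (a close relative of the handlers mechanism) so that the body is not accumulated in memory, and the names `make_line_splitter` and `read_url_lines` are hypothetical.

```perl
use strict;
use warnings;

# Returns two closures: $feed->($chunk) buffers data and calls $on_line once
# per complete line (newline removed); $flush->() delivers a final line that
# has no trailing newline.
sub make_line_splitter {
    my ($on_line) = @_;
    my $buffer = '';
    my $feed = sub {
        my ($chunk) = @_;
        $buffer .= $chunk;
        # Peel off complete lines; a partial line stays in the buffer.
        while ( $buffer =~ s/\A([^\n]*)\n// ) {
            my $line = $1;    # copy, so the callback cannot clobber $1
            $on_line->($line);
        }
        return 1;
    };
    my $flush = sub {
        $on_line->($buffer) if length $buffer;
        $buffer = '';
        return;
    };
    return ( $feed, $flush );
}

# Stream $url and call $on_line for every line of the body.
# (hypothetical helper; LWP is loaded lazily, only when a request is made)
sub read_url_lines {
    my ( $url, $on_line ) = @_;
    require LWP::UserAgent;
    my ( $feed, $flush ) = make_line_splitter($on_line);
    my $res = LWP::UserAgent->new->get(
        $url,
        ':content_cb' => sub { my ($chunk) = @_; $feed->($chunk) },
    );
    die $res->status_line, "\n" unless $res->is_success;

    # LWP reports an exception thrown inside the callback in this header.
    my $died = $res->header('X-Died');
    die "callback died: $died\n" if defined $died;
    $flush->();
    return;
}
```

A call such as `read_url_lines( 'http://some.domain.org/huge.tsv', sub { my ($line) = @_; ... } )` then sees one line at a time, and only the current partial line is held in memory.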

hp
 
kj

Another poster said:
> kj said:
>> I'm looking for the "moral equivalent" of the (fictitious) `openremote`
>> function below:
>>
>>     my $handle = openremote( 'http://some.domain.org/huge.tsv' ) or die $!;
>>     [...]
>>
>> IOW, I'm looking for a way to open a read handle to a remote file
>> so that I can read from it *line-by-line*. (Typically this file
>> will be larger than I want to read all at once into memory. IOW,
>> I want to avoid solutions based on stuffing the value returned by
>> LWP::Simple::get into an IO::String.)
>>
>> I'm sure this is really basic stuff, but I have not been able to
>> find it after a lot of searching.
>
> I very much doubt that HTTP supports such a line-by-line retrieval. And
> if line-by-line is not supported by the underlying protocol, then at the
> very best you can only hope for a local simulation, but at that point
> the resource has already been retrieved in full.

I was under the impression that HTTP supported incremental downloads
(some fixed number of bytes at a time); if so, a client could easily
implement a line-by-line interface to that stream... But now I
think I need to do some homework and review HTTP.

Thanks!

kj
 
kj

Peter J. Holzer said:
> ...you are of course always free to process the response in
> little chunks as you receive it (see "Handlers" in LWP::UserAgent for a
> standard way of doing this).

Thanks for this pointer! This approaches what I'm after. I'd
hoped to find a package (in some obscure corner of LWP) that already
implemented this line-oriented interface to the stream, but I guess
I'll have to write it myself. (Conceptually it's not a hard thing
to do, but IME *robust* implementations of even simple tasks like
this one can take a lot more work than one would expect.)
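
For comparison, a different technique that avoids LWP altogether is to let an external downloader do the streaming and read its output through a pipe, which yields a real filehandle. The sketch below assumes a Unix-like system with `curl` on the PATH; `open_command` is a hypothetical helper, and `openremote` is simply the fictitious name from the original question.

```perl
use strict;
use warnings;

# Open a read handle on the standard output of an external command.
# The list form of open bypasses the shell, so arguments need no quoting.
sub open_command {
    my @cmd = @_;
    open( my $fh, '-|', @cmd ) or die "cannot run $cmd[0]: $!\n";
    return $fh;
}

# Assumes curl is installed: -s silences the progress meter, -S still shows
# errors, and --fail turns HTTP errors into a nonzero exit status.
sub openremote {
    my ($url) = @_;
    return open_command( 'curl', '-sS', '--fail', $url );
}
```

The handle can then be read with the usual `while ( <$handle> ) { chomp; ... }` loop. Because it is a pipe, `close $handle` returns false when the command exited with a nonzero status (available in `$?`), so that is the place to detect a failed download.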

kj
 
