Here is my code:
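#!/usr/bin/perl
use strict;
use warnings;
use Net::HTTP;
use LWP::UserAgent;
# Identify as a browser and cap the response body size (in bytes).
my $ua = LWP::UserAgent->new(agent => 'Mozilla/4.0');
$ua->max_size(2000);
my $url = 'http://news.google.com'; # for instance, no trailing /
# Ask the server for everything from byte offset 1000 onward.
my $htmlcode = $ua->get($url, Range => 'bytes=1000-')->content;
print $htmlcode;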
#################
First, if I set max_size to 2000, 1000, or 750, the result is the
same, and I don't know why. If I set it to 400, however, the downloaded
content is much smaller.
Second, Range => 'bytes=1000-' does not seem to work. I only want to
fetch the middle part of the page, not start from the beginning. How
can I do that?
Here are the request headers I sent via nettransport:
2003-10-01 11:35:38.713 Connecting to news.google.com:80
2003-10-01 11:35:38.713 Connecting to 216.239.33.104:80
2003-10-01 11:35:38.963 Connected
2003-10-01 11:35:38.963 GET / HTTP/1.1
2003-10-01 11:35:38.963 Host: news.google.com
2003-10-01 11:35:38.963 Referer: http://news.google.com
2003-10-01 11:35:38.963 Accept: */*
2003-10-01 11:35:38.963 User-Agent: Mozilla/4.0
2003-10-01 11:35:38.963 Range: bytes=12485-
2003-10-01 11:35:38.963 Connection: close
Here are the response headers Google returned when I used nettransport:
2003-10-01 11:35:40.085 HTTP/1.1 200 OK
2003-10-01 11:35:40.085 Date: Wed, 01 Oct 2003 15:35:38 GMT
2003-10-01 11:35:40.085 Server: GWS/2.1
2003-10-01 11:35:40.085 Content-length: 67700
2003-10-01 11:35:40.085 Cache-control: no-cache, must-revalidate
2003-10-01 11:35:40.085 Expires: Fri, 01 Jan 1990 00:00:00 GMT
2003-10-01 11:35:40.085 Pragma: no-cache
2003-10-01 11:35:40.085 Last-Modified: Wed, 01 Oct 2003 15:30:28 GMT
2003-10-01 11:35:40.085 Content-Type: text/html
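The response above is a plain 200 OK with the full Content-length and no Content-Range header, which suggests the server ignored the Range request and sent the whole page. If that is the case, one possible workaround is to cut the wanted bytes out of the full body on the client side. The following is only a minimal sketch under that assumption: slice_range is a hypothetical helper (not part of LWP), and it handles only the simple single-range forms.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of LWP): apply one "bytes=..." range
# to a body that has already been downloaded in full.
# Returns the selected bytes, or undef if the range cannot be satisfied
# or uses syntax this sketch does not handle (e.g. multiple ranges).
sub slice_range {
    my ($body, $range) = @_;
    my $len = length $body;

    # Forms "bytes=first-" and "bytes=first-last" (last is inclusive).
    if ($range =~ /^bytes=(\d+)-(\d*)$/) {
        my ($first, $last) = ($1, $2);
        return if $first >= $len;                  # starts past the end
        $last = $len - 1 if $last eq '' || $last >= $len;
        return if $last < $first;                  # malformed range
        return substr($body, $first, $last - $first + 1);
    }

    # Form "bytes=-n": the final n bytes of the body.
    if ($range =~ /^bytes=-(\d+)$/) {
        my $n = $1 > $len ? $len : $1;
        return if $n == 0;
        return substr($body, $len - $n);
    }

    return;                                        # unsupported syntax
}

# Possible use with the script above (assumes max_size is NOT set, so the
# whole body is available when the server answers 200 instead of 206):
#   my $res  = $ua->get($url, Range => 'bytes=1000-');
#   my $part = $res->code == 206
#            ? $res->content
#            : slice_range($res->content, 'bytes=1000-');
```

This does not save any bandwidth, since the whole page is still transferred. It only yields the same bytes a server that honoured the Range header would have returned with a 206 Partial Content response.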