A couple of vague LWP questions

Franklin H.

1) When using LWP::Simple to grab a webpage the GET request
occasionally and irreproducibly appears to hang and does not return.
Any clue as to why this could conceivably occur? There doesn't appear
to be a way to set the request timeout with this particular module but
perhaps someone may know of a workaround?

2) When using LWP::UserAgent to grab the same webpage as above, the
webserver somehow seems able to recognize the request as coming from
an "automated tool". Any idea why this might occur with
LWP::UserAgent but not with LWP::Simple?

TYIA,
Fr.
 
Franklin H.

2) When using LWP::UserAgent to grab the same webpage as above, the
webserver somehow seems able to recognize the request as coming from
an "automated tool". Any idea why this might occur with
LWP::UserAgent but not with LWP::Simple?

It would appear that the trick here is to set the user agent string to
something other than the default "libwww-perl/#.##". Arbitrarily I chose:

$ua->agent('Mozilla/5.001');
 
Brian Wakem

Franklin said:
It would appear that the trick here is to set the user agent string to
something other than the default "libwww-perl/#.##". Arbitrarily I chose:

$ua->agent('Mozilla/5.001');


If you are trying to blend in with normal traffic then I suggest using -

Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

- which is IE6 on Windows XP.


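The answer to your other question: either use LWP::UserAgent with the
timeout method it provides ( $ua->timeout( $secs ) ), or keep LWP::Simple
and use alarm.

use LWP::Simple;

my $response;
eval {
    local $SIG{ALRM} = sub { die "timeout\n" };   # runs when the alarm expires
    alarm $secs;                # $secs and $url are set elsewhere
    $response = get($url);
    alarm 0;                    # cancel the alarm once get() returns
};
if ($@ =~ m/timeout/) {
    # timed out
}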
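
The first option, $ua->timeout, needs no signal handler, so it does not depend on alarm behaving the same way on every platform. A minimal sketch, assuming a placeholder URL and a 10-second limit (neither value comes from this thread; fetch is a hypothetical helper name). Note that LWP's timeout is an inactivity timeout on the connection, not a cap on the total request time.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical values for illustration only.
my $url  = 'http://www.example.com/';
my $secs = 10;

my $ua = LWP::UserAgent->new;
$ua->timeout($secs);    # abort if the connection shows no activity for $secs seconds

# Return the page body on success, undef on any failure (including a timeout).
sub fetch {
    my ($agent, $target) = @_;
    my $response = $agent->get($target);
    return $response->is_success ? $response->decoded_content : undef;
}

# Usage (performs a real network request):
# my $html = fetch($ua, $url);
# warn "request failed or timed out\n" unless defined $html;
```

On failure the response object still carries a status line ($response->status_line), which can help tell a timeout apart from other errors.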
 
Franklin H.

Well, I am trying to make this platform independent and as such would
hate to run into problems with $SIG{ALRM} on XP.

Similarly, mightn't "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
be suspicious if the request came from a Linux machine?
 
Charles DeRykus

1) When using LWP::Simple to grab a webpage the GET request
occasionally and irreproducibly appears to hang and does not return.
Any clue as to why this could conceivably occur? There doesn't appear
to be a way to set the request timeout with this particular module but
perhaps someone may know of a workaround?

LWP::Simple is built on LWP::UserAgent, so you can import
$ua and set a timeout, e.g.:

use LWP::Simple qw($ua get); $ua->timeout(10);

See the LWP::Simple docs for a discussion of the above.

2) When using LWP::UserAgent to grab the same webpage as above, the
webserver somehow seems able to recognize the request as coming from
an "automated tool". Any idea why this might occur with
LWP::UserAgent but not with LWP::Simple?

Some servers may be checking the user agent id. No idea why
LWP::Simple would slip by if that's the case. Again, see the
LWP::UserAgent and LWP::Simple docs on how to alter the setting.

hth,
 
Mark Clements

Franklin said:
Well, I am trying to make this platform independent and as such would
hate to run into problems with $SIG{ALRM} on XP.

Similarly, mightn't "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
be suspicious if the request came from a Linux machine?

Nah. The remote server only sees an HTTP request: it has no idea from
what type of system the request originated, other than what is in the
HTTP headers.

Mark
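
To make that concrete, here is a small sketch that builds a request and prints it without sending anything. The URL is a placeholder assumed for illustration; the agent string is the one suggested earlier in the thread. LWP may add a few more headers of its own (such as Host) when the request is actually sent, so this shows only what is set explicitly.

```perl
use strict;
use warnings;
use HTTP::Request;

# Placeholder URL, assumed for illustration.
my $request = HTTP::Request->new(GET => 'http://www.example.com/');
$request->header('User-Agent' => 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');

# Print the request line and headers; nothing is sent over the network.
print $request->as_string;
```

The User-Agent value is whatever the script chooses to put there, regardless of the machine it runs on.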
 
Joe Smith

Charles said:
Some servers may be checking the user agent id. No idea why
LWP::Simple would slip by if that's the case.

perldoc LWP::UserAgent
the default agent identifier is "libwww-perl/#.##"

Line 43 of LWP/Simple.pm
$ua->agent("LWP::Simple/$LWP::VERSION");
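
A quick way to check what a given agent object will announce, and to change it, might look like the sketch below. It makes no network request; the replacement string is the one Franklin chose earlier in the thread, and $default is just an illustrative variable name.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# With no argument, agent() returns the current identifier.
my $default = $ua->agent;
print "default: $default\n";    # "libwww-perl/" followed by the installed version

# With an argument, agent() replaces the identifier.
$ua->agent('Mozilla/5.001');
print "custom:  ", $ua->agent, "\n";
```

A server that filters on one of these strings but not the other would treat LWP::UserAgent and LWP::Simple differently, which is consistent with what Franklin saw.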
 