Hon Guin Lee - Web Producer - SMI Marketing
Hi All,
I am having a problem: the LWP get() function cannot retrieve unlocalised content (URLs that begin with www) when given the URL entered on a web form I created. Testing with Mozilla 1.1, it just cannot retrieve the web document (it fails and returns undef within the subroutine get_url), but for localised web content on the local web server it retrieves most web documents with no problem.
Looking at the problem from a different perspective, I tried other functions such as getstore(url,file) and mirror(url,file), with shift as the url and a fixed filename. With LWP::Debug switched on, the web browser just shows internal server errors, which suggests a proxy is required: -
--------------------------------------------------------------------------
LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://sunweb.central.sun.com
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Found
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://sunweb.central.sun.com/redirect.jsp
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 57 bytes
LWP::UserAgent::request: Simple response: Found
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://sunweb.central.sun.com/location.jsp
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 19 bytes
LWP::UserAgent::request: Simple response: Found
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://sunweb.central.sun.com/redirect.jsp?location=Non-US
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 57 bytes
LWP::UserAgent::request: Simple response: Found
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://sunweb.central.sun.com/cachedir/cachedtab_Non-US_NEWS.html
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Internal Server Error 500
-- This is for a localised URL.
--------------------------------------------------------------------------
LWP::UserAgent::new: ()
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://www.sun.com
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Internal Server Error 500
-- This is for a URL that begins with www.
--------------------------------------------------------------------------
Here is the script: -
#!/usr/local/perl5.6/bin/perl -wT
# perl script to get remote
# urls and strip them and
# upload them to teamsite
use LWP::Simple qw(!head);
use LWP::Debug '+';
use CGI qw(:standard); # then only CGI.pm defines a head()
use strict;
print "Content-type: text/html\n\n";
my $old_handle;
$|++; #sets $| for STDOUT
$old_handle = select( STDERR ); #change to STDERR
$|++; #sets $| for STDERR
select( $old_handle ); #change back to STDOUT
my ($url) = @_;
my $lang;
process_form();
get_url($url);
# Passes the data from the server,
# and takes them onto the PERL script.
sub process_form {
    $url = param('url');
    $url = "http://$url";
    $lang = param('lang');
}
# Retrieves the contents of the
# specified URL.
sub get_url {
    my $page = getstore(shift,'hon.html');
    unless (defined $page) {
        print "Couldn't retrieve $url";
    }
    else {
        print "$page\n";
    }
}
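
One detail about the script above: getstore() in LWP::Simple returns the numeric HTTP status code rather than the page content, so the defined() test in get_url cannot tell a failed fetch from a successful one. Below is a minimal sketch of checking that status instead. The helper name fetch_to_file is hypothetical, and the env_proxy call only helps under the assumption that the web server's environment actually sets http_proxy (a CGI process may not inherit it); whether a proxy is really the cause here is not confirmed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple qw($ua getstore is_success);
use HTTP::Status qw(status_message);

# Assumption: proxy settings, if any, live in environment variables such as
# http_proxy and no_proxy. env_proxy is a no-op when none are set.
$ua->env_proxy;

# Hypothetical helper: store $url in $file and describe the outcome.
sub fetch_to_file {
    my ($url, $file) = @_;
    my $status = getstore($url, $file);    # numeric HTTP status code, never undef
    return is_success($status)
        ? "Stored $url in $file (status $status)"
        : "Couldn't retrieve $url: $status " . status_message($status);
}

# Only fetch when a URL is given on the command line.
print fetch_to_file($ARGV[0], 'hon.html'), "\n" if @ARGV;
```

Run from the command line with a URL as the first argument, this prints either the stored-file message or the status code and its standard reason phrase, which may narrow down where the 500 comes from.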