get($url) of BETSIE parser.

P.R.Brady

I am having problems reading a particular web page on our site,
http://www.bangor.ac.uk/cgi-bin/textonlyparser.pl

If it is reached with a real browser from another page, e.g.
http://www.bangor.ac.uk/, by clicking 'text only version' in the top right
corner, it works fine.
If I paste the URL into the browser, or access it with LWP::UserAgent
(see demo code below), it fails with a 500 Internal Server Error.

The file on the server is a Perl script running the BBC's BETSIE parser
which takes a standard page and processes it on the fly to make it more
acceptable for the visually impaired. The script has references to
environment variables like $ENV{'SERVER_NAME'} and $ENV{'SCRIPT_NAME'}
which I infer are used to grab the page to be parsed. I think they are
missing when it fails.

I'd appreciate any insight into how those variables get set by the
browser and how I can emulate it.

Regards
Phil Brady

#---------------------------
use strict;
use warnings;
use LWP::UserAgent;

my $url='http://www.bangor.ac.uk/cgi-bin/textonlyparser.pl';

#open the browser
my $browser = LWP::UserAgent->new;

#try to get the url:

my $response = $browser->get($url);
print 'Response:',$response->is_success,"\n";
print 'Status line:',$response->status_line,"\n";
print 'Content Type:',$response->content_type,"\n";
print 'Base: ',$response->base,"\n";
 
Gunnar Hjalmarsson

P.R.Brady said:
I am having problems reading a particular web page on our site,
http://www.bangor.ac.uk/cgi-bin/textonlyparser.pl

If it is referenced with a real browser from another page eg
http://www.bangor.ac.uk/ by clicking 'text only version' in top
right corner, it works fine.
If I paste the url into the browser, or access it with
LWP::UserAgent (See demo code below) it fails 500 Internal Server
Error.

So, they are requiring a particular HTTP_REFERER.

P.R.Brady said:
my $response = $browser->get($url);

Try:

my $response =
$browser->get($url, Referer => 'http://www.bangor.ac.uk/');
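To check what is actually sent, the request can be built explicitly and printed before it goes out. This is only a sketch: the Referer value is the guess made above, build_request is a helper name invented for illustration, and the --send switch is an assumption added so that nothing touches the network unless asked. Printing the body of a failed response is often useful, since a 500 page sometimes carries the script's own error text.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::UserAgent;

# URLs taken from the thread; the Referer value is a guess, not a known requirement.
my $url     = 'http://www.bangor.ac.uk/cgi-bin/textonlyparser.pl';
my $referer = 'http://www.bangor.ac.uk/';

# Hypothetical helper: build a GET request, adding a Referer header if one is given.
sub build_request {
    my ($url, $referer) = @_;
    my $request = HTTP::Request->new(GET => $url);
    $request->header(Referer => $referer) if defined $referer;
    return $request;
}

my $request = build_request($url, $referer);

# Show the request line and headers exactly as they will be sent.
print $request->as_string, "\n";

# Only contact the server when run with --send (an assumption of this sketch).
if (@ARGV && $ARGV[0] eq '--send') {
    my $response = LWP::UserAgent->new->request($request);
    print 'Status line: ', $response->status_line, "\n";
    # On failure, the response body may contain the script's error message.
    print $response->decoded_content, "\n" unless $response->is_success;
}
```

Calling build_request($url) with no second argument gives the request without a Referer, which makes it easy to compare the two cases.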
 
Tad McClellan

P.R.Brady said:
I am having problems reading a particular web page on our site,
                                                   ^^^^^^^^^^^

(See demo code below) it fails 500 Internal Server Error.


What did it say in your server error log?

Did the CGI program run OK when you tried running it from the command line
rather than in the CGI environment?
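One way to try the command-line test is to fake the CGI environment before running the script. The following is a rough sketch, not a description of how BETSIE expects to be called: the local path ./textonlyparser.pl, the helper name cgi_env and the particular variable values are all assumptions. It runs the script twice, without and with HTTP_REFERER, so the two exit statuses and any error output can be compared.

```perl
use strict;
use warnings;

# Hypothetical local path to the CGI script; adjust to the real location.
my $script = './textonlyparser.pl';

# Hypothetical helper: a minimal CGI-style environment layered over the
# current one. Values are guesses based on the URLs in the thread.
sub cgi_env {
    my (%override) = @_;
    return (
        %ENV,
        GATEWAY_INTERFACE => 'CGI/1.1',
        REQUEST_METHOD    => 'GET',
        SERVER_NAME       => 'www.bangor.ac.uk',
        SERVER_PORT       => '80',
        SCRIPT_NAME       => '/cgi-bin/textonlyparser.pl',
        QUERY_STRING      => '',
        %override,
    );
}

if (-e $script) {
    {
        # Run 1: no Referer, as when the URL is pasted into the browser.
        local %ENV = cgi_env();
        delete $ENV{HTTP_REFERER};
        system($^X, $script);
        print STDERR "exit status without referer: ", $? >> 8, "\n";
    }
    {
        # Run 2: with a Referer, as when following the link from the home page.
        local %ENV = cgi_env(HTTP_REFERER => 'http://www.bangor.ac.uk/');
        system($^X, $script);
        print STDERR "exit status with referer: ", $? >> 8, "\n";
    }
}
else {
    warn "$script not found; nothing was run\n";
}
```

Any message the script prints to STDERR during these runs is likely to be the same text that ends up in the server error log.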

I'd appreciate any insight into how those variables get set by the
browser


Server environment variables are not set by the browser, they
are set by the server.
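To see this on a given server, a small diagnostic CGI script can print the environment it receives. The sketch below is an illustration only; split_cgi_env is a name made up here. Under the CGI convention, names beginning with HTTP_ (such as HTTP_REFERER) are built by the server from the request headers the client sent, while names such as SERVER_NAME and SCRIPT_NAME come from the server's own configuration and the requested path.

```perl
use strict;
use warnings;

# Hypothetical helper: split a CGI environment into variables derived from
# request headers (HTTP_* names) and everything else.
sub split_cgi_env {
    my ($env) = @_;
    my (%from_headers, %other);
    for my $name (keys %$env) {
        if ($name =~ /^HTTP_/) { $from_headers{$name} = $env->{$name} }
        else                   { $other{$name}        = $env->{$name} }
    }
    return (\%from_headers, \%other);
}

my ($from_headers, $other) = split_cgi_env(\%ENV);

# Plain-text CGI response listing both groups.
print "Content-Type: text/plain\n\n";
print "Derived from request headers:\n";
print "  $_=$from_headers->{$_}\n" for sort keys %$from_headers;
print "Set by the server (or inherited from its environment):\n";
print "  $_=$other->{$_}\n" for sort keys %$other;
```

Requesting such a script once by following a link and once by pasting the URL should show HTTP_REFERER present in the first case and absent in the second. A script like this exposes server details, so it should be removed once the test is done.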
 
