How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

Discussion in 'Perl Misc' started by CronJob, Mar 18, 2009.

  1. CronJob

    CronJob Guest

    How can I keep LWP::UserAgent from adding the http-equiv strings from
    the Head section of the page? When I run the program below, the
    $headers variable contains three Content-Type: listings, including one
    from the actual HTTP header and one from the meta tag in the web page.

    #!/usr/bin/perl -w

    use LWP::UserAgent;
    use HTML::Parse;
    use HTML::Element;
    use HTTP::Response;
    use HTTP::Request;
    use HTTP::Status;
    use URI::URL;

    my ($code, $desc, $headers, $body)=&makeRequest('GET', 'http://www.google.com');
    print "The headers:\n$headers\n";
    print "The body:\n$body\n";

    sub makeRequest {
    my ($method, $path) = @_;
    # create a user agent object
    my $ua = new LWP::UserAgent;
    $ua->agent("Mozilla/4.0");

    # request a url
    my $request = new HTTP::Request($method, $path);
    # send the request; the result is an HTTP::Response object
    my $response = $ua->request($request);

    # get the details if there is an error
    # otherwise parse the response object
    my $body=$response->content;
    my $code=$response->code;
    my $desc=HTTP::Status::status_message($code);
    my $headers=$response->headers_as_string;
    $body = $response->error_as_HTML if ($response->is_error);
    return ($code, $desc, $headers, $body);
    }
     
    CronJob, Mar 18, 2009
    #1

  2. CronJob

    CronJob Guest

    Re: How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

    On Mar 18, 5:17 pm, Ben Morrow wrote:
    > Quoth CronJob:
    >
    > > How can I keep LWP::UserAgent from adding the http-equiv strings from
    > > the Head section of the page? When I run the following program below,
    > > the $headers variable contains three Content-Type: listings. One from
    > > the actual http header and one from the meta tag in the web page.

    >
    > See the ->parse_head method of LWP::UserAgent.
    >
    > You might want to try reading the docs of the modules you are using.
    >
    > Ben


    Yes, I agree with you. Unfortunately for me, I find the form used in
    the Perl documentation abstruse. I learn by working with example
    code, not by reading abstract discussions of code that contain no
    working examples. Hopefully it will come to me over time. I had the
    same issue with man pages years ago, but now it's second nature. I
    appreciate your response and I will look through the documentation
    carefully.
     
    CronJob, Mar 19, 2009
    #2

  3. CronJob

    CronJob Guest

    Re: How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

    Thank you Ben.

    I ran 'perldoc LWP' and found:

    The class name for the user agent is "LWP::UserAgent".
    <snip>
    · The parse_head specifies whether we should initialize
    response headers from the <head> section of HTML documents.

    Running 'perldoc LWP::UserAgent' I see that:

    $ua = LWP::UserAgent->new( %options )
    This method constructs a new "LWP::UserAgent" object and
    returns it. Key/value pair arguments may be provided to
    set up the initial state. The following options
    correspond to attribute methods described below:

    KEY          DEFAULT
    -----------  --------------------
    parse_head   1


    I now realize that the 1 is implicitly a boolean value, and hence that
    0 should do the trick for me.

    Working code:

    #!/usr/bin/perl -w

    use strict;
    use LWP::UserAgent;
    use HTML::Parse;
    use HTML::Element;
    use HTTP::Response;
    use HTTP::Request;
    use HTTP::Status;
    use URI::URL;

    my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';
    my ($code, $desc, $headers,$body) = &LWPUserAgentRequest('GET','http://www.google.com');
    print "The headers:\n$headers\n";
    print "The body:\n$body\n";

    sub LWPUserAgentRequest {
    my ($method, $path) = @_;
    my $ua = new LWP::UserAgent;
    $ua->agent($ie7UAString);
    $ua->parse_head(0);
    my $request = new HTTP::Request($method, $path);
    my $response = $ua->request($request);
    my $body = $response->content;
    $body = $response->error_as_HTML if ($response->is_error);
    my $code = $response->code;
    my $desc = HTTP::Status::status_message($code);
    my $headers = $response->headers_as_string;
    return ($code, $desc, $headers, $body);
    }
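
    A minimal sketch of the same setting passed to the constructor instead, following the KEY/DEFAULT table quoted above. The variable names are illustrative, and no request is sent; the script only builds the user agent and reports the setting.

    ```perl
    use strict;
    use warnings;
    use LWP::UserAgent;

    # Same user-agent string as in the working code above.
    my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';

    # parse_head => 0 in the constructor has the same effect as calling
    # $ua->parse_head(0) afterwards: <meta http-equiv=...> tags in the
    # page's <head> are no longer copied into the response headers.
    my $ua = LWP::UserAgent->new(
        agent      => $ie7UAString,
        parse_head => 0,
    );

    # Called without arguments, parse_head reports the current setting.
    print "parse_head is ", ( $ua->parse_head ? "on" : "off" ), "\n";
    ```

    Setting the option at construction time keeps all of the agent's configuration in one place, which can help when several subs share the same user agent.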
     
    CronJob, Mar 19, 2009
    #3
  4. J. Gleixner

    J. Gleixner Guest

    Re: How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

    CronJob wrote:
    [...]
    > Working code:
    >
    > #!/usr/bin/perl -w
    >
    > use strict;
    > use LWP::UserAgent;
    > use HTML::Parse;
    > use HTML::Element;
    > use HTTP::Response;
    > use HTTP::Request;
    > use HTTP::Status;
    > use URI::URL;


    Some minor tweaks..


    Do you really need all of those?

    >
    > my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';
    > my ($code, $desc, $headers,$body) = &LWPUserAgentRequest('GET','http://www.google.com');


    Remove the '&' before LWPUserAgentRequest.

    If you add a '/' to the end of the URL, then the Web server doesn't
    have to do it for you.

    > print "The headers:\n$headers\n";
    > print "The body:\n$body\n";


    You can call print once, with a list:

    print "The headers:\n$headers\n",
    "The body:\n$body\n";
    >
    > sub LWPUserAgentRequest {
    > my ($method, $path) = @_;


    Usually, it's nice to have a blank line after initializing
    the input parameters.

    > my $ua = new LWP::UserAgent;


    my $ua = LWP::UserAgent->new();

    > $ua->agent($ie7UAString);
    > $ua->parse_head(0);
    > my $request = new HTTP::Request($method, $path);


    my $request = HTTP::Request->new( $method, $path );

    > my $response = $ua->request($request);
    > my $body = $response->content;
    > $body = $response->error_as_HTML if ($response->is_error);


    my $body = ( $response->is_error )
    ? $response->error_as_HTML
    : $response->content;

    > my $code = $response->code;
    > my $desc = HTTP::Status::status_message($code);
    > my $headers = $response->headers_as_string;


    Ya don't really need $headers, you could just return
    $response->headers_as_string, instead of $headers, below.

    > return ($code, $desc, $headers, $body);
    > }
    >
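
    Taken together, the tweaks above might look like the sketch below. The sub name fetch_page and the file: URL are illustrative assumptions, not part of the original program. The file: URL points at a path that should not exist, so LWP answers it locally with an error response and the script runs without network access; substitute 'http://www.google.com/' to repeat the original request.

    ```perl
    use strict;
    use warnings;
    use LWP::UserAgent;
    use HTTP::Request;
    use HTTP::Status;

    my $ie7UAString = 'Mozilla/5.0 (Windows; U; MSIE 7.0; Windows NT 6.0; en-US)';

    # Hypothetical URL: a local file that is assumed not to exist.
    my $url = 'file:///no/such/page.html';

    # Plain sub call, no leading '&'.
    my ( $code, $desc, $headers, $body ) = fetch_page( 'GET', $url );

    # One print call with a list of strings.
    print "Status: $code $desc\n",
          "The headers:\n$headers\n",
          "The body:\n$body\n";

    sub fetch_page {
        my ( $method, $path ) = @_;

        # Direct method calls instead of indirect object syntax.
        my $ua = LWP::UserAgent->new();
        $ua->agent($ie7UAString);
        $ua->parse_head(0);    # do not merge <head> http-equiv tags into headers

        my $request  = HTTP::Request->new( $method, $path );
        my $response = $ua->request($request);

        # Pick the body in one expression: error page or real content.
        my $body = ( $response->is_error )
            ? $response->error_as_HTML
            : $response->content;

        my $code = $response->code;

        # headers_as_string is returned directly, without a temporary.
        return ( $code, HTTP::Status::status_message($code),
                 $response->headers_as_string, $body );
    }
    ```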
     
    J. Gleixner, Mar 20, 2009
    #4
  5. Re: How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

    J. Gleixner wrote:
    > CronJob wrote:



    >> my $ua = new LWP::UserAgent;

    >
    > my $ua = LWP::UserAgent->new();


    >> my $request = new HTTP::Request($method, $path);

    >
    > my $request = HTTP::Request->new( $method, $path );



    Just in case you're wondering why this suggested change is
    a Really Good Idea, see the "Indirect Object Syntax" section in:

    perldoc perlobj


    --
    Tad McClellan
    email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"
     
    Tad J McClellan, Mar 20, 2009
    #5
  6. Re: How can I keep LWP::UserAgent from adding the http-equiv strings from the Head section of the page?

    On 2009-03-20, J. Gleixner wrote:
    > CronJob wrote:

    *SKIP*
    >> print "The headers:\n$headers\n";
    >> print "The body:\n$body\n";

    >
    > You can call print once, with a list:
    >
    > print "The headers:\n$headers\n",
    > "The body:\n$body\n";


    With such an outrageous number of newlines, I would suggest

    print <<"EOT";
    The headers:
    $headers
    The body:
    $body
    EOT

    *CUT*

    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
     
    Eric Pozharski, Mar 20, 2009
    #6