How to check the new URL of a redirected page

Discussion in 'Perl Misc' started by zawszedamian_p@gazeta.pl, Dec 5, 2007.

  1. Guest

    I read a webpage using HTTP::Response but the page was redirected - how
    can I read the new URL?
    I tried HTTP::Response->base() but it returns the original URL.

    Thanks
    , Dec 5, 2007
    #1

  2. Ben Morrow Guest

    Quoth :
    >
    > I read a webpage using HTTP::Response but the page was redirected - how
    > can I read the new URL?
    > I tried HTTP::Response->base() but it returns the original URL.


    If you mean a proper HTTP redirect rather than an HTML meta-refresh or
    something more evil in JavaScript,

    $response->header('Location');

    However, LWP::UserAgent will follow redirects by default, so unless
    you've turned it off this won't help :(. If the page is HTML with a
    meta-refresh, you will need to parse it with e.g. HTML::Parser and
    extract the <meta> elements, and find the one with the refresh in. If
    it's using JS, you're out of luck, unless the pages you are working with
    have similar pieces of JS every time and you can see how to extract the
    URL.
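
For the case where LWP::UserAgent has already followed the redirect, a rough sketch: the final response keeps a reference to the request that actually produced it, and the intermediate 3xx responses stay reachable through the previous chain. The helper names final_url and redirect_chain are made up for illustration, and the network part only runs if a URL is passed on the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# URL of the request that produced this (final) response, or undef.
# final_url is a hypothetical helper name, not part of LWP.
sub final_url {
    my ($response) = @_;
    my $request = $response->request or return;
    return $request->uri->as_string;
}

# URLs of the redirect responses that led here, oldest first.
# redirect_chain is likewise a hypothetical helper name.
sub redirect_chain {
    my ($response) = @_;
    return map { $_->request->uri->as_string } $response->redirects;
}

# Demo: does a real fetch only when a URL is given as an argument.
if (@ARGV) {
    my $ua       = LWP::UserAgent->new;
    my $response = $ua->get($ARGV[0]);
    my $final    = final_url($response);
    print "via:   $_\n" for redirect_chain($response);
    print "final: ", (defined $final ? $final : '(unknown)'), "\n";
}
```

If the chain comes back empty and the final URL equals the one requested, no HTTP-level redirect happened, which points to a meta-refresh or JavaScript instead.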

    Ben
    Ben Morrow, Dec 5, 2007
    #2

  3. On Dec 5, 7:26 am, Ben Morrow <> wrote:
    > Quoth :
    >
    >
    >
    > > I read a webpage using HTTP::Response but the page was redirected - how
    > > can I read the new URL?
    > > I tried HTTP::Response->base() but it returns the original URL.

    >
    > If you mean a proper HTTP redirect rather than an HTML meta-refresh or
    > something more evil in JavaScript,
    >
    > $response->header('Location');
    >
    > However, LWP::UserAgent will follow redirects by default, so unless
    > you've turned it off this won't help :(.
    > ...


    A possibly more convenient alternative to turning off redirects
    entirely is LWP::UserAgent's simple_request method, which does not
    follow redirects:

    my $resp = $ua->simple_request($request);
    if ( $resp->code == 302 ) {
        $uri = URI->new($resp->header('Location'));
        ...
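
A slightly fuller sketch of the same idea, under a couple of assumptions: it treats any 3xx status as a redirect via is_redirect rather than checking only for 302, and it resolves a relative Location value against the URL that was requested. redirect_target is a hypothetical helper name, and the real request is made only when a URL is given on the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use URI;

# Absolute redirect target of a single response, or undef if the
# response is not a redirect or carries no Location header.
# redirect_target is a hypothetical helper name, not part of LWP.
sub redirect_target {
    my ($resp) = @_;
    return undef unless $resp->is_redirect;    # any 3xx status
    my $location = $resp->header('Location');
    return undef unless defined $location;

    # Location may be relative; resolve it against the request URL.
    my $req = $resp->request;
    my $uri = $req ? URI->new_abs($location, $req->uri)
                   : URI->new($location);
    return $uri->as_string;
}

# Demo: does a real fetch only when a URL is given as an argument.
if (@ARGV) {
    my $ua     = LWP::UserAgent->new;
    my $resp   = $ua->simple_request(HTTP::Request->new(GET => $ARGV[0]));
    my $target = redirect_target($resp);
    print defined $target
        ? "redirects to $target\n"
        : "no redirect (status " . $resp->code . ")\n";
}
```

Since simple_request handles exactly one request, following a chain of redirects this way means calling it again with each new target.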

    --
    Charles DeRykus
    comp.llang.perl.moderated, Dec 5, 2007
    #3
  4. Ted Zlatanov Guest

    On Thu, 6 Dec 2007 02:53:01 +0100 "Petr Vileta" <> wrote:

    PV> HTML::Parser is "too big a gun for a small rabbit" :) For
    PV> meta-element-based redirections, something like this works:

    PV> # I assume the HTML page is in the variable $content
    PV> $content=~s/^.?(<meta\s+?HTTP-EQUIV=.REFRESH..+?>).+$/$1/si;
    PV> $content=~s/^.+?url=(.+?)[\'\">]

    PV> Now $content contains the new URL.

    This is like using a garrote wire to catch and strangle the rabbit :)

    Ted
    Ted Zlatanov, Dec 6, 2007
    #4
  5. Ben Morrow Guest

    Quoth "Petr Vileta" <>:
    > Sorry Ben, please do not kill me, but HTML::Parser is "too big a gun for a
    > small rabbit" :)
    > For meta-element-based redirections, something like this works:
    >
    > # I assume the HTML page is in the variable $content


    my $content = <<HTML;
    <html>
    <head>
    <!-- <meta HTTP-EQUIV="REFRESH" url="some/fake/url"> -->
    < META content=10;url=foo http-equiv=refresh>
    </head>
    <body>
    Hello world!
    </body>
    </html>
    HTML

    > $content=~s/^.?(<meta\s+?HTTP-EQUIV=.REFRESH..+?>).+$/$1/si;
    > $content=~s/^.+?url=(.+?)[\'\">]


    This line is not valid Perl.

    > Now $content contains the new URL.


    No, it doesn't.

    LWP::UserAgent will parse the <head> section of a text/html document for
    you, and return the http-equiv headers in with the real HTTP headers.
    For this purpose it uses HTML::HeadParser, which, guess what, is a
    subclass of HTML::Parser. This means that a refresh can be detected with

    $response->header('refresh');
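
The header value still has to be split into a delay and a URL. A minimal sketch, assuming the common "seconds;url=target" form (as in "10;url=foo") and using HTML::HeadParser directly so that it runs without a network connection. parse_refresh and refresh_from_html are hypothetical helper names, and the regex is only meant for that common form.

```perl
use strict;
use warnings;
use HTML::HeadParser;
use URI;

# Split a Refresh value such as "10;url=foo" into (delay, absolute URL).
# Returns an empty list if the value does not look like a refresh, and
# (delay, undef) if there is a delay but no URL part.
# parse_refresh is a hypothetical helper name.
sub parse_refresh {
    my ($value, $base) = @_;
    return unless defined $value;
    my ($delay, $url) =
        $value =~ /^\s*(\d+)\s*(?:[;,]\s*url\s*=\s*(.*?))?\s*$/i
        or return;
    return ($delay, undef) unless defined($url) && length($url);

    $url =~ s/^(['"])(.*)\1$/$2/;    # drop surrounding quotes, if any
    my $uri = defined $base ? URI->new_abs($url, $base) : URI->new($url);
    return ($delay, $uri->as_string);
}

# Pull the http-equiv refresh out of an HTML document's <head>.
# refresh_from_html is a hypothetical helper name.
sub refresh_from_html {
    my ($html, $base) = @_;
    my $parser = HTML::HeadParser->new;
    $parser->parse($html);
    return parse_refresh(scalar $parser->header('Refresh'), $base);
}
```

With an LWP response the same helper could be fed directly, for example parse_refresh($response->header('Refresh'), $response->base), so that a relative target is resolved against the page's own URL.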

    Ben
    Ben Morrow, Dec 6, 2007
    #5
