LWP::Simple get (why it would stop working along with wget when fetch still works)

Discussion in 'Perl Misc' started by rockerd@gmail.com, Jul 18, 2007.

  1. Guest

    Hi Perl People,
    Something recently changed on a site that I was fetching and parsing
    with LWP::Simple.
    Here is the thing: for the longest time I was using get() to grab an
    http: page and store it in a scalar, which I parsed later. Suddenly I
    get an empty but defined scalar from: $html = get($url);

    More: when I use fetch on a FreeBSD system, it pulls the page to text
    without any problems, but when I use wget on a Linux system I get a
    blank file. Everything used to work. I tried changing my User-Agent
    header and have had no luck. The only thing I can see is that the
    file has an unknown length, but I don't know what to do about it.

    Thanks for the advice,
    Rocker
    Guest, Jul 18, 2007
    #1

  2. Re: LWP::Simple get (why it would stop working along with wget when fetch still works)

    wrote:
    > Something recently changed on a site that I was fetching and parsing
    > from with lwp::simple.
    > Here is the thing: For the longest time I was using get() to grab a
    > http: site and store it in a scalar which I parsed later. Suddenly I
    > get an empty but defined scalar with: $html = get($url);


    Maybe the web server doesn't like requests that are generated by Perl
    (by default LWP identifies itself with a "libwww-perl/x.xx" User-Agent).
    :( You may want to try without sending a client identifier:

    use LWP::UserAgent;
    my $ua = LWP::UserAgent->new;
    $ua->agent(''); # <- This line may make a difference
    my $response = $ua->get('http://www.perl.org/');
    print $response->content;

    --
    Gunnar Hjalmarsson
    Email: http://www.gunnar.cc/cgi-bin/contact.pl
    Gunnar Hjalmarsson, Jul 18, 2007
    #2

  3. On 2007-07-18 02:17, <> wrote:
    > Something recently changed on a site that I was fetching and parsing
    > from with lwp::simple.
    > Here is the thing: For the longest time I was using get() to grab a
    > http: site and store it in a scalar which I parsed later. Suddenly I
    > get an empty but defined scalar with: $html = get($url);


    Use LWP::Simple only if you are absolutely sure that you never need the
    return code or headers. LWP::UserAgent is almost always the better
    choice, especially if you have to handle errors or strange behaviour.
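
    A minimal sketch of that advice, not a definitive implementation: it
    prints the status line and a couple of headers that LWP::Simple's get()
    never exposes, which is usually enough to tell an error or a block from
    a genuinely empty page. The helper name describe_response, the 30-second
    timeout, and taking the URL from the command line are assumptions made
    for illustration only.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

# Hypothetical helper: summarize the parts of a response that
# LWP::Simple's get() hides (status line, headers, body size).
sub describe_response {
    my ($response) = @_;
    my $length = $response->header('Content-Length');
    return join "\n",
        'Status: ' . $response->status_line,
        'Content-Type: ' . ($response->header('Content-Type') // '(none)'),
        'Content-Length: ' . (defined $length ? $length : '(not sent)'),
        'Body bytes received: ' . length($response->content);
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);   # timeout value is arbitrary
    my $response = $ua->get($ARGV[0]);
    print describe_response($response), "\n";
    warn "Request did not succeed\n" unless $response->is_success;
}
```

    Run it with the problem URL as the only argument. A missing
    Content-Length together with zero body bytes would match the "unknown
    length" symptom described above, while a 403 or a redirect in the status
    line would point to blocking instead.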


    > More: when I use fetch on a freebsd system it pulls the page to text
    > without any problems but when I use wget on a linux system I get a
    > blank file. Everything used to work. I tried changing my user-agent
    > headers and have had no luck.


    Is "a linux system" the system where the script normally runs, and "a
    freebsd system" a different system? It might be that the owner of the
    site noticed that you are automatically retrieving data and is now
    blocking your IP address.

    hp


    --
    _ | Peter J. Holzer | I know I'd be respectful of a pirate
    |_|_) | Sysadmin WSR | with an emu on his shoulder.
    | | | |
    __/ | http://www.hjp.at/ | -- Sam in "Freefall"
    Peter J. Holzer, Jul 21, 2007
    #3
