LWP hangs

Discussion in 'Perl Misc' started by yahavba@gmail.com, Mar 22, 2007.

  1. Guest

    Hi,
    I'm using LWP on Win32. Sometimes, after a period of successful
    communication, the perl process just hangs, and it seems that LWP
    has stopped. The debug messages are:

    LWP::UserAgent::_need_proxy: Not proxied
    LWP::protocol::http::request: ()

    At this stage the script stalls and I have to stop it using Task
    Manager.

    Does anyone know why this might happen?

    I'd appreciate your help!

    Thanks
    , Mar 22, 2007
    #1

  2. Jamie Guest

    Is it going through a proxy?

    Can you tell us which method it's hanging on? Sometimes
    I'll override the methods (get, post, etc.) and have them print
    the URL and the parameters they're trying to use. Then I'll go in with
    telnet and try to mimic what LWP should be doing, and work
    outward from there. The key is figuring out whether it's LWP or
    something to do with the underlying network.

    Of course, if it's SSL, things get a little tricky.

    Jamie
    --
    http://www.geniegate.com Custom web programming
    Perl * Java * UNIX User Management Solutions
    Jamie, Mar 22, 2007
    #2

  3. Guest

    Hi,
    It's hanging on get. I'm calling get on a URL that is secured (https),
    and it works for quite a long time until it suddenly stops and
    hangs.
    When you say you override the method, how exactly is that done? How can
    I verify what parameters it is trying to use?

    Thanks for your help!
    , Mar 22, 2007
    #3
  4. Guest

    Also, it's not going through a proxy (and it is using SSL)
    thanks!
    , Mar 22, 2007
    #4
  5. Jamie Guest

    Do this: perldoc -m LWP::UserAgent

    It'll give you the source code for LWP::UserAgent.

    Then override the method: in a sub, in another package, or in a variety
    of other ways.

    NOTE: untested code, this is just a "for example" thing!

    I'll probably goof this up, I'm editing "live", so beware...

    sub get_ua {
        {
            package My::Ua;
            use strict;
            use warnings;
            use base 'LWP::UserAgent';    # loads LWP::UserAgent and sets @ISA

            # Use our bugged version to snoop in on things: print a
            # checkpoint before and after the real get(). STDERR is
            # unbuffered, so CP1 shows up even if the call never returns.
            sub get {
                my ($self, @args) = @_;
                print STDERR "CP1: $self called with ", join(',', @args), "\n";
                my $rv = $self->SUPER::get(@args);
                print STDERR "CP2: returning from get\n";
                return $rv;
            }
        }
        # Create our own version of LWP::UserAgent.
        return My::Ua->new(@_);
    }


    When you construct your LWP::UserAgent object, call get_ua() instead. In
    the customized get() method above, you can insert print statements and
    so on, which will tell you the precise URL it's attempting to fetch. (Make
    a note of the URLs and observe whether it's always the same one; that
    would be a key piece of information.)

    Confirm things are as they should be, then follow the path through LWP
    until you get to request(), placing "CPnnn" statements along the way.
    (At that point it's probably just as easy to copy the whole thing over
    and pollute it with print statements.)

    At the end of it all, you'll get to a point where there isn't a "CPnnn"
    printed where you think there ought to be one. At that point, you'll
    have found exactly where it's hanging, and, if you're lucky.. it'll
    be something obvious. :) If not, at least you'll have a good idea what's
    wrong.

    Do NOT modify the source of LWP::UserAgent (or any other module, for that
    matter) directly. Always work on a copy or, if it's more convenient, do a
    custom override as above. Otherwise you'll end up with corrupt modules.

    See also: LWP::Debug

    I've never used it myself, though. Every problem I've ever had was a result of me
    passing the wrong stuff into get/post, so I've never had to go further than what
    I've described above. (I usually subclass LWP::UserAgent at the start
    anyway, in case I want to change its behavior later on in
    program development, e.g. password fetching.)

    The above is just a debugging method I've found useful for "tough cases".
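
As a lighter-weight alternative to the subclass above, later releases of LWP::UserAgent (newer than the one discussed in this thread) document an add_handler() hook that can log each request and response. This is only a sketch under that assumption, not what Jamie describes; the URL is a placeholder.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new(timeout => 60);

# Checkpoint just before the request goes out. Returning an empty list
# tells LWP to carry on and send the request as usual.
$ua->add_handler(request_send => sub {
    my ($request, $agent, $handler) = @_;
    print STDERR "CP1: ", $request->method, " ", $request->uri, "\n";
    return;
});

# Checkpoint once the whole response has been received.
$ua->add_handler(response_done => sub {
    my ($response, $agent, $handler) = @_;
    print STDERR "CP2: ", $response->status_line, "\n";
    return;
});

# Placeholder URL (an assumption); substitute the page that hangs.
my $response = $ua->get('http://www.example.com/');
```

If CP1 appears for a URL and CP2 never does, the stall is somewhere between sending the request and completing the response.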

    Jamie
    Jamie, Mar 23, 2007
    #5
  6. Guest

    Hi Jamie,
    Thanks a lot for your help and advice.
    I've tried your solution, and I see that get() actually receives the
    correct URL. Afterwards, usually after fetching pages for 1-2 hours,
    it hangs. I tried using alarm with 60 seconds (and mapped $SIG{ALRM}
    to a subroutine of my own), but it didn't help. Even Ctrl-C doesn't
    kill the process; only killing it from Task Manager works.
    I'm thinking of running the GET commands in a separate
    process or thread, and if I can't see the results of the GET in
    the main process, I will kill that thread. What do you think?
    , Mar 28, 2007
    #6
  7. Jamie Guest

    >I've tried your solution, and I see that get() actually receives the
    >correct URL. Afterwards, usually after fetching pages for 1-2 hours,
    >it hangs. I tried using alarm with 60 seconds (and mapped $SIG{ALRM}
    >to a subroutine of my own), but it didn't help. Even Ctrl-C doesn't
    >kill the process; only killing it from Task Manager works.


    Does it hang on exactly the same URL each time? The ^C part sort of baffles
    me. On Unix, I would probably examine the process and see if it's
    taking a lot of memory (even then, ^C should work).

    >I'm thinking of running the GET commands in a separate
    >process or thread, and if I can't see the results of the GET in
    >the main process, I will kill that thread. What do you think?


    I suppose that would work. Or fork a new process for each URL: wait
    for the child, process its result, then fork another process for the
    next URL. The trick is to keep resource allocations in a child process, where
    they are cleaned up on exit.

    Long running processes are sort of famous for memory leaks. (usually they get
    progressively slower and slower and eventually just don't work / memory errors)

    The "right way" (IMO) is to find out what's happening, though. (This can be
    really hard to do. Data::Dumper combined with UNIVERSAL::DESTROY will sometimes
    help, but it's just not easy.)

    Doing a fork() is a cheap way around the problem: when the child process
    dies (at least on Unix), its memory is reclaimed. It's more of a band-aid
    than a solution, though. (It's useful if you need to do something you /know/ will
    take a lot of memory; it's the only way I know of to give the memory back when
    done.)
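
The fork-per-URL idea could look roughly like the sketch below. It is written for Unix-like systems, matching the "at least on Unix" caveat above; on Win32, Perl emulates fork() with threads and alarm may not interrupt a blocked read, which fits what the original poster saw. run_in_child() is a hypothetical helper name, not part of LWP.

```perl
use strict;
use warnings;

# Run $job in a forked child and return whatever string it produces, or
# undef if the child has not finished within $timeout seconds.
# (Hypothetical helper, Unix-only sketch.)
sub run_in_child {
    my ($job, $timeout) = @_;

    pipe(my $reader, my $writer) or die "pipe failed: $!";
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {                  # child: do the work, report, exit
        close $reader;
        print {$writer} $job->();
        close $writer;
        exit 0;
    }

    close $writer;                    # parent: wait, but not forever
    my $output = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $timeout;
        local $/;                     # slurp everything the child wrote
        my $data = <$reader>;
        alarm 0;
        $data;
    };
    alarm 0;
    kill 'KILL', $pid unless defined $output;   # give up on a stuck child
    waitpid $pid, 0;                  # reap it so the OS frees its resources
    close $reader;
    return $output;
}
```

In real use the job would be something like `sub { $ua->get($url)->decoded_content }`, so that any memory or sockets held by the fetch go away when the child exits.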

    If it's practical, you might take just the part that GETs the URL, without
    any other code, run that in a loop and see if it hangs. That might tell
    you whether the problem is in LWP, or whether the rest of your code is doing
    something that doesn't cause trouble until the GET.
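
A stripped-down test loop along those lines might look like this. fetch_loop() is a hypothetical helper name, and the timeout value is an arbitrary choice. LWP's timeout option limits how long the agent waits for activity on the connection; whether it catches a stall inside the SSL layer depends on the SSL modules installed, so treat it as something to try rather than a guaranteed fix.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Fetch $url $count times with one agent and return the number of failed
# requests. (Hypothetical helper, for isolating the hang only.)
sub fetch_loop {
    my ($ua, $url, $count) = @_;
    my $failures = 0;
    local $| = 1;    # unbuffered, so the last line printed marks the stall
    for my $i (1 .. $count) {
        print "CP $i: fetching $url\n";
        my $response = $ua->get($url);
        print "CP $i: ", $response->status_line, "\n";
        $failures++ unless $response->is_success;
    }
    return $failures;
}

# Give up on a silent server after 60 seconds instead of waiting forever.
my $ua = LWP::UserAgent->new(timeout => 60);
```

Calling `fetch_loop($ua, $url, 1000)` with the real https URL and nothing else in the script shows whether the bare GET hangs on its own.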

    I don't know enough about windows to understand the rest of the story,
    could be most anything.. sockets not being closed? handles? collecting
    a boatload of UserAgent objects some place?


    Jamie
    Jamie, Mar 28, 2007
    #7
  8. Guest

    Hi Jamie,

    I noticed that the perl process uses more and more memory (up to 300 MB
    and more) over time. I did some investigation on the web and found out
    that this might happen because the Mechanize object saves each visited
    page so that the back() method is possible. I now suspect
    that this, rather than some other issue, is causing the trouble. I'm trying
    my best to find out how to overcome it; so far I haven't found
    any way to disable this page saving.
    Have you come across such behavior in mech?
    Since I'm logged in to the website, it's not reasonable for me to
    re-create the object each time...
    I'll have to keep thinking about it.
    Happy holiday and thanks!
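
For the page history described above, the WWW::Mechanize documentation describes a stack_depth setting that caps how many visited pages are kept for back(). A minimal sketch, assuming the Mechanize object is a WWW::Mechanize and that the installed release supports stack_depth; the depths used here are arbitrary.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Keep at most 5 pages of history instead of every page ever visited.
# autocheck => 0 makes failed requests return a response rather than die.
my $mech = WWW::Mechanize->new(stack_depth => 5, autocheck => 0);

# The limit can also be changed on an existing object, so a logged-in
# session does not have to be re-created.
$mech->stack_depth(1);
```

Because the setting lives on the object, the login cookies and session stay in place while the history is trimmed.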
    , Apr 10, 2007
    #8