LWP and threads: anything to look out for?

Discussion in 'Perl Misc' started by John Bokma, Oct 12, 2005.

  1. John Bokma

    John Bokma Guest

    I want to move from Parallel UserAgent to using threads. Are there things
    I should be aware of, or is this a piece of cake?

    (Another option might be forking since IIRC Windows does this using
    threads).

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    I ploink googlegroups.com :)
     
    John Bokma, Oct 12, 2005
    #1

  2. zentara

    zentara Guest

    On 12 Oct 2005 06:53:26 GMT, John Bokma wrote:

    >I want to move from Parallel UserAgent to using threads. Are there things
    >I should be aware of, or is this a piece of cake?
    >
    >(Another option might be forking since IIRC Windows does this using
    >threads).


    The only good reason to use threads is if you want an easy way to share
    real-time data between threads: progress data for a progress indicator,
    say, or when you want to run a regex over the downloaded results and do
    something in your main program based on the match.

    Otherwise it is probably more efficient and easier to just fork. There
    is Parallel::ForkManager.

    But threads are pretty easy once you get the hang of the little details.
    One thing to watch out for: if you call "exec" from any thread, it
    replaces the whole process, killing every running thread.
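
    To illustrate the exec caveat, here is a minimal sketch, assuming a perl
    built with thread support, of running an external command from a worker
    thread with system() instead of exec(). The helper name run_in_thread is
    made up for this example.

```perl
use strict;
use warnings;
use threads;

# Run an external command from a worker thread without replacing the process.
# system() starts a child process and waits for it; exec() would replace the
# whole process, taking every other thread with it.
# run_in_thread is a hypothetical helper name, not part of any module.
sub run_in_thread {
    my (@cmd) = @_;
    my $thr = threads->create(sub {
        my $status = system(@cmd);
        # -1 means the command could not be started; otherwise the child's
        # exit code is in the high byte of the status word.
        return $status == -1 ? -1 : $status >> 8;
    });
    return $thr->join;
}
```

    Calling run_in_thread($^X, '-e', 'exit 3') should hand back the child's
    exit code while the other threads keep running.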



    --
    I'm not really a human, but I play one on earth.
    http://zentara.net/japh.html
     
    zentara, Oct 12, 2005
    #2

  3. John Bokma

    Guest

    John Bokma wrote:
    > I want to move from Parallel UserAgent to using threads.


    Why?

    > Are there
    > things I should be aware of, or is this a piece of cake?


    I don't think you will have a problem specific to LWP as long as you don't try
    to share/clone LWP objects across threads. Of course, threaded programming
    in general is more difficult than non-threaded, so I wouldn't say it will
    be a piece of cake, unless you are already experienced in threaded
    programming.

    > (Another option might be forking since IIRC Windows does this using
    > threads).


    Maybe forking is better. I prefer it over threading in most circumstances.
    (It seems to be a theme this week). But it is hard to tell without knowing
    more about what you are trying to do.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Oct 12, 2005
    #3
  4. John Bokma

    John Bokma Guest

    wrote:

    > John Bokma wrote:
    >> I want to move from Parallel UserAgent to using threads.

    >
    > Why?


    Good question :) I think that ParallelUA = UA + threading, and doesn't
    add anything, and since UA is more of a core module, I prefer the latter.

    Also, the ParallelUA documentation was a bit unclear to me.

    >> Are there
    >> things I should be aware of, or is this a piece of cake?

    >
    > I don't think you will have a problem specific to LWP as long as you
    > don't try to share/clone LWP objects across threads. Of course,
    > threaded programming in general is more difficult than non-threaded,
    > so I wouldn't say it will be a piece of cake, unless you are already
    > experienced in threaded programming.


    Java, and enough CGI experience :) (sharing resources, locking, etc).

    >> (Another option might be forking since IIRC Windows does this using
    >> threads).

    >
    > Maybe forking is better. I prefer it over threading in most
    > circumstances. (It seems to be a theme this week). But it is hard to
    > tell without knowing more about what you are trying to do.


    I want to have n workers in parallel, each getting a request from a
    queue, fetching the page, storing the result, and moving on to the next,
    sleeping (not wasting CPU cycles) in between fetches.
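
    The setup described above could be sketched roughly as follows, assuming
    a perl built with thread support and Thread::Queue 3.01 or later (for
    end()). The names fetch_page and run_workers are invented for this
    example, and fetch_page is only a stub: a real worker would create its
    own LWP::UserAgent inside the thread rather than share one.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

# Hypothetical stand-in for a real fetch. A real worker would build its own
# LWP::UserAgent inside the thread and issue the request here.
sub fetch_page {
    my ($url) = @_;
    return "content of $url";
}

# Start $n_workers threads that drain a queue of URLs, pausing $delay whole
# seconds after each fetch. Returns a hash ref of url => body.
sub run_workers {
    my ($n_workers, $delay, @urls) = @_;

    my $queue = Thread::Queue->new(@urls);
    $queue->end;    # nothing more will be added; dequeue returns undef once drained

    my %results :shared;
    my @workers;
    for (1 .. $n_workers) {
        push @workers, threads->create(sub {
            while (defined(my $url = $queue->dequeue)) {
                my $body = fetch_page($url);
                {
                    lock(%results);            # serialize writes to the shared hash
                    $results{$url} = $body;
                }
                sleep $delay if $delay;        # blocks without spinning the CPU
            }
        });
    }
    $_->join for @workers;

    my %copy = %results;    # plain, unshared copy for the caller
    return \%copy;
}
```

    This version fills the queue up front. If fetched pages add new URLs,
    the end() call would have to move to wherever the program knows no more
    work is coming.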

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    I ploink googlegroups.com :)
     
    John Bokma, Oct 12, 2005
    #4
  5. John Bokma

    Guest

    John Bokma wrote:
    > wrote:
    >
    > > John Bokma wrote:
    > >> I want to move from Parallel UserAgent to using threads.

    > >
    > > Why?

    >
    > Good question :) I think that ParallelUA = UA + threading, and doesn't
    > add anything, and since UA is more a core module, I prefer the latter.


    I think ParallelUA = UA + nonblocking IO, rather than threading. Assuming
    it is well implemented (I haven't used ParallelUA enough to know), I think
    non-blocking IO is better than threads for this task.

    > Also, with ParallelUA the documentation was a bit unclear to me.


    OK, fair enough. Anything in particular you found unclear?

    >
    > >> (Another option might be forking since IIRC Windows does this using
    > >> threads).

    > >
    > > Maybe forking is better. I prefer it over threading in most
    > > circumstances. (It seems to be a theme this week). But it is hard to
    > > tell without knowing more about what you are trying to do.

    >
    > I want to have n workers in parallel, each getting a request from a
    > Queue, fetching the page, storing the result, and next.


    Store the results on the filesystem or DB, or in Perl memory?

    Is the queue dynamically added to (based on the results returned from
    earlier tasks in the queue) or is it built in a start-up phase and then
    only consumed from then on?

    If the queue is dynamically added to, that argues for threads. If each
    page-fetch takes less than 1/20 of a second or so (and there are tens of
    thousands of them), that argues for threads (although I might instead
    just batch them up into chunks of several page fetches). Otherwise, I'd
    go with forking with Parallel::ForkManager (or ParallelUA :) ).
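
    As a rough sketch of the batching idea, here is one way to split the
    work into chunks and hand each chunk to a forked child, using only
    core fork/waitpid. It assumes the results go to the filesystem, since
    forked children do not share Perl memory with the parent. The names
    chunk, fetch_page_stub and fetch_in_children, and the chunk-N.txt file
    layout, are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a real fetch (e.g. an LWP::UserAgent request).
sub fetch_page_stub {
    my ($url) = @_;
    return "content of $url";
}

# Split @items into array refs of at most $size elements each.
sub chunk {
    my ($size, @items) = @_;
    die "chunk size must be at least 1" unless $size >= 1;
    my @chunks;
    push @chunks, [ splice @items, 0, $size ] while @items;
    return @chunks;
}

# Fork one child per chunk; each child writes "url<TAB>body" lines to its own
# file in $out_dir. Returns the number of chunks processed.
sub fetch_in_children {
    my ($out_dir, $chunk_size, @urls) = @_;
    my @chunks = chunk($chunk_size, @urls);
    my @pids;
    for my $i (0 .. $#chunks) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {    # child process
            open my $fh, '>', "$out_dir/chunk-$i.txt" or die "open: $!";
            print {$fh} $_, "\t", fetch_page_stub($_), "\n" for @{ $chunks[$i] };
            close $fh or die "close: $!";
            exit 0;
        }
        push @pids, $pid;   # parent remembers the child
    }
    waitpid($_, 0) for @pids;
    return scalar @chunks;
}
```

    Note that this starts every child at once, with no cap on concurrency.
    Limiting the number of simultaneous children is the part
    Parallel::ForkManager takes care of.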

    > Sleeping (not
    > wasting CPU cycles) in between each fetch.


    This part I'm not sure of. Why sleep rather than just fetch the next
    item from the queue? Are you sleeping only in the case of an empty queue
    (which of course only makes sense if the queue is dynamic)? Or to avoid
    overloading the remote server(s) you are fetching from?

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    Usenet Newsgroup Service $9.95/Month 30GB
     
    , Oct 12, 2005
    #5
