Threading NOT working as expected

Discussion in 'Perl Misc' started by Ted, Feb 25, 2008.

  1. Ted

    Ted Guest

    When I first tried creating perl threads, the main process ended after
    the threads were created but before any of them really started. On
    reading further, I saw that I had to join the threads so that the main
    process would sit idle waiting for the threads to finish. So I added
    statements to join each thread. But now, it looks like the
    consequence of this is that the code in each thread is executed one
    after the other, as if it was a single process rather than a set of
    independently executing threads. I had thought of joining only the
    last created thread, but there is no guarantee that the last thread
    will take the longest time to complete. So how do I create these
    threads and guarantee that they will execute in parallel, and that the
    main process will wait idle until all have finished? I am trying to
    use a script to manage this analysis since there may be, in any given
    batch, several dozen SQL scripts that need to be executed (each is
    independent, of course, with no possibility of interacting with the
    others), and I want to run these scripts by invoking a single perl
    script that allows them to run in parallel making full use of all the
    available computing resources.

    Thanks

    Ted
    Ted, Feb 25, 2008
    #1

  2. Ted <> writes:

    > When I first tried creating perl threads, the main process ended after
    > the threads were created but before any of them really started. On
    > reading further, I saw that I had to join the threads so that the main
    > process would sit idle waiting for the threads to finish. So I added
    > statements to join each thread. But now, it looks like the
    > consequence of this is that the code in each thread is executed one
    > after the other, as if it was a single process rather than a set of
    > independently executing threads.


    It shouldn't.

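    #!/usr/local/bin/perl -w
    use strict;
    use threads;

    # Start all eleven threads first; each begins running as soon as it is created.
    my @thrds = map {
        my $i = $_;
        threads->new(sub {
            print "started $i\n";
            sleep 2;
            print "stopped $i\n";
        });
    } 0 .. 10;

    # Only then wait for them, so the joins do not serialize the work.
    $_->join for @thrds;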

    output:
    started 0
    started 1
    started 2
    started 3
    started 4
    started 5
    started 6
    started 7
    started 8
    started 9
    started 10
    stopped 0
    stopped 1
    stopped 2
    stopped 3
    stopped 4
    stopped 5
    stopped 6
    stopped 7
    stopped 8
    stopped 9
    stopped 10
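
    Ted's code is not posted, so the following is only a guess at the cause. One way to get the one-after-the-other behaviour is to join each thread immediately after creating it. This minimal sketch times that pattern against create-everything-then-join; the `task` sub and the 0.3 second sleep are placeholders for real work.

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time sleep);

# Hypothetical stand-in for real work: just wait 0.3 seconds.
sub task { sleep 0.3; return 1 }

# Pattern 1: join right after create. Each thread must finish before
# the next one is even started, so the work is effectively sequential.
my $t0 = time;
threads->create(\&task)->join for 1 .. 3;
my $serial = time - $t0;

# Pattern 2: create every thread first, then join them all.
# The waits overlap, so the total is close to one task's duration.
$t0 = time;
my @thr = map { threads->create(\&task) } 1 .. 3;
$_->join for @thr;
my $parallel = time - $t0;

printf "join-in-loop: %.2f s, create-then-join: %.2f s\n", $serial, $parallel;
```

    If the first figure is roughly three times the second on your machine, the joins were the serializing factor rather than the threads themselves.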



    --
    Joost Diepenmaat | blog: http://joost.zeekat.nl/ | work: http://zeekat.nl/
    Joost Diepenmaat, Feb 25, 2008
    #2

  3. Ted

    Guest

    Ted <> wrote:
    > When I first tried creating perl threads, the main process ended after
    > the threads were created but before any of them really started. On
    > reading further, I saw that I had to join the threads so that the main
    > process would sit idle waiting for the threads to finish. So I added
    > statements to join each thread. But now, it looks like the
    > consequence of this is that the code in each thread is executed one
    > after the other, as if it was a single process rather than a set of
    > independently executing threads.


    The problem is in line 42.

    Xho

    --
    -------------------- http://NewsReader.Com/ --------------------
    The costs of publication of this article were defrayed in part by the
    payment of page charges. This article must therefore be hereby marked
    advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    this fact.
    , Feb 25, 2008
    #3
  4. Ted

    Ted Guest

    On Feb 25, 12:30 pm, wrote:
    > Ted <> wrote:
    > > When I first tried creating perl threads, the main process ended after
    > > the threads were created but before any of them really started.  On
    > > reading further, I saw that I had to join the threads so that the main
    > > process would sit idle waiting for the threads to finish.  So I added
    > > statements to join each thread.  But now, it looks like the
    > > consequence of this is that the code in each thread is executed one
    > > after the other, as if it was a single process rather than a set of
    > > independently executing threads.

    >
    > The problem is in line 42.
    >
    > Xho
    >
    > --
    > --------------------http://NewsReader.Com/--------------------
    > The costs of publication of this article were defrayed in part by the
    > payment of page charges. This article must therefore be hereby marked
    > advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate
    > this fact.


    Will calling 'system' within a thread block all other threads in the
    process until it has returned? If so, then that may be where my
    problem lies?

    If not, then I am puzzled.
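
    A quick way to check this yourself, sketched under assumptions: each thread below calls `system` on a child process that only sleeps for one second. `run_job` is a made-up helper, and `$^X` (the running perl) stands in for whatever external command you really launch. If `system` blocked the other threads, three jobs would need about three seconds.

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time);

# Hypothetical job: shell out to a child process that sleeps one second.
# $^X is the path of the running perl, so no other program is assumed.
sub run_job {
    my ($id) = @_;
    my $status = system($^X, '-e', 'sleep 1');
    return $status;    # 0 means the child exited successfully
}

my $t0 = time;
# Start all three jobs before waiting on any of them.
my @workers  = map { threads->create(\&run_job, $_) } 1 .. 3;
my @statuses = map { $_->join } @workers;
my $elapsed  = time - $t0;

printf "3 jobs finished in %.1f s\n", $elapsed;
```

    On a typical Unix perl built with thread support I would expect the total to be close to one second, not three, but measure it on your own platform before relying on that.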

    Thanks

    Ted
    Ted, Feb 25, 2008
    #4
  5. Willem

    Willem Guest

    Ted wrote:
    ) When I first tried creating perl threads, the main process ended after
    ) the threads were created but before any of them really started. On
    ) reading further, I saw that I had to join the threads so that the main
    ) process would sit idle waiting for the threads to finish. So I added
    ) statements to join each thread. But now, it looks like the
    ) consequence of this is that the code in each thread is executed one
    ) after the other, as if it was a single process rather than a set of
    ) independently executing threads.

    How do you know this ? Have you tested this thoroughly ?

    Note that if you run multiple threads, that one thread will be running
    at a time, and the OS will switch to the next thread every so often.

    So if you do a trivial task in one thread, then, yes, it's very likely that
    the operating system won't have a chance to switch to other tasks before it
    completes, effectively completing one task after another.


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Feb 25, 2008
    #5
  6. Ted

    Ted Guest

    On Feb 25, 2:11 pm, Willem <> wrote:
    > Ted wrote:
    >
    > ) When I first tried creating perl threads, the main process ended after
    > ) the threads were created but before any of them really started.  On
    > ) reading further, I saw that I had to join the threads so that the main
    > ) process would sit idle waiting for the threads to finish.  So I added
    > ) statements to join each thread.  But now, it looks like the
    > ) consequence of this is that the code in each thread is executed one
    > ) after the other, as if it was a single process rather than a set of
    > ) independently executing threads.
    >
    > How do you know this ?  Have you tested this thoroughly ?
    >
    > Note that if you run multiple threads, that one thread will be running
    > at a time, and the OS will switch to the next thread every so often.
    >
    > So if you do a trivial task in one thread, then, yes, it's very likely that
    > the operating system won't have a chance to switch to other tasks before it
    > completes, effectively completing one task after another.
    >
    > SaSW, Willem
    > --
    > Disclaimer: I am in no way responsible for any of the statements
    >             made in the above text. For all I know I might be
    >             drugged or something..
    >             No I'm not paranoid. You all think I'm paranoid, don't you !
    > #EOT


    Willem,

    These scripts that are being launched by the perl script threads
    typically take several hours to complete.

    Yes, I have done enough multithreaded programming (in C++ and Java) to
    know that on a single core single processor machine, only one thread
    runs at a time. There is, then, little advantage, for most of my
    programming to do multithreaded development. However, my present
    development machine has a dual core processor, and the server I'm
    working with has a quad core processor. So, on my own machine, two
    threads ought to be running concurrently and on the server, that would
    be four concurrent threads.

    Thanks

    Ted
    Ted, Feb 25, 2008
    #6
  7. Willem

    Willem Guest

    Ted wrote:
    ) Willem,
    )
    ) These scripts that are being launched by the perl script threads
    ) typically take several hours to complete.
    )
    ) Yes, I have done enough multithreaded programming (in C++ and Java) to
    ) know that on a single core single processor machine, only one thread
    ) runs at a time. There is, then, little advantage, for most of my
    ) programming to do multithreaded development. However, my present
    ) development machine has a dual core processor, and the server I'm
    ) working with has a quad core processor. So, on my own machine, two
    ) threads ought to be running concurrently and on the server, that would
    ) be four concurrent threads.

    Oh, I see. Sorry for jumping to conclusions.

    I'll read through the rest of the thread to see if I have some
    more useful insights.


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Feb 25, 2008
    #7
  8. Willem

    Willem Guest

    Willem wrote:
    ) I'll read through the rest of the thread to see if I have some
    ) more useful insights.

    I see you got your answer already in the other thread. ^_^


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
    Willem, Feb 25, 2008
    #8
  9. Ted Zlatanov

    Ted Zlatanov Guest

    On Mon, 25 Feb 2008 12:05:09 -0800 (PST) Ted <> wrote:

    T> Yes, I have done enough multithreaded programming (in C++ and Java) to
    T> know that on a single core single processor machine, only one thread
    T> runs at a time. There is, then, little advantage, for most of my
    T> programming to do multithreaded development. However, my present
    T> development machine has a dual core processor, and the server I'm
    T> working with has a quad core processor. So, on my own machine, two
    T> threads ought to be running concurrently and on the server, that would
    T> be four concurrent threads.

    This is incorrect. A modern single processor will perform well in a
    multithreaded application. Just because a single thread will run
    doesn't mean a single thread is doing work at any time.

    Most of the time in a modern system is spent waiting for I/O and memory
    access. There are some very special cases where the CPU is actually
    tied up while the application runs, but memory and disk speeds have
    fallen far behind CPU speeds so the CPU will usually be waiting for
    something to happen. This is why modern CPUs have ridiculously large L1
    and L2 caches and many prefetching optimizations.

    In a multithreaded setup, the CPU has a chance to run several threads
    while memory and I/O fetches are happening.

    Equating threads with number of processors is both inefficient and
    misguided. Let the OS worry about scheduling resources, processes, and
    threads. Just write your code to use as many threads as it absolutely
    needs.

    Ted
    Ted Zlatanov, Feb 26, 2008
    #9
  10. On 2008-02-26 16:39, Ted Zlatanov <> wrote:
    > On Mon, 25 Feb 2008 12:05:09 -0800 (PST) Ted <> wrote:

    [number of threads should equal number of CPU cores]

    > This is incorrect. A modern single processor will perform well in a
    > multithreaded application. Just because a single thread will run
    > doesn't mean a single thread is doing work at any time.
    >
    > Most of the time in a modern system is spent waiting for I/O and memory
    > access. There are some very special cases where the CPU is actually
    > tied up while the application runs, but memory and disk speeds have
    > fallen far behind CPU speeds so the CPU will usually be waiting for
    > something to happen. This is why modern CPUs have ridiculously large L1
    > and L2 caches and many prefetching optimizations.
    >
    > In a multithreaded setup, the CPU has a chance to run several threads
    > while memory and I/O fetches are happening.


    Multithreading won't help for memory fetches. Firstly because CPUs don't
    have a way to inform the OS of a slow memory access, and secondly
    because the overhead of switching to a different thread would be much
    too high for such a (relatively) short wait. There is one exception:
    So-called multi-threading CPUs can keep the state of a fixed (and
    usually low) number of threads on the CPU and switch between them. But
    these are really just multi-core CPUs which share some of their units.

    You are completely right about I/O of course.

    hp
    Peter J. Holzer, Mar 1, 2008
    #10
  11. On Sat, 01 Mar 2008 17:04:08 +0100, Peter J. Holzer wrote:

    > On 2008-02-26 16:39, Ted Zlatanov <> wrote:
    >> On Mon, 25 Feb 2008 12:05:09 -0800 (PST) Ted <>
    >> wrote:

    > [number of threads should equal number of CPU cores]
    >
    >> This is incorrect. A modern single processor will perform well in a
    >> multithreaded application. Just because a single thread will run
    >> doesn't mean a single thread is doing work at any time.
    >>
    >> Most of the time in a modern system is spent waiting for I/O and memory
    >> access. There are some very special cases where the CPU is actually
    >> tied up while the application runs, but memory and disk speeds have
    >> fallen far behind CPU speeds so the CPU will usually be waiting for
    >> something to happen. This is why modern CPUs have ridiculously large
    >> L1 and L2 caches and many prefetching optimizations.
    >>
    >> In a multithreaded setup, the CPU has a chance to run several threads
    >> while memory and I/O fetches are happening.

    >
    > Multithreading won't help for memory fetches. Firstly because CPUs don't
    > have a way to inform the OS of a slow memory access, and secondly
    > because the overhead of switching to a different thread would be much
    > too high for such a (relatively) short wait. There is one exception:
    > So-called multi-threading CPUs can keep the state of a fixed (and
    > usually low) number of threads on the CPU and switch between them. But
    > these are really just multi-core CPUs which share some of their units.
    >
    > You are completely right about I/O of course.


    Not even. If all those I/Os are going to the same disk, you run the risk
    of thrashing, and overall performance goes down instead of up.

    Exactly the same behaviour can be seen with processes. Suppose you have a
    bunch of files, together much larger than available memory. These files
    are input to a program that handles one file and writes another output
    file. You can do either:

    for f in *; do program "$f" "$f.out" & done; wait

    or

    for f in *; do program "$f" "$f.out"; done;

    If the program is I/O bound, I expect the second version to be faster
    than the first, although it depends on a lot of things.

    So think, design, and profile, profile, profile.

    M4
    Martijn Lievaart, Mar 1, 2008
    #11
  12. On 2008-03-01 17:09, Martijn Lievaart <> wrote:
    > On Sat, 01 Mar 2008 17:04:08 +0100, Peter J. Holzer wrote:
    >
    >> On 2008-02-26 16:39, Ted Zlatanov <> wrote:
    >>> On Mon, 25 Feb 2008 12:05:09 -0800 (PST) Ted <>
    >>> wrote:

    >> [number of threads should equal number of CPU cores]
    >>
    >>> This is incorrect. A modern single processor will perform well in a
    >>> multithreaded application. Just because a single thread will run
    >>> doesn't mean a single thread is doing work at any time.

    [...]
    >>> In a multithreaded setup, the CPU has a chance to run several threads
    >>> while memory and I/O fetches are happening.

    >>
    >> Multithreading won't help for memory fetches.

    [...]
    >> You are completely right about I/O of course.

    >
    > Not even. If all those I/Os are going to the same disk, you run the risk
    > of thrashing, and overall performance goes down instead of up.


    Of course. There are few problems which can be parallelized infinitely.
    At some point further parallelization degrades performance instead of
    improving it.
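
    One common way to keep the degree of parallelism bounded is a fixed pool of workers pulling jobs from a shared queue. The sketch below uses the core Thread::Queue module; the limit of 4 workers and the `job_N.sql` names are placeholders, and the worker only records each job where a real one would run it.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $max_workers = 4;                              # assumed limit; tune it by measuring
my @scripts     = map { "job_$_.sql" } 1 .. 12;   # placeholder job names

# Load every job into the queue before any worker starts.
my $queue = Thread::Queue->new(@scripts);

sub worker {
    my @done;
    # dequeue_nb returns undef once the queue is empty, which ends the loop.
    while (defined(my $script = $queue->dequeue_nb)) {
        # A real worker would launch the script here, e.g. via system().
        push @done, $script;
    }
    return @done;
}

# Never more than $max_workers jobs are in flight at once.
my @pool     = map { threads->create(\&worker) } 1 .. $max_workers;
my @finished = map { $_->join } @pool;

printf "%d of %d jobs processed by %d workers\n",
    scalar @finished, scalar @scripts, $max_workers;
```

    Changing `$max_workers` and timing a full batch is then a cheap way to find where extra parallelism stops paying off.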

    > Exactly the same behaviour can be seen with processes. Suppose you have a
    > bunch of files, together much larger than available memory. These files
    > are input to a program that handles one file and writes another output
    > file. You can do either:
    >
    > for f in *; do program "$f" "$f.out" & done; wait
    >
    > or
    >
    > for f in *; do program "$f" "$f.out"; done;
    >
    > If the program is I/O bound, I expect the second version to be faster
    > than the first, although it depends on a lot of things.


    One of the things it depends on is size and placement of the files on
    disk. If the files are large and stored (mostly) contiguously, the
    second version is almost certainly faster. But if they are small and
    scattered all over the disk, the first version may be faster because it
    allows the kernel (or even the disk) to decide on the order in which it
    reads these files.

    > So think, design, and profile, profile, profile.


    Full ack.

    hp
    Peter J. Holzer, Mar 1, 2008
    #12
  13. Ted Zlatanov

    Ted Zlatanov Guest

    On Sat, 1 Mar 2008 18:09:32 +0100 Martijn Lievaart <> wrote:

    ML> On Sat, 01 Mar 2008 17:04:08 +0100, Peter J. Holzer wrote:
    >> On 2008-02-26 16:39, Ted Zlatanov <> wrote:
    >>> In a multithreaded setup, the CPU has a chance to run several threads
    >>> while memory and I/O fetches are happening.

    >>
    >> Multithreading won't help for memory fetches. Firstly because CPUs don't
    >> have a way to inform the OS of a slow memory access, and secondly
    >> because the overhead of switching to a different thread would be much
    >> too high for such a (relatively) short wait. There is one exception:
    >> So-called multi-threading CPUs can keep the state of a fixed (and
    >> usually low) number of threads on the CPU and switch between them. But
    >> these are really just multi-core CPUs which share some of their units.
    >>
    >> You are completely right about I/O of course.


    ML> Not even. If all those I/Os are going to the same disk, you run the risk
    ML> of thrashing, and overall performance goes down instead of up.

    That's very application-dependent. Note my original statement: the CPU
    has a chance to run several threads while fetches are happening. It
    doesn't mean that I/O will work better that way, but that's not a
    multithreading problem. The elevator algorithm introduced fairly
    recently in Linux kernels (IIRC) addresses this kind of I/O contention
    in the right place, outside the application.

    Ted
    Ted Zlatanov, Mar 3, 2008
    #13