How many threads?

Discussion in 'Java' started by Roedy Green, Jul 31, 2008.

  1. Roedy Green

    Roedy Green Guest

    I am about to do some multithreading code to get some parallelism when
    waiting for Internet links to respond. The question comes How many
    threads? Well it depend on too many things. How much ram, how fast the
    connections, what else is going on in the machine.

    The optimal choice could change over time, even within a single run.

    What I need is some sort of homing mechanism that homes in on the
    optimal number. Surely this is a common problem. Are there classes to
    handle this automatically?
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Jul 31, 2008
    #1
    1. Advertising

  2. Roedy Green

    shakah Guest

    On Jul 30, 7:28 pm, Roedy Green <>
    wrote:
    > I am about to do some multithreading code to get some parallelism when
    > waiting for Internet links to respond. The question comes How many
    > threads? Well it depend on too many things. How much ram, how fast the
    > connections, what else is going on in the machine.
    >
    > The optimal choice could change over time, even within a single run.
    >
    > What I need is some sort of homing mechanism that homes in on the
    > optimal number. Surely this is a common problem. Are there classes to
    > handle this automatically?
    > --
    >
    > Roedy Green Canadian Mind Products
    > The Java Glossaryhttp://mindprod.com


    I think that your problem calls for a blocking-queue/thread-pool
    solution, with possibly an upper-bound to the number of threads --
    IMHO, you're unlikely to exhaust a local resource (e.g. CPU or memory,
    assuming a reasonably up-to-date box) given the amount of idle time
    you'll encounter waiting for I/O while making asynch requests over the
    Internet.
     
    shakah, Jul 31, 2008
    #2
    1. Advertising

  3. Roedy Green

    Mark Space Guest

    shakah wrote:
    > On Jul 30, 7:28 pm, Roedy Green <>
    > wrote:


    >> The optimal choice could change over time, even within a single run.
    >>


    >
    > I think that your problem calls for a blocking-queue/thread-pool
    > solution, with possibly an upper-bound to the number of threads --
    > IMHO, you're unlikely to exhaust a local resource (e.g. CPU or memory,
    > assuming a reasonably up-to-date box) given the amount of idle time
    > you'll encounter waiting for I/O while making asynch requests over the
    > Internet.


    Besides making use of the API in java.util.concurrent, I think you've
    identified one of the requirements yourself in the line I quoted above:
    change.

    Think about designing for change, and which design patterns support high
    change situations. The Strategy Pattern, for example. If you make your
    "threading strategy" something that can be configured on the fly, then I
    think you might have more ability to change things as the code (and the
    user requirements) evolve.

    You can start off with an ExecutorService from java.util.concurrent like
    ThreadPoolExecutor, and then as you learn more substitute your own
    custom, well-tuned implementation.
     
    Mark Space, Jul 31, 2008
    #3
  4. Roedy Green

    Arne Vajhøj Guest

    Roedy Green wrote:
    > I am about to do some multithreading code to get some parallelism when
    > waiting for Internet links to respond. The question comes How many
    > threads? Well it depend on too many things. How much ram, how fast the
    > connections, what else is going on in the machine.
    >
    > The optimal choice could change over time, even within a single run.
    >
    > What I need is some sort of homing mechanism that homes in on the
    > optimal number. Surely this is a common problem. Are there classes to
    > handle this automatically?


    I think standard practice is no make it configurable to allow
    the admin to test and configure what is optimal for that site.

    That is of course a Java EE way of thinking.

    For your app where the actually work is relative small compared
    to how long it will block, then you should go for a pretty
    big number of threads. As many as the OS will still perform
    well with.

    Arne
     
    Arne Vajhøj, Aug 1, 2008
    #4
  5. Roedy Green

    Mark Space Guest

    Arne Vajhøj wrote:

    >
    > I think standard practice is no make it configurable to allow
    > the admin to test and configure what is optimal for that site.
    >
    > That is of course a Java EE way of thinking.


    I don't think it's just "Java EE". I've seen C++ programmers go through
    complex gyrations to implement something similar to Java's reflection in
    C++, just because the Strategy Pattern and Data Driven Design are that
    important. It something that should be considered for almost every
    design. Change is the most fundamental, common aspect of all software
    engineering, after all.
     
    Mark Space, Aug 1, 2008
    #5
  6. Roedy Green

    Arne Vajhøj Guest

    Mark Space wrote:
    > Arne Vajhøj wrote:
    >> I think standard practice is no make it configurable to allow
    >> the admin to test and configure what is optimal for that site.
    >>
    >> That is of course a Java EE way of thinking.

    >
    > I don't think it's just "Java EE". I've seen C++ programmers go through
    > complex gyrations to implement something similar to Java's reflection in
    > C++, just because the Strategy Pattern and Data Driven Design are that
    > important. It something that should be considered for almost every
    > design. Change is the most fundamental, common aspect of all software
    > engineering, after all.


    It is not just Java EE, but it requires a context where there are
    somebody that knows about config files, threads and performance testing.

    For a Java EE server there should be somebody that knows that.

    For the desktop app used by Joe Average in Smalltown it is often
    not the case.

    Not that Java has much market share in that space anyway.

    Arne
     
    Arne Vajhøj, Aug 1, 2008
    #6
  7. Roedy Green wrote:
    > I am about to do some multithreading code to get some parallelism when
    > waiting for Internet links to respond. The question comes How many
    > threads? Well it depend on too many things. How much ram, how fast the
    > connections, what else is going on in the machine.
    >
    > The optimal choice could change over time, even within a single run.
    >
    > What I need is some sort of homing mechanism that homes in on the
    > optimal number. Surely this is a common problem. Are there classes to
    > handle this automatically?


    You you really need many threads? How about using nio and poll to have a
    single thread manage many connections at once? This way it could be
    possible just enqueue the request you need to be done and handle many of
    them in parallel with a single (or few) threads.

    But hundreds is often no problem. I have written a engine that emulates
    user requests against a server. A "user" is one of many javascript that
    can do (among other things) open(url), openLink(id), openLink(name),
    sleep(time) etc. Assuming the server takes tens to hundreds of
    milliseconds to respond I can usually run several hundreds of threds at
    once. Thousands of threads seems to be fine if the response time goes
    above one or a few seconds.

    Note: I have run this on Dual and Quad core Intel and AMD on Linux, not
    Windows, so there could be a difference.

    --
    Roger Lindsjö
     
    Roger Lindsjö, Aug 3, 2008
    #7
  8. Roedy Green

    Tom Anderson Guest

    On Sun, 3 Aug 2008, Roger Lindsjö wrote:

    > Roedy Green wrote:
    >> I am about to do some multithreading code to get some parallelism when
    >> waiting for Internet links to respond. The question comes How many
    >> threads? Well it depend on too many things. How much ram, how fast the
    >> connections, what else is going on in the machine.

    >
    > But hundreds is often no problem.
    >
    > Note: I have run this on Dual and Quad core Intel and AMD on Linux, not
    > Windows, so there could be a difference.


    I believe that linux's threading is much, much faster than windows'. But a
    few hundred threads could still be fine on windows.

    tom

    --
    Computation is the basis of all life
     
    Tom Anderson, Aug 4, 2008
    #8
  9. Roedy Green

    Christian Guest

    Tom Anderson schrieb:
    > On Sun, 3 Aug 2008, Roger Lindsjö wrote:
    >
    >> Roedy Green wrote:
    >>> I am about to do some multithreading code to get some parallelism when
    >>> waiting for Internet links to respond. The question comes How many
    >>> threads? Well it depend on too many things. How much ram, how fast the
    >>> connections, what else is going on in the machine.

    >>
    >> But hundreds is often no problem.
    >>
    >> Note: I have run this on Dual and Quad core Intel and AMD on Linux,
    >> not Windows, so there could be a difference.

    >
    > I believe that linux's threading is much, much faster than windows'. But
    > a few hundred threads could still be fine on windows.
    >
    > tom
    >


    Does this believe have any base? Or is it just the usual Unix/Windows flame?
     
    Christian, Aug 4, 2008
    #9
  10. Roedy Green

    Roedy Green Guest

    On Sun, 03 Aug 2008 20:55:17 +0200, Roger Lindsjö
    <> wrote, quoted or indirectly quoted
    someone who said :

    >You you really need many threads? How about using nio and poll to have a
    >single thread manage many connections at once? This way it could be
    >possible just enqueue the request you need to be done and handle many of
    >them in parallel with a single (or few) threads.


    There are two issues:

    1. the general one of multithread apps that don't know the optimal
    number of threads, and would like some general mechanism to optimise
    it dynamically to adjust for changing conditions, e.G. other tasks
    running in the same machine soaking up RAM/CPU. The idea is your would
    write some custom routine that measured productivity. e.G.
    transactions/second, bytes processed per minute etc. The manager would
    raise and lower the max threads and collect data, and try to fit it
    say to a Cheyenne polynomial to try to predict the optimal value. You
    would weight your readings to give more recent ones more weight in the
    polynomial evaluation. It might just up the count and down the count
    by one every so often just to notice if conditions are changing.


    2. my specific one of improving my back end to Xenu link checker that
    rechecks links in a clever way to avoid needless work. Eventually I
    hope to get rid of Xenu entirely. It is too many flaws.
    I am sending HEAD and GET posts out to the Internet mainly to see the
    status code that comes back. I gather this nio approach works at the
    socket level or lower, rather than HTTP.

    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Aug 4, 2008
    #10
  11. Roedy Green

    Tom Anderson Guest

    On Mon, 4 Aug 2008, Christian wrote:

    > Tom Anderson schrieb:
    >> On Sun, 3 Aug 2008, Roger Lindsjö wrote:
    >>
    >>> Roedy Green wrote:
    >>>> I am about to do some multithreading code to get some parallelism when
    >>>> waiting for Internet links to respond. The question comes How many
    >>>> threads? Well it depend on too many things. How much ram, how fast the
    >>>> connections, what else is going on in the machine.
    >>>
    >>> But hundreds is often no problem.
    >>>
    >>> Note: I have run this on Dual and Quad core Intel and AMD on Linux, not
    >>> Windows, so there could be a difference.

    >>
    >> I believe that linux's threading is much, much faster than windows'. But a
    >> few hundred threads could still be fine on windows.

    >
    > Does this believe have any base? Or is it just the usual Unix/Windows
    > flame?


    Stuff i read on the linux kernel mailing list a few years back. Which is
    not necessarily an affirmative answer to your first question!

    I can't cite proper data on this, i have to confess. We ought to extend
    this thread to a linux and a windows newsgroup ...

    tom

    --
    Imagine a city where graffiti wasn't illegal, a city where everybody
    could draw wherever they liked. Where every street was awash with a
    million colours and little phrases. Where standing at a bus stop was never
    boring. A city that felt like a living breathing thing which belonged to
    everybody, not just the estate agents and barons of big business. Imagine
    a city like that and stop leaning against the wall - it's wet. -- Banksy
     
    Tom Anderson, Aug 4, 2008
    #11
  12. Roedy Green

    Tom Anderson Guest

    On Mon, 4 Aug 2008, Roedy Green wrote:

    > and try to fit it say to a Cheyenne polynomial to try to predict the
    > optimal value.


    What's a Cheyenne polynomial? I don't think i've come across those.

    tom

    --
    Imagine a city where graffiti wasn't illegal, a city where everybody
    could draw wherever they liked. Where every street was awash with a
    million colours and little phrases. Where standing at a bus stop was never
    boring. A city that felt like a living breathing thing which belonged to
    everybody, not just the estate agents and barons of big business. Imagine
    a city like that and stop leaning against the wall - it's wet. -- Banksy
     
    Tom Anderson, Aug 4, 2008
    #12
  13. Tom Anderson wrote:
    > On Mon, 4 Aug 2008, Christian wrote:
    >
    >> Tom Anderson schrieb:
    >>> On Sun, 3 Aug 2008, Roger Lindsjö wrote:
    >>>
    >>>> Roedy Green wrote:
    >>>>> I am about to do some multithreading code to get some parallelism when
    >>>>> waiting for Internet links to respond. The question comes How many
    >>>>> threads? Well it depend on too many things. How much ram, how fast the
    >>>>> connections, what else is going on in the machine.
    >>>>
    >>>> But hundreds is often no problem.
    >>>>
    >>>> Note: I have run this on Dual and Quad core Intel and AMD on Linux,
    >>>> not Windows, so there could be a difference.
    >>>
    >>> I believe that linux's threading is much, much faster than windows'.
    >>> But a few hundred threads could still be fine on windows.

    >>
    >> Does this believe have any base? Or is it just the usual Unix/Windows
    >> flame?

    >
    > Stuff i read on the linux kernel mailing list a few years back. Which is
    > not necessarily an affirmative answer to your first question!
    >
    > I can't cite proper data on this, i have to confess. We ought to extend
    > this thread to a linux and a windows newsgroup ...
    >
    > tom
    >


    Sorry I came to this thread late but I remember here a while back I did
    some experiments on the number of threads on Windows XP. I found once
    you got past 75-100 things got really slow. That was on my dual core
    machine and one with more processors would probably do much better.

    I think in many cases you might do much better with schedulers and fewer
    threads. Less system overhead and memory use.

    --

    Knute Johnson
    email s/nospam/knute2008/

    --
    Posted via NewsDemon.com - Premium Uncensored Newsgroup Service
    ------->>>>>>http://www.NewsDemon.com<<<<<<------
    Unlimited Access, Anonymous Accounts, Uncensored Broadband Access
     
    Knute Johnson, Aug 4, 2008
    #13
  14. Roedy Green

    Roedy Green Guest

    On Mon, 04 Aug 2008 19:17:34 GMT, Roedy Green
    <> wrote, quoted or indirectly quoted
    someone who said :

    >say to a Cheyenne polynomial


    Ack. He is butchering my messages too. That said "Chebychev" in the
    original.
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Aug 5, 2008
    #14
  15. Roedy Green

    Tom Anderson Guest

    On Tue, 5 Aug 2008, Roedy Green wrote:

    > On Mon, 04 Aug 2008 19:17:34 GMT, Roedy Green
    > <> wrote, quoted or indirectly quoted
    > someone who said :
    >
    >> say to a Cheyenne polynomial

    >
    > Ack. He is butchering my messages too.


    Ah. Oh dear.

    > That said "Chebychev" in the original.


    Oh, i see. That i can find on wikipedia. Although it still doesn't make a
    huge amount of sense to me!

    tom

    --
    Computation is the basis of all life
     
    Tom Anderson, Aug 5, 2008
    #15
  16. Roedy Green

    Roedy Green Guest

    On Tue, 5 Aug 2008 17:10:22 +0100, Tom Anderson <>
    wrote, quoted or indirectly quoted someone who said :

    >Oh, i see. That i can find on wikipedia. Although it still doesn't make a
    >huge amount of sense to me!


    Chebychev polynomial approximation is sort of polynomial curve fit,
    but better behaved than ordinary polynomials. They behave more like
    real-world curves do.

    The problem with curve fitting is in tends to work well for
    interpolation but goes wildly up or down off the ends for
    extrapolation. Probably reading up on mathematical techniques for
    safe extrapolation might be in order.

    My idea is thought that if you make an "error", adjusting the number
    of threads in a way that makes throughput worse, feedback will soon
    pull you back. You can consider any failure a worthwhile data
    gathering expedition.

    Probably almost any heuristic will work fine so long as you don't
    increase or decrease the thread count by too much at a pop.
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
     
    Roedy Green, Aug 5, 2008
    #16
  17. Roedy Green wrote:
    > On Tue, 5 Aug 2008 17:10:22 +0100, Tom Anderson <>
    > wrote, quoted or indirectly quoted someone who said :
    >
    >> Oh, i see. That i can find on wikipedia. Although it still doesn't make a
    >> huge amount of sense to me!

    >
    > Chebychev polynomial approximation is sort of polynomial curve fit,
    > but better behaved than ordinary polynomials. They behave more like
    > real-world curves do.


    To be more precise, Chebyshev polynomials spread out the interpolation
    points to favor the end more, as regular polynomial interpolation near
    the ends of the range tends to oscillate wildly. That phenomenon is
    Runge's Phenomenon. Wikipedia has a good diagram of that, but it doesn't
    have one of the Chebyshev interpolation...

    > My idea is thought that if you make an "error", adjusting the number
    > of threads in a way that makes throughput worse, feedback will soon
    > pull you back. You can consider any failure a worthwhile data
    > gathering expedition.


    Doesn't sound like it would work if the were several local extrema. Then
    again, the graph (ignoring noise) would probably have only one extrema
    to begin with...


    --
    Beware of bugs in the above code; I have only proved it correct, not
    tried it. -- Donald E. Knuth
     
    Joshua Cranmer, Aug 5, 2008
    #17
  18. In article <JW0mk.289$xv.43@trnddc02>,
    Joshua Cranmer <> wrote:

    > Roedy Green wrote:
    > > On Tue, 5 Aug 2008 17:10:22 +0100, Tom Anderson <>
    > > wrote, quoted or indirectly quoted someone who said :
    > >
    > >> Oh, i see. That i can find on wikipedia. Although it still doesn't make a
    > >> huge amount of sense to me!

    > >
    > > Chebychev polynomial approximation is sort of polynomial curve fit,
    > > but better behaved than ordinary polynomials. They behave more like
    > > real-world curves do.

    >
    > To be more precise, Chebyshev polynomials spread out the interpolation
    > points to favor the end more, as regular polynomial interpolation near
    > the ends of the range tends to oscillate wildly. That phenomenon is
    > Runge's Phenomenon. Wikipedia has a good diagram of that, but it doesn't
    > have one of the Chebyshev interpolation...

    [...]

    Thanks, I hadn't seen this compelling graph before:

    <http://en.wikipedia.org/wiki/Runge's_phenomenon>

    From there, I see that "roots of the Chebyshev polynomial ... are often
    used as nodes in polynomial interpolation ... to minimize the problem of
    Runge's phenomenon":

    <http://en.wikipedia.org/wiki/Chebyshev_nodes>

    Which leads to an interesting contrast here:

    <http://en.wikipedia.org/wiki/Approximation_theory>

    [I'm still laughing over my earnest search for "Cheyenne polynomials,"
    now enshrined in my browser's history!]

    --
    John B. Matthews
    trashgod at gmail dot com
    home dot woh dot rr dot com slash jbmatthews
     
    John B. Matthews, Aug 5, 2008
    #18
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Marco Ippolito
    Replies:
    0
    Views:
    2,619
    Marco Ippolito
    Oct 11, 2004
  2. =?Utf-8?B?U3R1?=

    session vars how many is to many ?

    =?Utf-8?B?U3R1?=, Mar 5, 2005, in forum: ASP .Net
    Replies:
    5
    Views:
    356
  3. dee
    Replies:
    2
    Views:
    417
  4. peelman

    How many threads is too many?

    peelman, Jan 13, 2005, in forum: Java
    Replies:
    12
    Views:
    874
    Esmond Pitt
    Jan 15, 2005
  5. rbt
    Replies:
    1
    Views:
    379
Loading...

Share This Page