Concurrent Containers

Discussion in 'C++' started by Scott Meyers, Sep 1, 2010.

  1. Scott Meyers

    Scott Meyers Guest

    Although C++0x includes support for concurrency, it offers no standard
    containers that support simultaneous readers and writers. Java and .NET
    developers have them as part of their standard libraries (e.g., Java's
    Concurrent HashMap and ConcurrentLinkedQueue, .NET's ConcurrentQueue and
    ConcurrentBag), but, from what I can tell, there don't even seem to be any de
    facto standard concurrent containers for C++ developers. For example, Boost
    doesn't seem to offer any.

    The closest thing I can find to a portable cross-platform API for concurrent
    containers are concurrent_vector and concurrent_queue offered independently by
    Intel via TBB and Microsoft via PPL, but where both companies have pledged to
    offer "identical concurrent STL container solutions" (per
    http://tinyurl.com/2cgr22p). (That same page suggests that Intel and MS both
    also offer concurrent_unordered_map, but that container is not listed at the PPL
    web site (http://msdn.microsoft.com/en-us/library/dd504906.aspx)).

    If I'm a cross-platform C++ developer who wants to take advantage of the work
    others have done to develop, debug, and tune a concurrent container (i.e., I
    don't want to write and maintain my own), what are my choices? Obviously, the
    container I choose will depend on the problem I'm trying to solve, but what's on
    the pre-built cross-platform concurrent container menu?

    Thanks,

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 1, 2010
    #1
    1. Advertising

  2. "Scott Meyers" <> wrote in message
    news:i5m4ee$ei9$...
    > Although C++0x includes support for concurrency, it offers no standard
    > containers that support simultaneous readers and writers.

    [...]
    > If I'm a cross-platform C++ developer who wants to take advantage of the
    > work others have done to develop, debug, and tune a concurrent container
    > (i.e., I don't want to write and maintain my own), what are my choices?
    > Obviously, the container I choose will depend on the problem I'm trying to
    > solve, but what's on the pre-built cross-platform concurrent container
    > menu?


    I am VERY sorry that I simply have no time to explain the details right now,
    but FWIW C++0x actually does provide a perfect medium for creating fairly
    efficient garbage collected concurrent containers. For instance, here is a
    full blown example of a garbage collected lock-free node-based stack that is
    based on one of my proxy collector algorithms:

    http://groups.google.com/group/comp.programming.threads/browse_frm/thread/a53f24de178b419f
    (please try to read entire thread...)

    And here is a relevant discussion on the Boost threading mailing list:

    http://thread.gmane.org/gmane.comp.lib.boost.devel/197400/focus=198747
    (please try to read entire thread...)



    Here is a brief description of what proxy garbage collection is "all about":

    http://groups.google.com/group/comp.programming.threads/msg/41f29efe33e7f124




    Here is recent implementation of one of my proxy garbage collector
    algorithms in Relacy 2.3:

    http://cpt.pastebin.com/f71480694




    Here is where you can learn about and download a copy of Relacy 2.3:

    http://groups.google.com/group/relacy

    http://groups.google.com/group/relacy/browse_frm/thread/8a7ecceb5f1bf80b


    IMHO, Relacy Race Detector is one of the _best_ synchronization verification
    systems out there!



    When I get more time I will explain how a proxy collector algorithm can
    efficiently solve the reader/writer problem in _many_ diverse situations.



    Here is some more info:

    http://webpages.charter.net/appcore/vzoom/round-1.pdf

    http://webpages.charter.net/appcore/vzoom/round-2.pdf

    FWIW, those "papers" referred to my vZOOM algorithm entry in Sun
    Microsystems CoolThreads contest:

    https://coolthreads.dev.java.net

    I was one of the finalists, and it got a brand new SunFire T2000. I was very
    pleased!

    ;^)
    Chris M. Thomasson, Sep 1, 2010
    #2
    1. Advertising

  3. Scott Meyers

    Balog Pal Guest

    "Scott Meyers" <>
    > Although C++0x includes support for concurrency, it offers no standard
    > containers that support simultaneous readers and writers. Java and .NET
    > developers have them as part of their standard libraries (e.g., Java's
    > Concurrent HashMap and ConcurrentLinkedQueue, .NET's ConcurrentQueue and
    > ConcurrentBag),


    I'm light on java but what I picked up about 'concurrent' containers, (and
    even more generally about using threads) is pretty sour. The early
    collections with everything synchronizes did not solve any real life use
    case but introduced slowdown. Later the programs were full of undefined
    behavior due to race conditions. LAter people swithhed to the 'concurrent'
    stuff thet converted UB to exceptions thrown here there and everywhere.

    In sunmmary I saw only different flavors of broken software, and supposed
    'architects' fighting to shuffle bugs around. Instead of looking at the
    fundamental issue/root cause: lack of MT design and lack of even recognition
    that one is needed. After all tha language is 'safe' and the tools have
    all thet 'synchronized' or 'concurrent' names.
    :-(((

    C/C++ was at least still put in some hurdles to infest a program with
    threads, and left some sanity.

    > but, from what I can tell, there don't even seem to be any de facto
    > standard concurrent containers for C++ developers. For example, Boost
    > doesn't seem to offer any.


    Is there such thing at all? Now really.

    I try to recall my own stuff: I used 'concurrent' queue, in fact many
    flavors of it, to communicate between threads. Despite it being
    ubiquitous as usage pattern I hardly have in in a stock library, but put
    together one in the new arena. Tuned to the actual way of use, and
    properties of payload.

    And observed others do in a similar way.

    As far as the standard: it failed miserably on the much simpler case to
    give us a *string*...

    On this topic I'd rather keep is conservative and teach people thinking MT,
    and able to use the elements in combination. I see the chances to mess up
    that way is lower than it would be from picking supposedly pre-assembled
    elements.
    Balog Pal, Sep 1, 2010
    #3
  4. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 11:50 AM, Chris M. Thomasson wrote:
    > "Scott Meyers"<> wrote in message
    >> If I'm a cross-platform C++ developer who wants to take advantage of the
    >> work others have done to develop, debug, and tune a concurrent container
    >> (i.e., I don't want to write and maintain my own), what are my choices?

    >
    > I am VERY sorry that I simply have no time to explain the details right now,
    > but FWIW C++0x actually does provide a perfect medium for creating fairly
    > efficient garbage collected concurrent containers.


    But that's not the question. C++ offers the capabilities to write anything I
    want, but if I want to use a container that's already written and is likely
    correct, I can use any of the containers in the standard library, as long as I
    don't care about the ability to concurrently modify the container. Most of the
    time, it's just not worth my trouble to write a container from scratch.

    What I'm looking for now are portable, proven containers that do support
    concurrent modification. Could I write one from scratch? Sure. But I'd rather
    spend my time doing other things if at all possible.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 1, 2010
    #4
  5. On Sep 1, 12:15 pm, "Balog Pal" <> wrote:
    > "Scott Meyers" <>
    >
    > > Although C++0x includes support for concurrency, it offers no standard
    > > containers that support simultaneous readers and writers.  Java and .NET
    > > developers have them as part of their standard libraries (e.g., Java's
    > > Concurrent HashMap and ConcurrentLinkedQueue, .NET's ConcurrentQueue and
    > > ConcurrentBag),

    >
    > I'm light on java but what I picked up about 'concurrent' containers, (and
    > even more generally about using threads) is pretty sour. The early
    > collections with everything synchronizes did not solve any real life use
    > case but introduced slowdown. Later the programs were full of undefined
    > behavior due to race conditions. LAter people swithhed to the 'concurrent'
    > stuff thet converted UB to exceptions thrown here there and everywhere.
    >
    > In sunmmary I saw only different flavors of broken software, and supposed
    > 'architects' fighting to shuffle bugs around. Instead of looking at the
    > fundamental issue/root cause: lack of MT design and lack of even recognition
    > that one is needed.    After all tha language is 'safe' and the tools have
    > all thet 'synchronized' or 'concurrent' names.
    > :-(((
    >
    > C/C++ was at least still put in some hurdles to infest a program with
    > threads, and left some sanity.


    Agreed that the original Java "thread-safe" containers were uselessly
    slow from the synchronized methods which solved nothing. However, some
    of the newer Java containers, which Scott mentions, such as
    ConcurrentHashMap, do offer useful functionality, such as an atomic
    "put entry in map if not present". I've made use of that function
    several times.

    A blocking queue for multiple consumers and multiple consumers would
    be nice too. IIRC, Java has several flavors here, such as one with a
    capped size based on an array - attempts to push to a full queue
    blocked the producer, and another with a limitless size so that the
    producer is never blocked.
    Joshua Maurice, Sep 1, 2010
    #5
  6. Scott Meyers

    Balog Pal Guest

    "Scott Meyers" <>

    > But that's not the question. C++ offers the capabilities to write
    > anything I want, but if I want to use a container that's already written
    > and is likely correct, I can use any of the containers in the standard
    > library, as long as I don't care about the ability to concurrently modify
    > the container. Most of the time, it's just not worth my trouble to write
    > a container from scratch.
    >
    > What I'm looking for now are portable, proven containers that do support
    > concurrent modification.


    Please define 'correct' for the scope of your discussion.

    I recall a plenty of good articles on the topic of collection vs. MT -- or
    rather on what is expected from the class to be called 'thread-safe' and
    what is not. Not sure if it was Herb Sutter or someone else. With very
    good examples on how a list or vector shall rptect its internal state on
    some operations ((that the user can't ralisticly cover or even recognise as
    a possible problem) -- while still leaving the obligation to lock around
    for others -- where the collection can't really guess the intent.

    In real life critical sections you protect a set of data -- and that set is
    IMO too rarely match what we call 'container' in C++. It is mos likely a
    class, a few members of a class, a few unrelated objects, or some existing
    elements in a container.

    As a transaction must involve them together, something a collection can
    offer is more in the way than helps. Generally.
    Balog Pal, Sep 1, 2010
    #6
  7. Scott Meyers

    Balog Pal Guest

    "Joshua Maurice" <>

    > However, some
    >of the newer Java containers, which Scott mentions, such as
    >ConcurrentHashMap, do offer useful functionality, such as an atomic
    >"put entry in map if not present". I've made use of that function
    >several times.


    Sure, the progress with the collections is clear -- just they are still no
    silver bullets (obviously) and can't replace the missing design elements.

    >A blocking queue for multiple consumers and multiple consumers would
    >be nice too.


    This far it did not fit too well with the STL approach. As current STL is
    full of copy operations -- and my queues were more likely holding pointers.
    With allocation ownership issues placed to different parties. I was meny
    times tempted to use some smart pointers there, especially shared_ptr, then
    dropped it.

    For other cases the payload supported swap, and that was used for good.

    As move semantics enter, the picture may change here.


    >IIRC, Java has several flavors here, such as one with a
    >capped size based on an array - attempts to push to a full queue
    >blocked the producer, and another with a limitless size so that the
    >producer is never blocked.


    See, the proliferation starts here. And in C++ you should add a policy for
    move, a policy for allocation/destroy (that could be different that the
    %$#@#%$ allocator<> we carry around). How far we got in practice with
    customizable things in std this far? After all we can create a locale and
    imbue it in a stream, can we? ;-o
    Balog Pal, Sep 1, 2010
    #7
  8. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 1:01 PM, Balog Pal wrote:
    > Please define 'correct' for the scope of your discussion.


    Correct means what it always means: the API does what it promises (e.g., all
    invariants and postconditions are satisfied). For example, if I have a
    ConcurrentSet that tells me I may concurrently perform multiple inserts, after
    the concurrent inserts are done, I should find exactly one copy of each thing
    that was inserted and zero copies of things that were not inserted.

    > As a transaction must involve them together, something a collection can offer is
    > more in the way than helps. Generally.


    Concurrent containers don't claim to solve all concurrency problems, nor do they
    promise that clients will never have to worry about concurrency issues
    (including transactional issues such as you mention). They simply promise that
    certain kinds of operations that would not normally be safe if performed
    concurrently on a non-concurrent container are safe on the concurrent container.
    So a concurrent queue (useful for producer/consumer systems) permits enqueue
    and dequeue operations to proceed concurrently. A concurrent hash table
    (potentially useful for, e.g., a cache shared across threads) might permit
    concurrent insertions.

    A *lot* of effort has gone into developing concurrent containers in Java and
    ..NET, as well as in C++ libraries such as TBB and PPL. To suggest that all that
    effort has been misguided, because there are no common use cases for such
    containers, is untenable.

    Are there wrong ways to write concurrent containers? Sure. Did Java employ
    them in its synchronized collections. It did. But neither of those
    observations reflects on the fundamental utility of the idea of concurrent
    containers. That question, IMO, has been settled for some time.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 1, 2010
    #8
  9. Scott Meyers

    Balog Pal Guest

    "Scott Meyers" <>

    >> Please define 'correct' for the scope of your discussion.

    >
    > Correct means what it always means: the API does what it promises (e.g.,
    > all invariants and postconditions are satisfied). For example, if I have
    > a ConcurrentSet that tells me I may concurrently perform multiple inserts,
    > after the concurrent inserts are done, I should find exactly one copy of
    > each thing that was inserted and zero copies of things that were not
    > inserted.
    >
    >> As a transaction must involve them together, something a collection can
    >> offer is
    >> more in the way than helps. Generally.

    >
    > Concurrent containers don't claim to solve all concurrency problems, nor
    > do they promise that clients will never have to worry about concurrency
    > issues (including transactional issues such as you mention). They simply
    > promise that certain kinds of operations that would not normally be safe
    > if performed concurrently on a non-concurrent container are safe on the
    > concurrent container. So a concurrent queue (useful for producer/consumer
    > systems) permits enqueue and dequeue operations to proceed concurrently.
    > A concurrent hash table (potentially useful for, e.g., a cache shared
    > across threads) might permit concurrent insertions.
    >
    > A *lot* of effort has gone into developing concurrent containers in Java
    > and .NET, as well as in C++ libraries such as TBB and PPL. To suggest
    > that all that effort has been misguided, because there are no common use
    > cases for such containers, is untenable.


    Appears I compressed my thoughts too far, to lose essential elements.
    And possibly captured the scopoe of your question incorrectly too.

    I was thinking in the scope of a (possible) C++ standard. The problem is
    exactly what you emphasice above. A *lot* of effort was put in that field.
    And the results are not so bright even in their native environment. But
    IMO (!_ even if the results were perfect in java/C#, poted to the C++
    environment would not be good enough.

    Especially compared to to effort of a creating them by hand.

    IMO people who can pick java, will do so. Those who use C++ do that for
    reasons like they need the control and performance.

    Having some collection that can magically withstand all kinds of operations
    from random threads looks really nice. Really. Having some time I'd jump
    on it to analyse how it is done and applaud.

    But I doubt I'd use such a generic solution in my app. Because having random
    threads keep poking a shared collection sounds like a blasphemy. With MT
    design we fight to limit sharing as much as possble, for good reasons.
    Sharing objects, sharing operations, etc. And using a proper sync op
    around what remains shared is not a problem, especially since RAII was
    invented to guard the mutex lock.

    And going back to the C++ standard, the people creating it (I guess) have
    fragment of resources that owners of java and C# has at disposal.

    That is why I don't expect to see it there anytime soon (read: in my
    lifetime).
    And I see the lack of 'de facto standard' going to similar causes.

    The java way is to have everything and the pussycat in the lib. The lib is
    written by Sun. Or was to yesteryear. And it is considered good enough.

    C++ users have no such supplier, and on the other end are more picky. Both
    in width and limits of expectations. After all the 'you don't pay for what
    you don't use' is still in place. As I can visualize the implementtion of
    a concurrent container, and its actual use in a real program -- I'm skeptic
    whether it can hold.

    To put it another way, in my view in the C++ environment the idea has a bad
    tradeoff. But I do not claim my view to be some objective truth, neither
    shall it be extrapolated outside its scope.

    Boost covers so much stuff and has that fine selection of developers. And it
    does have components for both threading and collections. But not for
    concurrent collections. My explanations to "why" is that people there also
    found it a poor tradeoff. I'm open to hear any other explanation.


    And rereading the initial post, your question was what is the menu, not why
    is it so thin -- sorry to kinda sidetrack it.
    Balog Pal, Sep 1, 2010
    #9
  10. Scott Meyers

    tni Guest

    On 2010-09-01 22:29, Scott Meyers wrote:
    >
    > A *lot* of effort has gone into developing concurrent containers in Java
    > and .NET, as well as in C++ libraries such as TBB and PPL. To suggest
    > that all that effort has been misguided, because there are no common use
    > cases for such containers, is untenable.
    >
    > Are there wrong ways to write concurrent containers? Sure. Did Java
    > employ them in its synchronized collections. It did. But neither of
    > those observations reflects on the fundamental utility of the idea of
    > concurrent containers. That question, IMO, has been settled for some time.


    Really? There are a couple of useful concurrent containers: messaging
    infrastructure (queues) and maybe maps for certain applications. Beyond
    that, the value rapidly diminishes.

    If the container interface doesn't allow you to perform the transactions
    you need, it's useless. Very often, more than a single container is
    going to be involved in a transaction.

    What are those magic containers of general value that you are talking about?

    \\

    In general a mutex should be be cheap. I.e. if you look at Linux
    Pthreads or Qt, a lock / unlock will be a single atomic operation in the
    uncontested case. For most applications, a concurrent container won't do
    better than just using a standard container with a mutex (even if it's
    some fancy lock-free whatever). With RAII, exception safety for the
    locking operation can be trivially had.
    tni, Sep 1, 2010
    #10
  11. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 2:34 PM, Paavo Helde wrote:
    > Yes, sometimes such containers are exactly what is needed. However, given
    > that one can in about 20 lines create a template class which is able to
    > wrap any non-concurrent class and expose a locking proxy pointer for safe
    > concurrent access, this is not so difficult to achieve.


    Can you elaborate on what you mean here? Suppose I want to take a std::map or
    std::unordered_map (C++0x) and make it safe for multiple threads to
    simultaneously perform insertions. What will a "locking proxy pointer" do to
    enable this behavior?

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 1, 2010
    #11
  12. Scott Meyers

    Balog Pal Guest

    "Scott Meyers" <>
    >> Please define 'correct' for the scope of your discussion.

    >
    > Correct means what it always means: the API does what it promises (e.g.,
    > all invariants and postconditions are satisfied). For example, if I have
    > a ConcurrentSet that tells me I may concurrently perform multiple inserts,
    > after the concurrent inserts are done, I should find exactly one copy of
    > each thing that was inserted and zero copies of things that were not
    > inserted.


    So, if we had a ConcurrentQueue, it would not really fit its most likely MT
    use case. That includes alerting the consumer on produce...

    If you don't mind sidetracking, could you make up a real use case for the
    example set you described above? Anything that I could think keeps ringing
    'race condition'. If I left the set pray to threads, how can I learn the
    postcondition.invariant is actually held? How do I know something is in
    there?

    Possibly my mind is too petrified to abandon 'you shall lock at least till
    the if() part finishes, but more likely until the whole conditional.

    Ot it is part of the supposed API to invoke equivalent of java's
    synchronized(mySet) { } ?

    >> As a transaction must involve them together, something a collection can
    >> offer is
    >> more in the way than helps. Generally.

    >
    > Concurrent containers don't claim to solve all concurrency problems, nor
    > do they promise that clients will never have to worry about concurrency
    > issues (including transactional issues such as you mention).


    That sounds like pair<mutex, somecollection> where I can skip locking the
    mutex some 30% of the times.

    No pun intended, I just try to match the thing with the good advise 'make
    interfaces easy to use correctly and hard to use incorrectly'.

    Practice shows just too many incorrect uses -- so the main defense is to
    avoid the situations altogether.

    >They simply promise that certain kinds of operations that would not
    >normally be safe if performed concurrently on a non-concurrent container
    >are safe on the concurrent container. So a concurrent queue (useful for
    >producer/consumer systems) permits enqueue and dequeue operations to
    >proceed concurrently.


    As mentioned above that is nice but redundant in most actual uses, where you
    couple some cond/event.

    My designer's guess would say the demand is for not a concurrent queue as a
    generic container, but for a X-producer/Y-consumer message queue, covering
    those very use cases.

    > A concurrent hash table (potentially useful for, e.g., a cache shared
    > across threads) might permit concurrent insertions.


    Try to imagine this interface, Could work if:
    - no item invalidation (though delete allowed)
    - fetch makes copy or uses refcounting (or no delete allowed at all, making
    it one-way).

    Don't you feel it a fragile element?
    Balog Pal, Sep 1, 2010
    #12
  13. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 5:11 PM, Pete Becker wrote:
    > I think this is the wrong level for design. It's gotten lost over the years, but
    > containers should typically implementation details, not high-level design
    > objects. If you need an object that can be used from multiple threads to store
    > objects and search for an object, its interface might look like this:
    >
    > template <class Ty>
    > class data_table
    > {
    > public:
    > void store(const Ty&);
    > const Ty& find(whatever);
    > private:
    > // none of your business! (but see below)
    > };
    >
    > The implementation might well be a hash table, but the public interface is far
    > smaller than the interface for a hash table, and it's properly synchronized for
    > the operations that it provides.


    It's too late to change now, I think, but in retrospect I should have used a
    subject of "Concurrent Data Structures" for this thread, because the term
    "containers" suggests to many people all the usual STL API stuff, and I'm not
    married to that API. In fact I'd be surprised if concurrent containers can
    offer the same API. Java's ConcurrentHashMap, for example, has a size method
    that returns only an approximation of the number of elements in the map,
    because, as Brian Goetz puts it in "Java Concurrency in Practice," "the result
    of size could be out of date by the time it is computed."

    What I'm really interested in is the availability of
    already-implemented-and-proven commonly-useful concurrency-friendly data
    structure types for C++. One that's been mentioned by more than one person
    (other than me) in this thread is a queue, and it's probably no coincidence that
    both TBB and PPL offer concurrent_queue. They also both offer concurrent_vector
    and (apparently) concurrent_unordered_map, which suggests to me that uses for
    those types are not particularly difficult to come by.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 2, 2010
    #13
  14. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 6:03 PM, Pete Becker wrote:
    > don't sound like a good idea. Thread safety has to be designed in at the
    > application level; there's no set of tools that will make a bad design safe.


    Clearly. My impression is that the primary goal of concurrent data structures
    isn't thread safety per se, but rather thread safety (for some defined set of
    presumably useful operations) that scales well. Locking an entire data
    structure to operate on only a part of it does not scale well.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 2, 2010
    #14
  15. Scott Meyers

    Pavel Guest

    Scott Meyers wrote:
    > Although C++0x includes support for concurrency, it offers no standard
    > containers that support simultaneous readers and writers. Java and .NET
    > developers have them as part of their standard libraries (e.g., Java's
    > Concurrent HashMap and ConcurrentLinkedQueue, .NET's ConcurrentQueue and
    > ConcurrentBag), but, from what I can tell, there don't even seem to be
    > any de facto standard concurrent containers for C++ developers. For
    > example, Boost doesn't seem to offer any.
    >
    > The closest thing I can find to a portable cross-platform API for
    > concurrent containers are concurrent_vector and concurrent_queue offered
    > independently by Intel via TBB and Microsoft via PPL, but where both
    > companies have pledged to offer "identical concurrent STL container
    > solutions" (per http://tinyurl.com/2cgr22p). (That same page suggests
    > that Intel and MS both also offer concurrent_unordered_map, but that
    > container is not listed at the PPL web site
    > (http://msdn.microsoft.com/en-us/library/dd504906.aspx)).
    >
    > If I'm a cross-platform C++ developer who wants to take advantage of the
    > work others have done to develop, debug, and tune a concurrent container
    > (i.e., I don't want to write and maintain my own), what are my choices?
    > Obviously, the container I choose will depend on the problem I'm trying
    > to solve, but what's on the pre-built cross-platform concurrent
    > container menu?
    >
    > Thanks,
    >
    > Scott
    >

    If history is any guidance:

    The very first version of Java (1.0.x) had all containers thread-safe
    (java.util.Vector and java.util.Hashtable are probably the best-known.
    They are still around). It was immediately noticed that canned
    "synchronized" containers can't be used efficiently except for very
    simple problems (mainly tutorial examples). A state for real-world
    problem generally consists of more than one standard-library object
    (say, a Vector *and a* Hashtable or a Vector and an int) and the whole
    state has to be thread-safe. As soon as the problem grows to this
    (admittedly modest) size, the thread-safety of underlying parts becomes
    pure liability (in both performance and lost opportunities to design a
    more convenient but non-thread-safe API).

    Java creators learned the lesson fast. The second generation of
    collection classes was all thread-unsafe and much faster. To hedge their
    bets, they provided synchronized adapters to every container interface
    that a client code could create, for example as follows:

    List list = Collections.synchronizedList(new ArrayList(...)); // here,
    ArrayList is an analog of Vector, but thread-unsafe, and the resulting
    list has a thread-safe List API

    I wrote tons of Java programs, most of them multi-threaded and has never
    had a good cause to use one of these adapters (a couple of times I
    tried, as a quick hack and just for the sake of it and payed dearly for
    my incomplete analysis). Other Java people whom I know agree.

    Standardizing inter-thread communication mechanisms may require a
    "thread-safe" container class (usually a kind of a queue). In my
    experience, however, any such standardized mechanism required a slightly
    different queue. Some questions to answer -- whether you want to
    enqueue/dequeue single objects or batches, how to delineate batches, how
    to mix in system-specific asynchronous sinks and sources like file
    descriptors, middleware, databases, IPC structures, whether you want to
    serialize or copy your objects, whether you want to pre-allocate "slots"
    on the queue and serialize right in and out or you want to copy objects
    as is or you want to store pointers, which threads frees slot memory,
    whether you can take advantage of single producer and/or consumer or you
    have multiple consumers/producers, are you after high throughput, low
    latency on average or real-time, etc etc etc. An answer to any of the
    above questions will affect the optimal design of the container (both
    API and implementation). And, if you want threads at all, performance is
    probably important to you so you will not want to leave a single stone
    unturned.

    To summarize: Thread-safety is best applied at highest possible level of
    application design. Memory should be shared in controlled number of
    structures and these are to be selected / designed based on the concrete
    application's requirements. The idea of standard thread-safe containers
    failed before and we know the reasons. I do not see what has changed
    since so that it could succeed if brought up again now.

    -Pavel
    Pavel, Sep 2, 2010
    #15
  16. Scott Meyers

    Balog Pal Guest

    "Scott Meyers" <>
    >Java's ConcurrentHashMap, for example, has a size method that returns only
    >an approximation of the number of elements in the map, because, as Brian
    >Goetz puts it in "Java Concurrency in Practice," "the result of size could
    >be out of date by the time it is computed."


    Ah, Bingo!!!

    My previous question on "define correct in the context" was supposed to
    bring out something like this.

    My normal definition of correct size() for a container to know *the* number.
    That is NOT delivered by the concurrent one. It fairly says so, meeting what
    you answered a little up. But IMO this does not really meet what
    programmers would "want" or expect in a real situation. rendering the
    interface either being ballast or a landmine for too many potential
    situations.


    > What I'm really interested in is the availability of
    > already-implemented-and-proven commonly-useful concurrency-friendly data
    > structure types for C++.


    For this, as C++ gains native threading and move semantics hopefully soon I
    expect such libraries/tools will finally emerge in the next few years.
    Balog Pal, Sep 2, 2010
    #16
  17. Scott Meyers

    Scott Meyers Guest

    Concurrent Data Structures (Was: Concurrent Containers)

    On 9/1/2010 3:29 PM, Balog Pal wrote:
    > Possibly my mind is too petrified to abandon 'you shall lock at least till the
    > if() part finishes, but more likely until the whole conditional.


    I don't know about petrified, but I think that (1) you are assuming that the API
    for a concurrent data structure will look like that for the current STL data
    structures (which, as I posted earlier, is perhaps my fault for wording my
    subject line poorly; I've modified it above) and (2) you are assuming that the
    kinds of operations and combinations of operations that are common on existing
    STL containers (i.e., on concurrency-unfriendly data structures) will also be
    common on concurrency-friendly data structures.

    As regards API, the TBB API for concurrent_queue includes a try_pop function,
    so, as the documentation puts it, what you might write like this with an STL queue,

    bool b=!q.empty();
    if(b) {
    x=q.front();
    q.pop();
    }

    you write like this with a concurrent_queue:

    bool b = q.try_pop(x); // x holds the popped value if there was an
    // element to pop

    So your "you shall lock at least till the if() part finishes" advice falls by
    the wayside, because there is no if() part. The if and pop are an atomic
    combination.

    For an example application of a concurrent hash_map, take a look at
    http://www.devx.com/cplus/Article/33334/1954, where an engineer from Intel (in
    an article designed to make TBB scalability look good) describes how a
    concurrent hash map can be used to allow multiple threads to process different
    parts of a large text simultaneously. Each thread performs only inserts, in
    contrast to the combination of inserts, lookups, and erasures that you seem to
    think are always present. (The article doesn't mention it, but once the inserts
    are finished, the application could presumably use an external mutex to restrict
    access to the data structure if threads needed to modify it.)

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 2, 2010
    #17
  18. Scott Meyers

    Scott Meyers Guest

    On 9/1/2010 10:41 PM, Paavo Helde wrote:
    > Here is a simple scratch, this can be enhanced no doubt. It essentially
    > locks the whole container during the operation. As stressed by numerous
    > other replies, this locking scope is often either too narrow or too
    > broad, and as such this solution does not buy you much.


    Nor does it scale, which is the primary motivation for concurrent data structures.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 2, 2010
    #18
  19. Scott Meyers

    Scott Meyers Guest

    On 9/2/2010 10:10 AM, Paavo Helde wrote:
    > Scott Meyers<> wrote in
    >> Nor does it scale, which is the primary motivation for concurrent data
    >> structures.

    >
    > It depends on what it is used for.


    I don't think it does. Either something scales or it does not. What you're
    saying is that some applications have no need for something that scales, in
    which case a scalable design is immaterial. That's true, but orthogonal. If
    you lock an entire data structure that's accessed by multiple threads, access to
    that data structure will eventually become a bottleneck if you throw enough
    threads at it. (The programming community has enough experience with such
    designs for me to state this as a fact.) You may have an application where
    there are never enough threads to reach the bottleneck, in which case an
    unscalable design is fine. But the fact that it's fine for your application
    does not make it scalable.

    > If the contention is low, then there
    > is no need to build something more fancy.


    True. But orthogonal to my comment about the scalability of a design based on
    locking an entire data structure for each access.

    > On the other hand, if the contention is high, then it might well be a
    > design mistake to have such a global data structure present.


    Also true. Also orthogonal. Having a scalable data structure available does
    not mean it is necessarily the best data structure for your application. But
    the fact that it is not the best data structure for a particular application
    does not mean that the data structure is not scalable.

    > Even if it
    > can be done 10 times faster by using some dedicated concurrent data
    > structure, it still scales only 10 times better.


    A design that runs 10 times faster isn't necessarily more scalable, it's just
    faster. For the case we are discussing, scalability is determined by whether
    performance can be maintained as the number of threads using it for some set
    combination of operations increases. If data structure A can handle 10 threads
    before performance drops off and data structure B can handle 100 threads before
    its performance drops off, B is 10 times more scalable than A. But if B runs 10
    times faster than A, but both have performance drop-offs starting at the same
    number of threads, neither is more scalable than the other.

    Scott

    --
    * C++ and Beyond: Meyers, Sutter, & Alexandrescu, Oct. 24-27 near Seattle
    (http://cppandbeyond.com/)
    * License my training materials for commercial (http://tinyurl.com/yfzvkp9) or
    personal use (http://tinyurl.com/yl5ka5p).
    Scott Meyers, Sep 2, 2010
    #19
  20. Scott Meyers

    Balog Pal Guest

    Re: Concurrent Data Structures (Was: Concurrent Containers)

    "Scott Meyers" <>
    > On 9/1/2010 3:29 PM, Balog Pal wrote:
    >> Possibly my mind is too petrified to abandon 'you shall lock at least
    >> till the
    >> if() part finishes, but more likely until the whole conditional.

    >
    > I don't know about petrified, but I think that (1) you are assuming that
    > the API for a concurrent data structure will look like that for the
    > current STL data structures (which, as I posted earlier, is perhaps my
    > fault for wording my subject line poorly; I've modified it above) and (2)
    > you are assuming that the kinds of operations and combinations of
    > operations that are common on existing STL containers (i.e., on
    > concurrency-unfriendly data structures) will also be common on
    > concurrency-friendly data structures.


    Probably so. :) As I mentioned earlier STL's interfaces are especially
    hostile for these uses.

    But even the other regular interface for 'collection; like stuff -- and even
    for simple aggregates -- is bad for general cases.

    The examples you gave are okay: mutating operations, like insert are about
    good, and may turn useful here and there. The problem is with read
    operations. What include all kinds of stuff retrieval, direct or indirect,
    from a non-const object. The container has no chance to cover it up.
    What is worse, my experience is that too many collegues believe that to have
    thread-safe thing is to protect concurrent writes against each other. And
    that read can go without lock. (if you recall the mass discussion of the
    double-checked nightmare, it is just the surfece.

    Misleading names don't help either. In another post you quoted the dox on
    ..size(). it is unfortunately not called approx_size() or outdated_size()
    or bogus_size(), what reflects the semantics, but just size(). That
    normally report, well, the count of elements. Firmly and correctly. Just
    as its twin IsEmpty() [ empty() in STL :-( ]. For a 'concurrent' thing it
    reports bogus info that we in turn use to make decisions. Not good.

    > As regards API, the TBB API for concurrent_queue includes a try_pop
    > function, so, as the documentation puts it, what you might write like this
    > with an STL queue,
    >
    > bool b=!q.empty();
    > if(b) {
    > x=q.front();
    > q.pop();
    > }
    >
    > you write like this with a concurrent_queue:
    >
    > bool b = q.try_pop(x); // x holds the popped value if there was an
    > // element to pop


    Yeah, this looks much better. (Actually at least one of my async_queue
    classes has this same interface. )

    But if I started fresh to create a multiproducer-singleconsumer queue for
    the usual interthread messaging, it would not have pop at all. Only push
    for producers, and some functor, that is the consumer, and gets called
    whenever there is stuff -- with the already extracted message.

    Not looking like a 'collection' at all, but rather like a
    manager/dispatcher/...

    > So your "you shall lock at least till the if() part finishes" advice falls
    > by the wayside, because there is no if() part.


    Good, the point is that the interface must not have a single function that
    can be used that way, and be unsafe.

    > The if and pop are an atomic combination.


    Yeah. Though it still miss the most often coupled signaling, and if you have
    to make it by hand, there will (too likely) be excess locks and context
    switches.

    (And my other implementtions do not bother with pop -- but create/pick an
    empty queue and swap it after waking on the signal -- then process the
    grabbed content at leasure.)

    > For an example application of a concurrent hash_map, take a look at
    > http://www.devx.com/cplus/Article/33334/1954, where an engineer from Intel
    > (in an article designed to make TBB scalability look good) describes how a
    > concurrent hash map can be used to allow multiple threads to process
    > different parts of a large text simultaneously. Each thread performs only
    > inserts, in contrast to the combination of inserts, lookups, and erasures
    > that you seem to think are always present.


    Err, I surely do not think they are always present ;-) When I design an
    app, I carefully consider what operations happen where, document it, and do
    locking (and other operations) accordingly. That normally makes the
    solution *not* general, and not "reusable". Certainly that is not a
    requirement, and the implementation itself is not worth mentioning in the
    time measure -- the hard part is the design itself.

    The hash_map you refer is supposedly a reusable component, from which we
    expect an easily discoverable, reasonably safe interface. I hope we agree
    that intuitive names for classes and functions are vital. From soemthing
    called concurent_hash_map I'd probably not expect to be limited for such
    asymmetric use. And a half-page of description on limitations and intended
    uses are too rarely representable in just names.

    Also, if the component covers my MT work only halfway -- I have to use
    explicit synch incantations for at least some operations, then the gain on
    the other operations is moot.

    Also, if the object does locking that is not evidently showing its spots, it
    is easy to create deadlock situations, you only need another mutex, say from
    another such hash_map.
    Balog Pal, Sep 2, 2010
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Nicolas Matringe
    Replies:
    9
    Views:
    722
    Mike Treseler
    Jun 14, 2004
  2. Taras_96

    Concurrent Assignment

    Taras_96, Apr 1, 2005, in forum: VHDL
    Replies:
    6
    Views:
    8,518
    Taras_96
    Apr 4, 2005
  3. Pep
    Replies:
    6
    Views:
    819
  4. Replies:
    7
    Views:
    549
    Pete Becker
    Jan 25, 2008
  5. Sebastian Mach
    Replies:
    5
    Views:
    307
Loading...

Share This Page