how to achieve thread safety at element level in STL data structureslike hash map, vectors

Discussion in 'C++' started by grbgooglefan, Dec 4, 2007.

  1. grbgooglefan

    grbgooglefan Guest

    Our application uses the caching heavily to store the data from
    databases & also the runtime orders information.
    All these caches are built on STL hash map, vectors & maps and data
    format is the structures.
    There are multiple threads accessing these caches simultaneously for
    reading the data as well as updating the data.
    Whenever any thread accesses the cache, it locks that cache & finds
    the required element. Does the actions required & unlocks the cache.
    I am finding that this is causing the other threads to wait for longer
    time because the locking is at cache level.
    I would like to make this locking quite finer & granular, in such a
    way that only the single structure which is to be updated is locked.
    This is somewhat same as the row level locking in databases.
    How do we achieve this level of granular locking?
    Thoughts please.
     
    grbgooglefan, Dec 4, 2007
    #1
    1. Advertising

  2. grbgooglefan

    Boris Guest

    Re: how to achieve thread safety at element level in STL data structures like hash map, vectors

    On Tue, 04 Dec 2007 04:22:10 +0200, grbgooglefan <>
    wrote:

    > Our application uses the caching heavily to store the data from
    > databases & also the runtime orders information.
    > All these caches are built on STL hash map, vectors & maps and data
    > format is the structures.
    > There are multiple threads accessing these caches simultaneously for
    > reading the data as well as updating the data.
    > Whenever any thread accesses the cache, it locks that cache & finds
    > the required element. Does the actions required & unlocks the cache.
    > I am finding that this is causing the other threads to wait for longer
    > time because the locking is at cache level.
    > I would like to make this locking quite finer & granular, in such a
    > way that only the single structure which is to be updated is locked.
    > This is somewhat same as the row level locking in databases.
    > How do we achieve this level of granular locking?


    From what I understand you are using a C++ standard library which does not
    supported multithreading explicitly? Thus it's your job to make sure that
    everything works fine in a multithreaded program? To improve performance
    two different kind of locks (one for read- and one for write-access) come
    to my mind. But as you didn't say which C++ standard library you use I
    don't know if two threads may access your STL containers concurrently even
    if it's only for read access.

    Boris
     
    Boris, Dec 4, 2007
    #2
    1. Advertising

  3. grbgooglefan

    James Kanze Guest

    Re: how to achieve thread safety at element level in STL datastructures like hash map, vectors

    On Dec 4, 3:22 am, grbgooglefan <> wrote:

    > Our application uses the caching heavily to store the data from
    > databases & also the runtime orders information.
    > All these caches are built on STL hash map, vectors & maps and data
    > format is the structures.
    > There are multiple threads accessing these caches simultaneously for
    > reading the data as well as updating the data.
    > Whenever any thread accesses the cache, it locks that cache & finds
    > the required element. Does the actions required & unlocks the cache.
    > I am finding that this is causing the other threads to wait for longer
    > time because the locking is at cache level.
    > I would like to make this locking quite finer & granular, in such a
    > way that only the single structure which is to be updated is locked.
    > This is somewhat same as the row level locking in databases.
    > How do we achieve this level of granular locking?


    It depends. If you're not inserting or erasing on the map, then
    you don't need a lock on it; only on the individual elements.
    If you are (and if you're using it as a cache, you probably
    are), then you need two locks for each transaction, one on the
    element, and one on the table. Basically, you grab the lock on
    the table, get the element, grab the lock associated with it,
    and release the lock on the table. To insert, a lock on the
    table is sufficient, but to erase, you need a lock on both the
    table and the object, then erase the object from the map,
    release the two locks, and delete.

    Note that as described above, the second thread trying to
    acquire the lock on an already locked object will hold the table
    locked. You might want to consider using pthread_mutex_trylock,
    or its equivalent under your system, and backing off, releasing
    the lock on the table, and trying again later, when you can't
    get the lock. More complicated (and fairer) schemes are also
    possible.

    If I were doing this, I'd probably wrap the table, so that the
    user didn't have to worry about this, and have the getter return
    a reference counted smart pointer which releases the lock on the
    object when the last pointer (of the set of reference counted
    smart pointers) disappears. Boost::shared_ptr works well for
    this, provided you provide an appropriate destructor object when
    you create the pointer.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 4, 2007
    #3
  4. grbgooglefan

    grbgooglefan Guest

    Re: how to achieve thread safety at element level in STL datastructures like hash map, vectors

    On Dec 5, 12:35 am, James Kanze <> wrote:
    > On Dec 4, 3:22 am, grbgooglefan <> wrote:
    >
    > > Our application uses the caching heavily to store the data from
    > > databases & also the runtime orders information.
    > > All these caches are built on STL hash map, vectors & maps and data
    > > format is the structures.
    > > There are multiple threads accessing these caches simultaneously for
    > > reading the data as well as updating the data.
    > > Whenever any thread accesses the cache, it locks that cache & finds
    > > the required element. Does the actions required & unlocks the cache.
    > > I am finding that this is causing the other threads to wait for longer
    > > time because the locking is at cache level.
    > > I would like to make this locking quite finer & granular, in such a
    > > way that only the single structure which is to be updated is locked.
    > > This is somewhat same as the row level locking in databases.
    > > How do we achieve this level of granular locking?

    >
    > It depends. If you're not inserting or erasing on the map, then
    > you don't need a lock on it; only on the individual elements.
    > If you are (and if you're using it as a cache, you probably
    > are), then you need two locks for each transaction, one on the
    > element, and one on the table. Basically, you grab the lock on
    > the table, get the element, grab the lock associated with it,
    > and release the lock on the table. To insert, a lock on the
    > table is sufficient, but to erase, you need a lock on both the
    > table and the object, then erase the object from the map,
    > release the two locks, and delete.
    >
    > Note that as described above, the second thread trying to
    > acquire the lock on an already locked object will hold the table
    > locked. You might want to consider using pthread_mutex_trylock,
    > or its equivalent under your system, and backing off, releasing
    > the lock on the table, and trying again later, when you can't
    > get the lock. More complicated (and fairer) schemes are also
    > possible.
    >
    > If I were doing this, I'd probably wrap the table, so that the
    > user didn't have to worry about this, and have the getter return
    > a reference counted smart pointer which releases the lock on the
    > object when the last pointer (of the set of reference counted
    > smart pointers) disappears. Boost::shared_ptr works well for
    > this, provided you provide an appropriate destructor object when
    > you create the pointer.
    >
    > --
    > James Kanze (GABI Software) email:
    > Conseils en informatique orientée objet/
    > Beratung in objektorientierter Datenverarbeitung
    > 9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


    Great inputs & have helped me start thinking in bigger terms.
    Yes, I am using the hash map, etc from default available STL library
    on Linux (libstdc++). I am using them as memory cache for server
    application.
    Most of these caches are built using the string as key and the
    structure as data.
    Flow of functions happens somewhat as below in the application:
    Thread 1 gets data from database using select query. Populates that
    data in the structures & pushes that structure on the cache.
    Thread 2 then picks up that structure & uses it for getting other live
    real time data from other service. Once that real time data is
    available, this Thread 2 keeps on updating that data in the structures
    in the cache.
    There Thread 3 which reads the data from this cache & uses for
    processing the requests from the client application.

    So all types of data structure operations - insertions, erasing,
    updating, reading are happening on this cache with the strctures.
    The cache holds more than 60000 data elements (structures) & searching
    for each required structure takes long time. So when one thread is
    trying to get the required structure, others have to wait & cannot do
    any productive work. Because they also need data from the same cache
    (even though for different type of actions).
     
    grbgooglefan, Dec 5, 2007
    #4
  5. grbgooglefan

    Joe Seigh Guest

    Re: how to achieve thread safety at element level in STL data structureslike hash map, vectors

    grbgooglefan wrote:
    [...]
    > Great inputs & have helped me start thinking in bigger terms.
    > Yes, I am using the hash map, etc from default available STL library
    > on Linux (libstdc++). I am using them as memory cache for server
    > application.
    > Most of these caches are built using the string as key and the
    > structure as data.
    > Flow of functions happens somewhat as below in the application:
    > Thread 1 gets data from database using select query. Populates that
    > data in the structures & pushes that structure on the cache.
    > Thread 2 then picks up that structure & uses it for getting other live
    > real time data from other service. Once that real time data is
    > available, this Thread 2 keeps on updating that data in the structures
    > in the cache.
    > There Thread 3 which reads the data from this cache & uses for
    > processing the requests from the client application.
    >
    > So all types of data structure operations - insertions, erasing,
    > updating, reading are happening on this cache with the strctures.
    > The cache holds more than 60000 data elements (structures) & searching
    > for each required structure takes long time. So when one thread is
    > trying to get the required structure, others have to wait & cannot do
    > any productive work. Because they also need data from the same cache
    > (even though for different type of actions).


    The STL isn't really thread-safe. If a conventional rwlock doesn't
    solve your problem and scale well, you'll probably want to ask this
    question in comp.programming.threads.


    --
    Joe Seigh

    When you get lemons, you make lemonade.
    When you get hardware, you make software.
     
    Joe Seigh, Dec 5, 2007
    #5
  6. grbgooglefan

    yurec Guest

    Re: how to achieve thread safety at element level in STL datastructures like hash map, vectors

    On Dec 4, 4:22 am, grbgooglefan <> wrote:
    > Our application uses the caching heavily to store the data from
    > databases & also the runtime orders information.
    > All these caches are built on STL hash map, vectors & maps and data
    > format is the structures.
    > There are multiple threads accessing these caches simultaneously for
    > reading the data as well as updating the data.
    > Whenever any thread accesses the cache, it locks that cache & finds
    > the required element. Does the actions required & unlocks the cache.
    > I am finding that this is causing the other threads to wait for longer
    > time because the locking is at cache level.
    > I would like to make this locking quite finer & granular, in such a
    > way that only the single structure which is to be updated is locked.
    > This is somewhat same as the row level locking in databases.
    > How do we achieve this level of granular locking?
    > Thoughts please.


    Are you sure, that such thread safe cache which you try to implement
    by yourself will be more efficient, than using database (if it's on
    the same hdd as application)?As I know f.e. Oracle has very efficient
    row locks.Anyway I'm very interesting in such problem(i thought about
    such task a year ago, but the only thing, which came to my mind was to
    have some read-write maps in a bigger map.I found that it was bad idea
    and decided not to use such cache).If you get efficiency growth,
    please tell me (us) it in percents.
    Thanks in advance.
    Regards.
     
    yurec, Dec 5, 2007
    #6
  7. grbgooglefan

    James Kanze Guest

    Re: how to achieve thread safety at element level in STL datastructures like hash map, vectors

    On Dec 5, 3:12 am, grbgooglefan <> wrote:
    > On Dec 5, 12:35 am, James Kanze <> wrote:


    > Yes, I am using the hash map, etc from default available STL library
    > on Linux (libstdc++). I am using them as memory cache for server
    > application.
    > Most of these caches are built using the string as key and the
    > structure as data.
    > Flow of functions happens somewhat as below in the application:
    > Thread 1 gets data from database using select query. Populates that
    > data in the structures & pushes that structure on the cache.
    > Thread 2 then picks up that structure & uses it for getting other live
    > real time data from other service. Once that real time data is
    > available, this Thread 2 keeps on updating that data in the structures
    > in the cache.
    > There Thread 3 which reads the data from this cache & uses for
    > processing the requests from the client application.


    > So all types of data structure operations - insertions,
    > erasing, updating, reading are happening on this cache with
    > the strctures. The cache holds more than 60000 data elements
    > (structures) & searching for each required structure takes
    > long time. So when one thread is trying to get the required
    > structure, others have to wait & cannot do any productive
    > work. Because they also need data from the same cache (even
    > though for different type of actions).


    Something's wrong here. If searching for each required
    structure is what is taking the time, moving the locks down to
    the individual elements won't help unless you can avoid the lock
    on the map itself (which must be held during the search). And
    you can only do this if there are no insertions or erases in the
    map, ever. Look up in a hash table should be almost
    instantaneaous, however, if the hash function is effective. If
    the searching is what is taking time, you might try analysing
    your hash function, for the set of keys you are using. (The one
    used by default for strings with g++ has some known weaknesses,
    but it should be quite adequate for all but a few special
    cases.)

    Other than that, remember that if you don't hold the lock while
    you're waiting for other events (I/O, etc.), then using a lock
    per element, rather than a single lock for the entire structure,
    won't help, at least on a mono-processor system. Rather than
    using a lock per element, you might consider trying to
    reorganize the code so that you don't do this. Thread 2, for
    example, probably doesn't need to hold the lock (or maintain
    access) on the element while it's reading the updates. Only
    once it actually has the data for the update should it acquire
    the lock, get the element, modify it with the new data, and
    release the lock. Similarly, thread 1 should construct the
    entire new element before acquiring the lock to insert it. And
    thread 3 would acquire the lock, copy the data out, and release
    it, before using the data.

    Remember too that using a lock per element, rather than only a
    lock on the map, introduces a definite possibility of deadlock.
    If each thread only handles one element at a time, being careful
    to release it before acquiring the lock on another element,
    there should be no problem. Otherwise, deadlocks are a distinct
    possibility, and you have to take active steps to avoid them.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 5, 2007
    #7
  8. grbgooglefan

    James Kanze Guest

    Re: how to achieve thread safety at element level in STL datastructures like hash map, vectors

    On Dec 5, 8:13 am, yurec <> wrote:
    > On Dec 4, 4:22 am, grbgooglefan <> wrote:


    > > Our application uses the caching heavily to store the data from
    > > databases & also the runtime orders information.
    > > All these caches are built on STL hash map, vectors & maps and data
    > > format is the structures.
    > > There are multiple threads accessing these caches simultaneously for
    > > reading the data as well as updating the data.
    > > Whenever any thread accesses the cache, it locks that cache & finds
    > > the required element. Does the actions required & unlocks the cache.
    > > I am finding that this is causing the other threads to wait for longer
    > > time because the locking is at cache level.
    > > I would like to make this locking quite finer & granular, in such a
    > > way that only the single structure which is to be updated is locked.
    > > This is somewhat same as the row level locking in databases.
    > > How do we achieve this level of granular locking?
    > > Thoughts please.


    > Are you sure, that such thread safe cache which you try to implement
    > by yourself will be more efficient, than using database (if it's on
    > the same hdd as application)?As I know f.e. Oracle has very efficient
    > row locks.Anyway I'm very interesting in such problem(i thought about
    > such task a year ago, but the only thing, which came to my mind was to
    > have some read-write maps in a bigger map.I found that it was bad idea
    > and decided not to use such cache).If you get efficiency growth,
    > please tell me (us) it in percents.


    It depends on whether the updates have to be persistent. From
    his description, I gather that this is not the case. Updating a
    record with Oracle always involves at least one disk write (and
    I don't think that Oracle has an option to turn off
    synchronization for this); updating a record in a hash table
    doesn't involve any. In the simplest case (i.e. incrementing a
    counter), this could mean a difference of several orders of
    magnitude: something like a microsecond or two (including the
    mutex lock) in memory, 10 milliseconds with the synchronized
    disk read.

    --
    James Kanze (GABI Software) email:
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
     
    James Kanze, Dec 5, 2007
    #8
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Philip V Pham

    STL (std) thread safety issues

    Philip V Pham, Jun 12, 2004, in forum: C++
    Replies:
    3
    Views:
    11,494
    Dietmar Kuehl
    Jun 14, 2004
  2. kl
    Replies:
    7
    Views:
    1,307
    James Kanze
    Jan 1, 2008
  3. Replies:
    6
    Views:
    1,063
    James Kanze
    Apr 5, 2008
  4. navS
    Replies:
    3
    Views:
    527
    Ismo Salonen
    May 9, 2008
  5. rp
    Replies:
    1
    Views:
    556
    red floyd
    Nov 10, 2011
Loading...

Share This Page