Re: multithreaded cache?

Discussion in 'Java' started by Kevin McMurtrie, May 25, 2012.

  1. In article <>,
    bugbear <bugbear@trim_papermule.co.uk_trim> wrote:

    > I'm using various Apache Commons Maps as a
    > multithreaded cache, protected using
    > ReentrantReadWriteLock, so that getting() uses a read lock,
    > and putting() uses a write lock.
    >
    > But I've got an issue; in the
    > case of a cache miss (protected by a read lock),
    > the required value is acquired using the "underlying function"
    > that the cache is over; this value is then put() into
    > the cache (protected by a write lock).
    >
    > This is all perfectly thread safe, and gives
    > correct results.
    >
    > However, if the underlying function is slow
    > and/or resource-hungry (consider caching
    > a ray-traced image!) many threads can
    > end up calling the real function (the second
    > and subsequent threads get a miss
    > during the first thread's call to the underlying function).
    >
    > "obviously..." what I want is for only
    > the thread that FIRST has a cache miss
    > calls the underlying function, whilst other
    > threads (for the same key) wait.
    >
    > This seems "obvious" enough that I'm guessing
    > there's a standard solution.
    >
    > Googling led me to several "heavy" libraries.
    >
    > This appears to be more a locking/caching issue
    > than a Java issue, although my implementation
    > is in Java.
    >
    > Can anyone (please) point me at a
    > canonical solution/implementation?
    >
    > BugBear
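
    For concreteness, the pattern described above looks roughly like this
    (a minimal sketch, not bugbear's actual code; the Loader interface is a
    stand-in for his "underlying function"). The gap between releasing the
    read lock and acquiring the write lock is exactly where several threads
    can all miss and all run the expensive load for the same key:

        import java.util.HashMap;
        import java.util.Map;
        import java.util.concurrent.locks.ReentrantReadWriteLock;

        public class RwLockCache<K, V> {
            public interface Loader<K, V> { V load(K key); }

            private final Map<K, V> map = new HashMap<K, V>();
            private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
            private final Loader<K, V> underlying;

            public RwLockCache(Loader<K, V> underlying) {
                this.underlying = underlying;
            }

            public V get(K key) {
                lock.readLock().lock();  // getting() takes the read lock
                try {
                    V v = map.get(key);
                    if (v != null) {
                        return v;
                    }
                } finally {
                    lock.readLock().unlock();
                }
                // Miss: no lock is held here, so every thread that missed
                // pays for the expensive call, even for the same key.
                V v = underlying.load(key);
                lock.writeLock().lock();  // putting() takes the write lock
                try {
                    map.put(key, v);
                } finally {
                    lock.writeLock().unlock();
                }
                return v;
            }
        }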


    Ha, this is an interview question that I use.

    What you need is row level locking for the cache load.

    Step 1)
    Use synchronized operations to map your key to a value; creating an
    uninitialized value in the map if needed. Use whatever tech you want.
    A synchronized block on a HashMap is simplest and performs the fastest
    on 1 or 2 core systems. A ConcurrentHashMap sometimes performs better
    with 4+ core systems.

    Step 2)
    Synchronize on the value. Initialize it if needed.


    Step 1 blocks all cache access for only a very short moment to make
    sure that a key always has a value. Step 2 blocks access independently
    for each cache value to make sure that it is loaded. It will perform
    well for continuous use by several CPU cores. Google has some high
    concurrency Maps that aren't too bad either.
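
    In code, the two steps might look like this (a sketch of the scheme
    just described; Holder and Loader are illustrative names, not from any
    library):

        import java.util.HashMap;
        import java.util.Map;

        public class TwoStepCache<K, V> {
            public interface Loader<K, V> { V load(K key); }

            // Created empty in step 1, filled under its own lock in step 2.
            private static final class Holder<V> {
                V value;         // guarded by synchronized (this)
                boolean loaded;  // guarded by synchronized (this)
            }

            private final Map<K, Holder<V>> map = new HashMap<K, Holder<V>>();

            public V get(K key, Loader<K, V> loader) {
                Holder<V> h;
                // Step 1: brief global lock, only to map the key to a holder.
                synchronized (map) {
                    h = map.get(key);
                    if (h == null) {
                        h = new Holder<V>();
                        map.put(key, h);
                    }
                }
                // Step 2: per-value lock. The first thread loads; later
                // threads for the same key wait here until it is done.
                synchronized (h) {
                    if (!h.loaded) {
                        h.value = loader.load(key);
                        h.loaded = true;
                    }
                    return h.value;
                }
            }
        }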

    In the 16 core range you'll find that any kind of exclusive lock causes
    stalls where threads suspend while holding locks, causing a backlog that
    reinforces itself. A concurrency expert can fix that using complex
    Compare-And-Swap designs. In the 32+ core range you'll sometimes find
    that harmless race conditions are better than multiple threads
    frequently writing to the same memory page.
    --
    I will not see posts from Google because I must filter them as spam
    Kevin McMurtrie, May 25, 2012
    #1

  2. On 25.05.2012 08:22, Kevin McMurtrie wrote:
    > Ha, this is an interview question that I use.
    >
    > What you need is row level locking for the cache load.


    But not all the time. See https://gist.github.com/2717818

    > Step 1)
    > Use synchronized operations to map your key to a value; creating an
    > uninitialized value in the map if needed. Use whatever tech you want.
    > A synchronized block on a HashMap is simplest and performs the fastest
    > on 1 or 2 core systems. A ConcurrentHashMap sometimes performs better
    > with 4+ core systems.


    In my experience a CHM performs better even on a 1 or 2 core system.

    > Step 2)
    > Synchronize on the value. Initialize it if needed.
    >
    >
    > Step 1 blocks all cache access for only a very short moment to make
    > sure that a key always has a value. Step 2 blocks access independently
    > for each cache value to make sure that it is loaded. It will perform
    > well for continuous use by several CPU cores. Google has some high
    > concurrency Maps that aren't too bad either.


    Actually once the value is in the cache you do not need any step 2
    synchronization any more.

    > In the 16 core range you'll find that any kind of exclusive lock causes
    > stalls where threads suspend while holding locks, causing a backlog that
    > reinforces itself. A concurrency expert can fix that using complex
    > Compare-And-Swap designs.


    Basically this is what CHM does with putIfAbsent() internally.
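
    As I read the gist, the idea is roughly the following (my own sketch,
    not the gist's exact code): a cheap lazy proxy may be created twice for
    a key, but putIfAbsent() picks one winner, and only the winner's proxy
    ever computes the expensive value.

        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.ConcurrentMap;

        public class ChmCache<K, V> {
            public interface Loader<K, V> { V load(K key); }

            // Cheap placeholder; the expensive computation is deferred and
            // synchronized per instance, so it happens at most once.
            private static final class Lazy<K, V> {
                private final K key;
                private final Loader<K, V> loader;
                private V value;
                private boolean loaded;

                Lazy(K key, Loader<K, V> loader) {
                    this.key = key;
                    this.loader = loader;
                }

                synchronized V get() {
                    if (!loaded) {
                        value = loader.load(key);
                        loaded = true;
                    }
                    return value;
                }
            }

            private final ConcurrentMap<K, Lazy<K, V>> map =
                    new ConcurrentHashMap<K, Lazy<K, V>>();

            public V get(K key, Loader<K, V> loader) {
                Lazy<K, V> lazy = map.get(key);
                if (lazy == null) {
                    Lazy<K, V> fresh = new Lazy<K, V>(key, loader);  // cheap to discard
                    Lazy<K, V> prior = map.putIfAbsent(key, fresh);
                    lazy = (prior != null) ? prior : fresh;
                }
                return lazy.get();  // loads once; other threads block per key
            }
        }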

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, May 25, 2012
    #2

  3. In article <>,
    Robert Klemme <> wrote:

    > On 25.05.2012 08:22, Kevin McMurtrie wrote:
    > > Ha, this is an interview question that I use.
    > >
    > > What you need is row level locking for the cache load.

    >
    > But not all the time. See https://gist.github.com/2717818


    The original post stated that duplicate element creation was too
    expensive. That code may create duplicate elements for a key and
    discard the extras.


    >
    > > Step 1)
    > > Use synchronized operations to map your key to a value; creating an
    > > uninitialized value in the map if needed. Use whatever tech you want.
    > > A synchronized block on a HashMap is simplest and performs the fastest
    > > on 1 or 2 core systems. A ConcurrentHashMap sometimes performs better
    > > with 4+ core systems.

    >
    > In my experience a CHM performs better even on a 1 or 2 core system.


    This depends on usage patterns. The Java 1.5 Concurrency locks used in
    some parts of CHM are very slow compared to synchronization. The
    Concurrency locks only have a chance of performing better when there are
    a lot of concurrent shared read locks.


    > > Step 2)
    > > Synchronize on the value. Initialize it if needed.
    > >
    > >
    > > Step 1 blocks all cache access for only a very short moment to make
    > > sure that a key always has a value. Step 2 blocks access independently
    > > for each cache value to make sure that it is loaded. It will perform
    > > well for continuous use by several CPU cores. Google has some high
    > > concurrency Maps that aren't too bad either.

    >
    > Actually once the value is in the cache you do not need any step 2
    > synchronization any more.


    Step 2 is initializing the value. It's critical to synchronize there or
    the second thread to request the same key may see an element that is not
    yet fully initialized. (Again, the original post stated that elements
    were expensive to create.)

    >
    > > In the 16 core range you'll find that any kind of exclusive lock causes
    > > stalls where threads suspend while holding locks, causing a backlog that
    > > reinforces itself. A concurrency expert can fix that using complex
    > > Compare-And-Swap designs.

    >
    > Basically this is what CHM does with putIfAbsent() internally.


    Yes, but you don't have access to its internal container class, so CHM is
    not much use for caching elements that are very expensive to create.
    The Google version does provide a way to initialize the container with
    a factory callback, combining steps 1 and 2 here.
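
    Assuming the "Google" version means Guava (an assumption; the library
    isn't named here), the factory-callback style looks like this with
    Guava's CacheBuilder/CacheLoader, where renderExpensiveImage() is a
    hypothetical loader:

        import java.util.concurrent.ExecutionException;

        import com.google.common.cache.CacheBuilder;
        import com.google.common.cache.CacheLoader;
        import com.google.common.cache.LoadingCache;

        public class GuavaCacheExample {
            private final LoadingCache<String, byte[]> cache =
                    CacheBuilder.newBuilder().build(new CacheLoader<String, byte[]>() {
                        @Override
                        public byte[] load(String key) {
                            return renderExpensiveImage(key);  // runs once per key
                        }
                    });

            public byte[] get(String key) throws ExecutionException {
                // Concurrent callers for the same key block until load() returns.
                return cache.get(key);
            }

            private static byte[] renderExpensiveImage(String key) {
                return new byte[0];  // stand-in for the real work
            }
        }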

    >
    > Kind regards
    >
    > robert

    --
    I will not see posts from Google because I must filter them as spam
    Kevin McMurtrie, Jun 2, 2012
    #3
  4. On 02.06.2012 21:27, Kevin McMurtrie wrote:
    > In article<>,
    > Robert Klemme<> wrote:
    >
    >> On 25.05.2012 08:22, Kevin McMurtrie wrote:
    >>> Ha, this is an interview question that I use.
    >>>
    >>> What you need is row level locking for the cache load.

    >>
    >> But not all the time. See https://gist.github.com/2717818

    >
    > The original post stated that duplicate element creation was too
    > expensive. That code may create duplicate elements for a key and
    > discard the extras.


    No, it does not create duplicate elements and hence does not discard
    them. The only thing which might be duplicated is the proxy instance
    which calls the factory method (created in line 98 of the gist). But
    that is cheap. The design ensures that a value is calculated only once
    per key, so no CPU is wasted and no duplicate value objects are
    created. Please see Lew's excellent explanation of what happens, or
    look at the code again.

    >>> Step 1)
    >>> Use synchronized operations to map your key to a value; creating an
    >>> uninitialized value in the map if needed. Use whatever tech you want.
    >>> A synchronized block on a HashMap is simplest and performs the fastest
    >>> on 1 or 2 core systems. A ConcurrentHashMap sometimes performs better
    >>> with 4+ core systems.

    >>
    >> In my experience a CHM performs better even on a 1 or 2 core system.

    >
    > This depends on usage patterns. The Java 1.5 Concurrency locks used in
    > some parts of CHM are very slow compared to synchronization. The
    > Concurrency locks only have a chance of performing better when there are
    > a lot of concurrent shared read locks.


    Even if they are slower than synchronization, they perform better than
    a single lock on the whole Map (what you suggested above) with multiple
    threads, even on a machine with few cores, simply because there are
    more of them and the Map is partitioned. I haven't strictly measured
    the synchronization mechanisms used inside CHM, but I also did not
    notice bad timing (and frankly, I cannot believe Doug Lea would have
    used inefficient mechanisms). Bottom line: when in doubt, measure.
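
    A deliberately naive harness along those lines (a real measurement
    would use a tool like JMH to deal with warm-up and dead code
    elimination; this only illustrates the "measure" advice):

        import java.util.Collections;
        import java.util.HashMap;
        import java.util.Map;
        import java.util.concurrent.ConcurrentHashMap;
        import java.util.concurrent.CountDownLatch;

        public class MapReadContention {
            public static void main(String[] args) throws InterruptedException {
                Map<Integer, Integer> sync =
                        Collections.synchronizedMap(new HashMap<Integer, Integer>());
                Map<Integer, Integer> chm = new ConcurrentHashMap<Integer, Integer>();
                for (int i = 0; i < 10000; i++) {
                    sync.put(i, i);
                    chm.put(i, i);
                }
                System.out.println("synchronizedMap:   " + time(sync) + " ms");
                System.out.println("ConcurrentHashMap: " + time(chm) + " ms");
            }

            // Start one reader per core and report wall time for all reads.
            private static long time(final Map<Integer, Integer> map)
                    throws InterruptedException {
                int threads = Runtime.getRuntime().availableProcessors();
                final CountDownLatch start = new CountDownLatch(1);
                final CountDownLatch done = new CountDownLatch(threads);
                for (int t = 0; t < threads; t++) {
                    new Thread(new Runnable() {
                        public void run() {
                            try {
                                start.await();
                            } catch (InterruptedException e) {
                                return;
                            }
                            for (int i = 0; i < 1000000; i++) {
                                map.get(i % 10000);
                            }
                            done.countDown();
                        }
                    }).start();
                }
                long t0 = System.nanoTime();
                start.countDown();
                done.await();
                return (System.nanoTime() - t0) / 1000000;
            }
        }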

    >>> Step 2)
    >>> Synchronize on the value. Initialize it if needed.
    >>>
    >>>
    >>> Step 1 blocks all cache access for only a very short moment to make
    >>> sure that a key always has a value. Step 2 blocks access independently
    >>> for each cache value to make sure that it is loaded. It will perform
    >>> well for continuous use by several CPU cores. Google has some high
    >>> concurrency Maps that aren't too bad either.

    >>
    >> Actually once the value is in the cache you do not need any step 2
    >> synchronization any more.

    >
    > Step 2 is initializing the value. It's critical to synchronize there or
    > the second thread to request the same key may see an element that is not
    > yet fully initialized. (Again, the original post stated that elements
    > were expensive to create.)


    Maybe my wording was not clear enough. Once the value is in the Map,
    synchronization is no longer needed for this key and only the Map-level
    locking remains. Again, please look at the code.

    >>> In the 16 core range you'll find that any kind of exclusive lock causes
    >>> stalls where threads suspend while holding locks, causing a backlog that
    >>> reinforces itself. A concurrency expert can fix that using complex
    >>> Compare-And-Swap designs.

    >>
    >> Basically this is what CHM does with putIfAbsent() internally.

    >
    > Yes, but you don't have access to its internal container class so CHM is
    > not much use for caching elements that are very expensive to create.


    I beg to differ (see code).

    > The Google version does provide a way to initialize the container with
    > a factory callback, combining steps 1 and 2 here.


    Which version? Please share a reference. Thank you!

    Kind regards

    robert

    --
    remember.guy do |as, often| as.you_can - without end
    http://blog.rubybestpractices.com/
    Robert Klemme, Jun 2, 2012
    #4
