Thread-safe reference counts.

C

Chris Thomasson

David Schwartz said:
I agree. I would put it simply -- you cannot call 'AddRef' unless you
have an explicit or implicit reference to an object. The 'AddRef'
function is not special, it must be called with a reference just like
every other function.

The puzzle is this -- how did you get a pointer to object to call
'AddRef' on anyway?

This paper explains the puzzle, and a soultion:

http://citeseer.ist.psu.edu/cache/p...zSzpaperszSz01-podc.pdf/detlefs01lockfree.pdf



I've heard a lot of talk about strong thread safety and the like, but
I have to admit, I don't get it. In order to call 'AddRef' on an
object, you need a pointer to it, and how could you possibly have
gotten that pointer without something that already had a reference?

Think of the following scenario:
__________________________________________________
static atomic_ptr<foo> g_foo;

void writers() {
for(;;) {
local_ptr<foo> l_foo(new foo);
g_foo = l_foo;
}
}


void readers() {
for(;;) {
local_ptr<foo> l_foo(g_foo);
if (l_foo) {
l_foo->do_something();
}
}
}
__________________________________________________




Notice how the reader thread can grab a reference from 'g_foo' without
having a previous reference to an object contained within it? You can
download the sourcecode of atomic_ptr from:

http://sourceforge.net/project/showfiles.php?group_id=127837
(atomic-ptr-plus package)


Also, you can take a look at the source code for my proxy garbage collector:

http://appcore.home.comcast.net/misc/pc_sample_h_v1.html

The function that allows any thread to grab a reference to the current
region is 'pc_acquire()':
__________________________________________________
pc_region*
pc_acquire(
pc_master* const _this
) {
pc_sys_anchor cmp = _this->head, xchg;
do {
xchg.refcnt = cmp.refcnt + 2;
xchg.region = cmp.region;
} while (! DWCASPTR(&_this->head, &cmp, &xchg));
return cmp.region;
}
__________________________________________________




Notice how it updates the counter and grabs a pointer to the current region
in a single atomic operation (e.g., DWCAS-loop)? This is another solution to
your puzzle.



The existence of a pointer should mean the existence of a reference --
otherwise how can you know that pointer remains valid, whether a call
for AddRef or for any other purpose?

You need to load a pointer and increment the reference count in a single
atomic operation.
 
C

Chris Thomasson

David Schwartz said:
I agree. I would put it simply -- you cannot call 'AddRef' unless you
have an explicit or implicit reference to an object. The 'AddRef'
function is not special, it must be called with a reference just like
every other function.

The puzzle is this -- how did you get a pointer to object to call
'AddRef' on anyway?

I've heard a lot of talk about strong thread safety and the like, but
I have to admit, I don't get it. In order to call 'AddRef' on an
object, you need a pointer to it, and how could you possibly have
gotten that pointer without something that already had a reference?

The existence of a pointer should mean the existence of a reference --
otherwise how can you know that pointer remains valid, whether a call
for AddRef or for any other purpose?

Here is some code you can look at which implements strongly thread-safe
reference counting:


http://groups.google.com/group/comp.programming.threads/browse_frm/thread/d30ad3154c08d9dd


The 'foo_obj_acquire/release()' functions is where all the magic take place:
______________________________________________________________________
foo_obj* foo_obj_acquire(foo_obj** psrc) {
foo_obj* _this;
pc_region* const pcr = pc_acquire(&g_pcm);
if (_this = LOADPTR(psrc)) {
atomicword cmp = _this->refcnt;
do {
if (! cmp) {
_this = NULL;
break;
}
} while(! CASWORD(&_this->refcnt, &cmp, cmp + 1));
}
pc_release(pcr);
return _this;
}


void foo_obj_release(foo_obj* _this) {
if (XADDWORD(&_this->refcnt, -1) == 1) {
pc_mutate(&g_pcm, &_this->pcn);
}
}
______________________________________________________________________




Any thread can call 'foo_obj_acquire()' to grab a reference from a pointer
to a 'foo_obj*'. The following setup is perfectly legal:
_____________________________________________________________________
static foo_obj* g_foo = NULL;


void reader_threads(void) {
for (;;) {
foo_obj* const _this = foo_obj_acquire(&g_foo);
if (_this) {
foo_obj_release(_this);
}
}
}


void writer_threads(void) {
for (;;) {
foo_obj* const _this = foo_obj_ctor();
if (_this) {
foo_obj* prev = foo_obj_swap(&g_foo, _this);
if (prev) {
foo_obj_release(prev);
}
}
}
}
_____________________________________________________________________



reader threads can acquire reference's to 'foo_obj' objects on the fly,
without having to own a previous reference to them.
 
C

coal

On Mar 27, 9:42 pm, (e-mail address removed) wrote:

Use an std::vector (or boost::array) instead of std::deque or
std::list for applications where serialization performance is
critical.

If it is vector of non POD I think the test results would be
simlar to the list<> results. I can test that if need be.
Both Boost (1.35 if I'm not mistaken) and Ebenezer (since
2006) have optimizations wrt vectors of built-in types like
vector<int>. I disagree with advice that says avoid deques
and lists if they would be used in a serialization context
where performance is important. Serialization is just one
factor that should be considered in choosing the best container for a
given context.

I'm glad you mentioned boost::array. It had slipped off my
radar and I want to investigate the possibility of supporting
it.

Brian
 
C

Chris Thomasson

Chris Thomasson said:
AFAICT, it seems like they are seriously considering making shared_ptr
strongly thread-safe. Well, IMVHO, that would greatly enhance its
functionality. Nice!

Thanks for pointing this out.

I seems like they are going for something that is analogous to Java
references. Finally, strong thread-safety is being taken seriously.
 
J

James Kanze

I think some Boost libraries are better than others though.

Certainly. And none of them (no more than any other code
written by human beings) are perfect. In general, though, the
more intensely used the library, the better the chances are that
it is pretty good, or even better. The Boost smart pointers are
amongst the most widely used Boost library, and should certainly
be your first choice if you need smart pointers. (I think that
smart pointers are often overused, but that's a different
issue.) After that, as with any library, you may find that they
don't exactly meet your needs, and you need a custom component.
But writing one yourself until you're sure that it's necessary
is just stupid. (Especially if you're looking for something
very difficult to get right, like a reference counted pointer
which works in a multi-threaded environment.)
 
C

coal

Certainly.  And none of them (no more than any other code
written by human beings) are perfect.  In general, though, the
more intensely used the library, the better the chances are that
it is pretty good, or even better.  The Boost smart pointers are
amongst the most widely used Boost library, and should certainly
be your first choice if you need smart pointers.  

OK, but auto_ptr should be mentioned and considered before
the Boost options.
(I think that
smart pointers are often overused, but that's a different
issue.)

I agree that they're used too much.


Brian
 
J

Jon Harrop

I have some code where objects are dynamically allocated by some
thread, and then used by multiple threads and freed when they are no
longer needed. I think that reference counting is the most appropriate
way to handle this in my situation.

A concurrent garbage collector is the appropriate way to handle that
situation.
 
C

Chris Thomasson

Jon Harrop said:
A concurrent garbage collector is the appropriate way to handle that
situation.

Why? There are many different ways to handle lifetime management.
 
C

Chris Thomasson

Jon Harrop said:
They are an order of magnitude slower and leak on cycles.

What type of counting algorithm are you talking about? I know a certain type
of distributed reference counting can incur virtually zero-overheads. There
is also deferred counting, weighted counting, ect...
 
J

James Kanze

One possible way, reference counted objects work just as well.

That's wrong on two counts: first, reference counted objects
fail when cycles are present (which is almost always the case in
my code), and of course reference counting is more expensive in
terms of CPU time, especially in a multithreaded environment.
The latter is, of course, rarely a real problem. The first,
however, pretty much means that there are cases where reference
counting can't be used.

If the problem is freeing memory, especially in a multithreaded
environment, using the Boehm collector with C++ is definitly the
simplest solution. In his case, it sounds like
boost::shared_ptr would also be an effective solution. The
additional runtime will probably not be noticeable, and from his
very general description, it doesn't should like there would be
cycles, or if there were, they should be easy to break. (If
he's currently using neither, it's definitely easier to adopt
boost::shared_ptr---the Boehm collector requires some non
trivial configuration. On the other hand, I find it worthwhile
to invest the effort, since the results are generally useful.)
 
J

James Kanze

Nonsense, show us the data.

Well, they obviously leak if cycles are present. And while "an
order of magnitude slower" is obviously false as well, Hans
Boehm has performed a number of benchmarks, and they are slower
in a lot of cases.

A lot depends on how you use dynamic memory. The runtime of a
typical collector depends on the amount of memory actually in
use when the garbage collector runs; the runtime of a typical
manual memory allocator depends on the number of allocations and
frees. An application which uses a lot of small, short lived
objects, and disposes of adequate memory, will run significantly
faster with garbage collection. An application which only
allocates a few, very big object, will run faster with manual
collection or shared pointers.
 
J

James Kanze

Why? There are many different ways to handle lifetime management.

Just to make it clear: the original posting talked about memory
management, not object lifetime management. From the way it was
presented, it more or less sounded like object lifetime had no
significance (often the case). In which case, garbage
collection is fine. But garbage collection has nothing to do
with object lifetime (which, when significant, still must be
managed, garbage collection or not).

The original poster presented a very concrete problem, for which
garbage collection is probably the best solution, but for which
boost::shared_ptr would probably work almost as well. If he's
already using one, but not the other, then the solution is
obvious. If he's using both, I'd go with garbage collection:
less complexity and less actual coding. If he's using neither,
it depends. Boost::shared_ptr is definitly easier to introduce
into a project, but garbage collection is likely to have more
long term benefits.
 
J

Jon Harrop

Ian said:
Nonsense, show us the data.

Read Jones and Lins "Garbage Collection" and references therein for a start.

Reference counting is very slow because updating reference counts requires
cache incoherent memory access, which is very slow on modern hardware and
is getting relatively slower every year as CPU speed increases faster than
memory speed. The slowdown is already enormous.
 
A

Alexander Terekhov

Jon said:
Read Jones and Lins "Garbage Collection" and references therein for a start.

Reference counting is very slow because updating reference counts requires
cache incoherent memory access, which is very slow on modern hardware and

"cache incoherent"? :)
is getting relatively slower every year as CPU speed increases faster than
memory speed. The slowdown is already enormous.

You seem to presume heavy contention while copying and
modifying/destroying shared pointers (maintaining reference counts).

If that is the case, then running transaction in a separate address
space without any managed memory reclamation is surely best. No garbage
collection is needed. ;-)

regards,
alexander.
 
J

Jon Harrop

Alexander said:
James Kanze wrote:
[...]
That's wrong on two counts: first, reference counted objects
fail when cycles are present (which is almost always the case in
my code), ...

See

http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2007/n2297.html#cycles

Yes, you can work around it by augmenting reference counting with other
techniques or even replace it wholesale. Persue that idea and a ream of
other improvements for three decades and you end up with a modern GC.

I'm amazed anyone would bother putting a really poor GC in C++ rather than
using a more appropriate language but, IMHO, C++0x is the next Fortran 2003
anyway...
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,767
Messages
2,569,570
Members
45,045
Latest member
DRCM

Latest Threads

Top