Thread-safe reference counts.

C

Chris Thomasson

Garbage collection detects when an object can no longer be
referred to. Obviously, in this case, it's lifetime has ended,
since there is no longer any way to end it otherwise.

Right. The object is now quiescent.



But
that's about it---garbage collection will not collect the memory
underneath the object that is still alive. Provided, of course,
that there are not errors in the code, and you've lost the last
pointer to the object before its lifetime has ended. But that
has nothing to do with object lifetime per se.

IMHO, the ability to determine when an object is quiescent is a major part
of lifetime management as a whole.



Most objects have indeterminate lifetime, and it doesn't matter.
If an object has determinate lifetime, however, you have to
manage it, garbage collection or no. Garbage collection does
NOT manage object lifetime; it only manages memory.

Agreed. However, GC is a great tool to use in a lifetime management scheme
for dynamic objects.
 
J

Jon Harrop

Chris said:
You point is that anything that uses any type of reference counting
whatsoever is slow, outdated, crappy... Either your a troll, or your a GC
religious freak.

You claimed that reference counting is "100% accurate" and always more
accurate than a GC. I knew that was wrong so I explained why and gave a
worked counter example proving my point.
 
C

Chris Thomasson

Jon Harrop said:
You claimed that reference counting is "100% accurate" and always more
accurate than a GC. I knew that was wrong so I explained why and gave a
worked counter example proving my point.

Some reference counting algorithms are 100% accurate in that an object is
IMMEDIATELY destroyed when its count drops to zero.. Other algorithms are
not because they use amortizations and deferments much like a traditional GC
does.


struct Bar {
Bar(int refs_) : refs(refs_) {}
void addref() { ++refs; };
void release() { --refs; if (! refs) { delete this; }}
int refs;
};


Here is your scoped example again:

{
0: Bar* bar(new Bar(1));
1: f(bar);
2: bar->release();
3: g();
}


the 'bar' object is immediately destroyed on 'line 2'. In order for a GC to
do this, well, it would have to perform a collection exactly on 'line 2'.
Traditional GC cannot ever be 100% accurate all the time because that would
imply that it runs collections non-stop. AFAICT, you don't understand the
implementation details of reference counting very well because most of your
claims about them are false.
 
J

Jon Harrop

Chris said:
Some reference counting algorithms are 100% accurate in that an object is
IMMEDIATELY destroyed when its count drops to zero.

A 100% accurate collector will collect a value as soon as it is unreachable.
Reference counting does not do that.
the 'bar' object is immediately destroyed on 'line 2'. In order for a GC
to do this, well, it would have to perform a collection exactly on 'line
2'. Traditional GC cannot ever be 100% accurate all the time because that
would imply that it runs collections non-stop.
Yes.

AFAICT, you don't understand the implementation details of reference
counting very well because most of your claims about them are false.

In the face of a proven counter example your ad-hominem attacks are wearing
thin. Please just swallow it, kiss your fallacy goodbye and move on.
 
C

Chris Thomasson

Jon Harrop said:
A 100% accurate collector will collect a value as soon as it is
unreachable.
Reference counting does not do that.

When the reference count is zero, it is unreachable and it is immediately
destroyed. A GC cannot do that unless it is constantly running collection
cycles.




What GC out there runs collections cycles 100% of the time?



In the face of a proven counter example your ad-hominem attacks are
wearing
thin. Please just swallow it, kiss your fallacy goodbye and move on.

You make false claims about reference counting, I point them out, and you
say I am attacking you? Too funny.

:^D
 
C

Chris Thomasson

Chris Thomasson said:
Jon Harrop said:
Alexander said:
Jon Harrop wrote:
[...]
I'm amazed anyone would bother putting a really poor GC in C++ rather
than

shared_ptr<> isn't really a GC. And if you view it from the GC angle, it
is indeed really poor.

What does it do apart from collect garbage?

shared_ptr<> destroys objects as soon as the reference count drops to one;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

shared_ptr<> destroys objects as soon as the reference count drops to ZERO!

ARGH!
 
A

Alexander Terekhov

Jon said:
Alexander said:
Jon Harrop wrote:
[...]
I'm amazed anyone would bother putting a really poor GC in C++ rather
than

shared_ptr<> isn't really a GC. And if you view it from the GC angle, it
is indeed really poor.

What does it do apart from collect garbage?

It executes deleters (dtors) ending objects lifetime deterministically
and lets allocator collect garbage (which may well be a GC for the later
part, BTW).

regards,
alexander.
 
C

Chris Thomasson

Chris Thomasson said:
What about an application that allocates a couple of bug buffers and
carves
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Umm... Well, crap! I should have said:


What about an application that allocates a couple of BIG buffers and carves

individual smaller objects out of them?




Lol! 'bug' buffers is funny! My typos/BUG-mistakes make me crack up
sometimes!


I am surprised that I did not get seriously flames to a point where I needed
to go to the hospital, and get hooked up to some sort of opiate drip.


:^)
 
J

James Kanze

Right. The object is now quiescent.

I've not heard this term in relationship to object lifetime
management before. Generally speaking, I find that there are
two categories of objects, those with (or which need) a
deterministic lifetime, and those which don't. By "needing" a
deterministic lifetime", I mean that there is significant
(observable, in the sense of the C++ standard) behavior
associated with the end of their lifetime. If you loose all
pointers to the object before that behavior occurs, then you
have a serious error in your program. If you try to use the
object after that behavior has occured, then you also have a
serious error. Conceptually, for such objects, lifetime is
independent of the number of pointers to the object, although in
practice, you need to keep at least one pointer in order to
invoke the termination, and you don't want to keep pointers too
long after the object has been terminated, since any pointer to
the "dead" object is a potential source of errors. For such
objects, garbage collection isn't really that important in a
correct program, because freeing the memory at the point of
termination is always "correct". Garbage collection does help
in assertions against misuse, since it ensures that the memory
will not be used for something else as long as anyone still
holds a pointer to it. What I've often wished for in such cases
is some sort of reverse garbage collection: getting rid of the
pointers to the object because the object should no longer be
referenced. In practice, however, there doesn't seem to be any
effective generic means of "getting rid of the pointers": if the
pointer is in a map, for example, you have to remove the map
entry.

For objects with indeterminate lifetimes, of course, garbage
collection makes for less implementation code, so saves you
programming time.
IMHO, the ability to determine when an object is quiescent is
a major part of lifetime management as a whole.

Object lifetime is part of the design phase. If the design
specifies a determinate lifetime, you have to program it.
Regardless of the language---in this case, Java and C++ are no
different. If the design determines that it doesn't, *but* you
still need dynamic allocation (a lot of objects which don't have
pure value semantics, and should be copied, and not dynamically
allocated), then garbage collection saves you programming work.
But it never frees you of the responsibility of considering
object lifetime in design---it only intervenes at a much lower
level.
Agreed. However, GC is a great tool to use in a lifetime
management scheme for dynamic objects.

Not for lifetime management. For memory management, once you've
determined that the lifetime is indeterminate (or after an
object with a determinate lifetime has been terminated).
 
D

Dmitriy V'jukov

This seems an awful lot of work to create a problem followed by an
awful lot of work to solve it. Just never allow a pointer to exist
without a reference. It seems illogical to allow this anyway, since
the pointer can't be guaranteed to point anywhere useful anyway.

If you get a pointer from somewhere, that somewhere must have the
pointer. Thus it must have a reference -- otherwise the pointer might
be invalid. Since the pointer is accompanied by a reference, there is
no risk that the pointer can become invalid until you get your
reference. All you have to do is sequence the operations that require
or involve the reference that accompanies that copy of the pointer.


Consider following example:

class application_settings;
application_settings* g_settings;

// replaces g_settings with new_settings,
// and releases previous value of g_settings
void update_settings(application_settings* new_settings);

// acquires and returns object stored in g_settings
application_settings* acquire_settings();

// releases settings object
void release_settings(application_settings* settings);

void reader_thread()
{
for(;;)
{
application_settings* settings = acquire_settings();
// use settings
release_settings(settings);
}
}

In this example acquire_settings() must act on pointer to object for
which it doesn't have reference yet.

How this pattern must be implemented in the right way (pointer ==
reference)?


Dmitriy V'jukov
 
D

David Schwartz

It does not have to use locks. Have you even looked at the source code to
Joe Seighs atomic pointer?

No, it does not have to use a lock, it can work any way it wants to.
I'm talking conceptually here.

DS
 
D

David Schwartz

Consider following example:

class application_settings;
application_settings* g_settings;

// replaces g_settings with new_settings,
// and releases previous value of g_settings
void update_settings(application_settings* new_settings);

// acquires and returns object stored in g_settings
application_settings* acquire_settings();

// releases settings object
void release_settings(application_settings* settings);

void reader_thread()
{
for(;;)
{
application_settings* settings = acquire_settings();
// use settings
release_settings(settings);
}

}

In this example acquire_settings() must act on pointer to object for
which it doesn't have reference yet.

Huh? Of course it does. You just keep one reference for whatever is
the currently valid settings.
How this pattern must be implemented in the right way (pointer ==
reference)?

Wherever 'acquire_settings' keeps a pointer to the current settings,
it also keeps a reference. Like this:
// replaces g_settings with new_settings,
// and releases previous value of g_settings
void update_settings(application_settings* new_settings)
{
new_settings->AddRef();
Lock();
if(g_settings!=NULL) g_settings->UnRef();
g_settings=new_settings;
Unlock();
}
// acquires and returns object stored in g_settings
application_settings* acquire_settings()
{
application_settings *ret=NULL;
Lock();
if(g_settings!=NULL)
{
g_settings->AddRef();
ret=g_settings();
}
Unlock();
return ret;
}
// releases settings object
void release_settings(application_settings* settings)
{
Lock();
if(g_settings!=NULL)
{
g_settings->UnRef();
g_settings=NULL;
}
Unlock();
}

Note that 'Lock' and 'Unlock' are notional. They can be a global lock
but they can also be atomic operations, a lock specific to the
implementation of g_settings, or anything else you like.

See how threads are not the only thing that can have a reference? The
global 'g_settings' pointer (and its associated lock if it has one)
can also have a reference.

DS
 
D

Dilip

A 100% accurate collector will collect a value as soon as it is unreachable.
Reference counting does not do that.

Unreachability is determined only _after_ the GC is triggered.
Exactly _when_ GC is triggered is determined by allocation patterns of
your application and the general memory pressure it causes. In other
words its non-determinsitic -- you have no idea when that object will
be collected. Reference counting on the other hand, barring cycles,
can destroy that object _immediately_ after the count reaches zero.
COM has been doing this for ages.

I am with Chris here. You are making absolutely ridiculous claims.
 
D

Dmitriy V'jukov

Note that 'Lock' and 'Unlock' are notional. They can be a global lock
but they can also be atomic operations, a lock specific to the
implementation of g_settings, or anything else you like.


Try to sketch implementation based on atomic operations to feel the
difference between basic and strong thread safety.

See how threads are not the only thing that can have a reference? The
global 'g_settings' pointer (and its associated lock if it has one)
can also have a reference.


You are saying that things are simple, just follow the rule
'pointer==reference'.
Well, if I have pointer to object then things are really simple. But
problem in the example that I don't have pointer to object. I have
pointer to pointer to object. In this situation things are not so
simple. Because I need basically 2 reference counters (or mutexes):
one for object and one for pointer to object.

This is difference between basic thread safety (when you can use
single atomic_inc() for acquire operation) and between strong thread
safety (when you can't).

Both useful in practice.

So I would not say that it's just "an awful lot of work to create a
problem".


Dmitriy V'jukov
 
D

David Schwartz

Try to sketch implementation based on atomic operations to feel the
difference between basic and strong thread safety.

Why would I want to do such a ridiculous thing? I write portable code
to international standards. Why would I want to rewrite my code just
because a new CPU came out or something like that?
You are saying that things are simple, just follow the rule
'pointer==reference'.

Yes, that's what I'm saying.
Well, if I have pointer to object then things are really simple. But
problem in the example that I don't have pointer to object. I have
pointer to pointer to object. In this situation things are not so
simple. Because I need basically 2 reference counters (or mutexes):
one for object and one for pointer to object.

If you have a pointer to a pointer to an object, that pointer has a
pointer to an object and you have a pointer to that pointer. You can
do this with two reference counts if you want. You may also be able to
do it with one. It depends on the exact circumstances.

As for having two mutexes, so what?
This is difference between basic thread safety (when you can use
single atomic_inc() for acquire operation) and between strong thread
safety (when you can't).

Both useful in practice.

So I would not say that it's just "an awful lot of work to create a
problem".

I still don't get it. Why not just always keep a reference with every
pointer? Is the only issue performance? Is this whole thing about
premature optimization?

DS
 
C

Chris Thomasson

David Schwartz said:
No, it does not have to use a lock, it can work any way it wants to.
I'm talking conceptually here.

Using locks makes it work within the scope of the POSIX Standard, which is
very good. Here is an example:

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/177609a5eff4a466

IMHO, it's nice that the PThread standard is flexible enough to support this
narrow type of usage pattern. As you very well know, locks are _very_ useful
indeed.
 
C

Chris Thomasson

[...]

My only point is that, IMVHO of course, GC can be a wonderful tool for all
sorts of lifetime management schemes. This due to the fact that it can
inform the management protocol when an object is in a quiescent state. IMO,
lifetime management "basically implies" that an object will only be
destroyed when its unreachable; if it never dies, so be it. If it can die,
well, knowing when it's quiescent is a _very_ important piece of information
indeed. You don't really want to terminate an object if it is still able to
be accessed by other entities. Therefore, I conclude that GC and/or
reference counting can be important tools for a lifetime management scheme
to use in general.

Does that make any sense?
 
K

Kai-Uwe Bux

Chris said:
[...]

My only point is that, IMVHO of course, GC can be a wonderful tool for all
sorts of lifetime management schemes. This due to the fact that it can
inform the management protocol when an object is in a quiescent state.
IMO, lifetime management "basically implies" that an object will only be
destroyed when its unreachable; if it never dies, so be it. If it can die,
well, knowing when it's quiescent is a _very_ important piece of
information indeed. You don't really want to terminate an object if it is
still able to be accessed by other entities. Therefore, I conclude that GC
and/or reference counting can be important tools for a lifetime management
scheme to use in general.

Does that make any sense?

Sure. However, often it is the other way around: the object knows when its
job is done and it has to die. Then the object must notify all interested
parties that it is about to self-detonate. In those cases, GC only serves
as a bug-hunting tool flagging a missing notification or a notified party
that did not remove its handle on the object.

The point is that it is not necessarily the objects mistake to
self-destruct. It is often more likely that it is a clients fault to expect
the object to be still alive. If GC keeps an object around that is
semantically supposed to be dead, it can hide a logic bug in the program.


Best

Kai-Uwe Bux
 
D

Dmitriy V'jukov

I still don't get it. Why not just always keep a reference with every
pointer? Is the only issue performance? Is this whole thing about
premature optimization?


If you mean difference between implementation of strong thread safety
based on additional persistent mutex and based on atomic operation -
Yes. Implementation based on atomic operations doesn't give any
additional features.

One can easily achieve strong thread safety by combining
boost::shared_ptr and mutex:
boost::shared_ptr<application_settings> g_settings;
mutex_t g_mutex;
All operations involving g_settings (update, acquire) must lock
g_mutex. All operations on acquired object (release) can not lock
g_mutex.

But with atomic operations one can go further and eliminate all atomic
operations and mutation of shared state on fast-path. This way threads
work only with cached data in read-only mode most of the time. This
makes huge performance and scalability difference provided high load.
Basically this means superlinear negative scaling versus linear
positive scaling (I emphasize: provided high load).


Dmitriy V'jukov
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,756
Messages
2,569,535
Members
45,008
Latest member
obedient dusk

Latest Threads

Top