To use or not to use smart pointers?

Daniel T.

Martijn van Buul said:
* Dennis Jones:

I work in computer vision. Not too long ago, I was rewriting some of our
existing codebase. The algorithms themselves were OK, but the implementation was
"C with classes" and no longer up to par.

[Anecdote snipped]

If we are going down that road: I worked with someone who insisted on
writing his own double-linked list code because "there is no way generic
code could be faster". It turned out that std::list was a full 5% faster
than anything he could come up with with full optimizations on.
(However, it was much slower without optimization.) And the program was
much easier to write and understand.
 
Michael DOUBEZ

Daniel T. a écrit :
Martijn van Buul said:
* Dennis Jones:
I work in computer vision. Not too long ago, I was rewriting some of our
existing codebase. The algorithms themselves were OK, but the implementation was
"C with classes" and no longer up to par.

[Anecdote snipped]

If we are going down that road: I worked with someone who insisted on
writing his own double-linked list code because "there is no way generic
code could be faster". It turned out that std::list was a full 5% faster
than anything he could come up with with full optimizations on.
(However, it was much slower without optimization.) And the program was
much easier to write and understand.

In the specific case of Martijn van Buul, I expect a pool allocator
would have been indicated; that depends on the rate at which elements
were inserted/removed from the list. If it is intensive, calling a heap
allocator each time is a killer.

Michael
 
Martijn van Buul

* Michael DOUBEZ:
In the specific case of Martijn van Buul, I expect a pool allocator
would have been indicated; that depends on the rate at which elements
were inserted/removed from the list. If it is intensive, calling a heap
allocator each time is a killer.

I used a pool allocator, and that helped *some*. Just not enough. Sometimes,
you cannot dismiss the overhead of an extra call, or an extra indirection, as
being irrelevant.
 
Michael DOUBEZ

Martijn van Buul a écrit :
* Michael DOUBEZ:

I used a pool allocator, and that helped *some*. Just not enough. Sometimes,
you cannot dismiss the overhead of an extra call, or an extra indirection, as
being irrelevant.

I don't see where you would have an extra indirection and the extra call
depends on the compiler (and somewhat the STL implementation used).
If your benchmark showed you should not use the STL on your platform, then
you were right not to, but that doesn't mean it is a general flaw.

Michael
 
Yannick Tremblay

* Dennis Jones:

One stood out in particular: A simple data structure used to identify objects
in a bitmap image, consisting of a doubly linked list of begin- and end points
for each row. This particular algorithm identified all connected objects in a
structure like this, and the existing implementation used a pool of these
critters, and was reasonably fast. ~ 200 microseconds per image, on my
computer, with my test set. (Using the same CPU and same compiler (gcc 4.2.0)
for all cases)

I first rewrote it to a std::list<CSegment>, with CSegment being something
like

struct CSegment
{
    int mStart;
    int mEnd;
    int mRow;
};

Performance was abysmal; execution time went up to a staggering 50 ms. That's
more than the total budget I have for the *entire* evaluation process per
image, so it's not even close to acceptable.

I am assuming that you have used an optimized STL. Speed differences
between a non-optimised STL and an optimised one can be huge.

The other point I would have looked at is whether std::list was the
correct container for your application. Although it might be tempting
to replace a linked list with a std::list, quite often a std::vector
is more appropriate. The speed characteristics of a list vs vector
are different, and unless you need fast splicing and/or
insertion/deletion in the middle, a vector might be better, or maybe a
deque.

See for example this article:
http://www.ddj.com/dept/cpp/184401838

Yan
 
Tim H

multi-core CPU environment the lock or interlocked increment
operation requires the processor cache to be flushed so that the

Not strictly true - a cache flush is a heavy-handed way to handle
atomic ops. Most platforms have better solutions, such as robust
cache-coherency protocols. It still has a (large) cost, but not as
much as a full cache flush, and usually not even as much as a full
line flush.
change is visible to other cores. This performance hit can be
difficult to detect because no profiler outside of a hardware
logic analyzer can really detect it. You'll need to compare
performance with raw pointer code for the same tasks.

valgrind !!
Despite this peanut butter spread of a performance penalty,
you'd be insane to use raw pointers. If you habitually use raw
pointers you might as well be programming in C. Used
appropriately, auto_ptr<>, shared_ptr<>, weak_ptr<>, and
brethren will eliminate all resource leaks and their attendant
problems such as memory corruption through premature frees. Raw
unwrapped pointers provide no protection. Naked pointers are
obscene.

I can't agree with these superlatives - there is no silver bullet.
Nothing solves every problem. Smart pointers can help solve a lot of
problems, but be realistic, please :)

Tim
 
James Kanze

I had a 3-hour meeting today with some fellow programmers
who are partly not convinced about using smart pointers in
C++.

Does that mean that they are convinced that one should never use
them, or that they are convinced that they aren't a silver
bullet, and should only be used where appropriate (which isn't
really all that often)?
Their main concern is a possible performance impact.

Obviously, there is some effect, but you have to compare it to
the alternatives, in your application.

The few actual benchmarks I know show the Boehm collector to be
significantly faster than boost::shared_ptr. But as much as I
also favor using the Boehm collector, I do recognize that they
were written by people who strongly favor garbage collection,
and that they intentionally chose scenarios where the difference
would be significant. I suspect that in most applications,
garbage collection will be slightly faster, but the difference
won't be significant.
I've been explaining the advantages of smart pointers
endlessly (which are currently used in all our C++ software;
we use the Boost smart pointers) as I'm seriously concerned
that there is a shift to raw pointers.

That's interesting. I've always found that trying to use them
systematically (rather than just in specific cases) caused a lot
of extra effort for no gain.
We are not developing system software but rather normal
Windows programs (with exceptions turned on). I wouldn't want
to write a C++ program without smart pointers any more but
after that endless discussion I wonder if I'm too strict. Any
serious arguments not to use smart pointers?

Not to use them in the occasional cases where they are
appropriate, none. But that's true for a lot of things.
 
James Kanze

There _is_ no performance impact.

I'd be interested in seeing your implementation, then. Hans
Boehm ran a set of benchmarks comparing boost::shared_ptr with
raw pointers and the Boehm collector, and the shared_ptr didn't
come out very well. In general, boost::shared_ptr suffers at
three points:

-- There's an extra allocation when you create the first
pointer, and a free when you free the last.

-- The pointer is twice the size of a raw pointer. Most
significantly, most compilers will return a raw pointer in a
register, but will use an intermediate, temporary variable
when returning a boost::shared_ptr.

-- Copying the pointer requires incrementing a counter. Never
zero overhead, and possibly significant in a multithreaded
environment.
Whoever claims there is, should prove it to you instead of
just saying that.

All they have to do is cite existing literature.
A smart pointer is just a way for a programmer to relax and
not have to remember when to delete the object when the
pointer goes out of scope.

Except that the simplest way to ensure that an object is
correctly destructed and the memory freed when it goes out of
scope is just to declare it as a local variable.
 
James Kanze

[...]
(and also should carry the burden of proving the correctness
for the code base resulting from a design decision in favor of
raw pointers).

That is, of course, the key part. Systematic use of shared_ptr,
where not appropriate, makes it very difficult to prove that the
code is correct.
 
James Kanze

Victor Bazarov said:
Boris said:
I had a 3-hour meeting today with some fellow programmers who are
partly not convinced about using smart pointers in C++. Their main
concern is a possible performance impact
[...]
There _is_ no performance impact. Whoever claims there is, should
prove it to you instead of just saying that. A smart pointer is
just a way for a programmer to relax and not have to remember when
to delete the object when the pointer goes out of scope. All access
to the actual object is resolved at compile time - no overhead.
And by the way, I verified this experimentally last week while
profiling some code. Using GCC 4.0.4 and Boost 1.33.1, it
looks like there's a very small amount of overhead (like 5%)
in executables built in "debug" mode, e.g. no optimization, no
inlining, etc. This overhead *completely* disappears when the
optimization levels are turned up.

You really should publish your benchmarks, then, because this is
in contradiction with all published benchmarks. (It's also in
contradiction with common sense, of course, but I'm very
suspicious of common sense when it comes to performance.)

In practice, I doubt that there will be a significant difference
in very many applications, although it should be possible to
create artificial cases that strongly favor one or the other.
The real difference is program correctness---boost::shared_ptr
makes it very difficult to reason about program correctness
except in specific cases.

[...]
C) This is easy to prove, so you don't have to take our word for it;
go do the experiment on your machine, with your compiler.

And code typical for your application.
And by the way, we use smart pointers *everywhere*, and it's probably
saved us *years* of programmer time.

Compared to what? It sounds like a lot of extra work for
nothing. (Actually, it sounds like the claim from people using
Java, that memory leaks are impossible with garbage collection.
Both shared_ptr and garbage collection are very powerful tools
for specific cases---garbage collection is more generally
applicable than shared_ptr---but neither is a silver bullet.)
 
James Kanze

Oh my gosh, are you serious? No way. You are absolutely correct. The
benefits of smart pointers FAR outweigh any possible arguments against them.

Except that used without consideration, they make the code less
reliable.
There is no performance impact (that I know of),

Depends on the application. For most applications, the
difference should be acceptable. For complicated graphs,
running in a multithreaded environment, they can double the
runtime, or worse.
and the advantages
(automatic and correct object/resource lifetime management,

You wouldn't be interested in buying this bridge I have to sell,
would you? If you believe that, God help your users.
avoidance of
memory leaks in the presence of exceptions,

In special cases (although I usually use the Boehm collector,
which takes care of the memory at far less run-time cost and
programmer effort).
to name only two) are too
compelling to ignore. Another (strange, but typical) argument made against
smart pointers is their sometimes odd usage syntax (extra typing, ugliness?
I don't know), but in my view, that is a very small price to pay for the
peace of mind and safety afforded by their use.

The major argument against them is that they reduce the safety.
If your colleagues convince you otherwise, you should be
working in another field.

If you believe that systematically using boost::shared_ptr, or
whatever, will eliminate all lifetime of object issues, you
should be working in another field.
 
James Kanze

Okay, how about when used correctly?

Which means sparingly, then fine. I use my own reference
counted pointer in certain cases (a lot less, of course, since
I discovered the Boehm collector). But those cases are rather
the exceptions. In my experience, most (not all, but most) C++
objects fall into two large categories: values and entity
objects. The first should almost never be allocated
dynamically, so the issue doesn't come up, and the second have
very arbitrary lifetimes depending on external events, so
typical smart pointers don't apply. The major use of smart
pointers I've had in the past was for "agents": small
polymorphic objects which, from a logical point of view, should
probably be copied, but because they are polymorphic, can't be.
Since such agents never own any real resources, however (except
the memory they reside in), the use of the Boehm collector has
made the use of smart pointers for them pretty much superfluous.
 
James Kanze

If they find an *actual* performance impact, then they have a leg to
stand on, until then their appeal to fear is fallacious.

There are already a significant number of benchmarks which show
a real impact.
Why on earth would you want to write the same pointer management code
over, and over again?

What is there to manage about a pointer? A pointer is just a
means of navigating between objects. (And since navigational
possibilities almost always includes cycles, using smart
pointers, instead of raw pointers, requires additional analysis,
and introduces significant additional possibilities for errors.)

If you're talking about lifetime management, of course, the
reason why you rewrite it each time is because the only
objects you allocate dynamically in C++ are those which require
explicit, application specific lifetime management, which is
different each time.
That's not reuse... Do these fellows also write a
loop instead of using strcpy? (I've known programmers who do that.)
"The boss will yell at you and make you remove them."

You want your program to work correctly, and not leak memory.
 
James Kanze

[...]
Despite this peanut butter spread of a performance penalty,
you'd be insane to use raw pointers.

So you consider smart pointers a silver bullet. Despite the
significant extra effort it takes to use them, and the fact that
they don't solve anything most of the time, but rather introduce
new problems (like managing cycles) which weren't there
beforehand.
If you habitually use raw pointers you might as well be
programming in C.

In other words, you don't have any real arguments, so you resort
to name calling.
Used
appropriately, auto_ptr<>, shared_ptr<>, weak_ptr<>, and
brethren will eliminate all resource leaks and their attendant
problems such as memory corruption through premature frees.

So once again, you're claiming a silver bullet. Oh how I wish
it were true. I've worked a lot on applications which run 24
hours a day, 7 days a week, where we have contractual penalties
for down time. You can't depend on smart pointers there.
Raw unwrapped pointers provide no protection.

Nor do they provide any added problems when all you want to do
is navigate (and not manage lifetime), or when the lifetime
depends on the application logic.
Naked pointers are obscene.

More name-calling, in the absence of any real arguments.
Rather than habitually using shared_ptr<>, consider using the
boost smart containers. Use auto_ptr<> as much as possible to
avoid sharing.

auto_ptr<> plays a very important role in thread safety. You
use auto_ptr<> in your message passing interfaces, and you're
guaranteed that the sending thread will not try to access the object
when it is in possession of the receiving thread. The
advantages are enough to outweigh the disadvantage of its
preempting lifetime management (and "messages" typically don't
have to have an explicit lifetime anyway).
 
Kai-Uwe Bux

James said:
[...]
(and also should carry the burden of proving the correctness
for the code base resulting from a design decision in favor of
raw pointers).

That is, of course, the key part. Systematic use of shared_ptr,
where not appropriate, makes it very difficult to prove that the
code is correct.

Pointers in C++ are heavily overloaded. They offer two main
features at once, namely (a) decoupling life-time from scope
and (b) support for polymorphism. If you want the latter, you
are inviting upon you the perils of the former.

That pointer and pointee have different life-times introduces
a global state into the program and arguing correctness becomes
much more of a hassle: every new() has to match with one and
only one delete() along each path of execution. Exceptions can
divert the execution path at any moment to an unknown location,
and client supplied types in templates make it unpredictable
what will throw and what will be thrown. That makes life very
hard.

A smart pointer like shared_ptr<> also requires global
knowledge in at least two ways: (a) we need to ensure that
no cycles occur and (b) when one relies on side-effects of
a destructor, sometimes one needs to argue that the last
pointer owning a given pointee is going out of scope.

For me, the lesson is that one needs a set of idioms and tools
that allow us to reduce code complexity and restore the ability
to argue correctness of code locally (by looking at, say, 20
_consecutive_ lines of code at a time). Thus, I tend to wrap
pointers within classes. Smart pointers are just readily packaged
idioms that come up often in this context. For these reasons, smart
pointers form an intermediate layer in my library. They are used
to build other components but rarely used directly. Let me
illustrate this with a few examples.


Example 1 [stateful function objects]
=========

The standard algorithms take function objects by value
and (with the notable exception of for_each) are allowed
to copy those objects around. Thus, I found something like
this useful at times:

template < typename Func >
class unary_f_ref
    : public std::unary_function< typename Func::argument_type,
                                  typename Func::result_type >
{
    std::tr1::shared_ptr<Func> the_func;

public:

    unary_f_ref ( Func const & f )
        : the_func ( new Func ( f ) )
    {}

    typename Func::result_type
    operator() ( typename Func::argument_type arg ) {
        return ( (*the_func)( arg ) );
    }

    Func & get ( void ) {
        return ( *the_func );
    }

    Func const & get ( void ) const {
        return ( *the_func );
    }

};

Similar trickery can be applied to reduce the cost of
copying for large objects, in which case reference counting
becomes an invisible performance hack.

Note that shared_ptr<> is actually overkill in this context.
Its support for polymorphism, incomplete types, and a custom
deleter is not needed. A simple intrusive reference counting
smart pointer (not working with incomplete types) fits this
bill neatly.



Example 2 [deep copy pointers]
=========

C++ has value semantics. However, variables are not polymorphic.
You have to use a pointer to support polymorphism. Deep copy
pointers come as close to polymorphic variables as you can
get in C++ (if only we could overload the dot-operator).

A particular example is tr1::function<>, which under the hood
is a deep copy pointer that forwards the function interface to
the pointee.



Example 3 [ownership models]
=========

Smart pointers do not manage resources magically for you, but
sometimes their semantics makes enforcing an ownership model
easier. For instance, a shared resource could be realized like
this:


struct A {};

class SharedResource {

    // could be polymorphic
    struct Descriptor {

        Descriptor ( A a )
        {}

        void close ( void ) {}

        // could be virtual
        int some_method ( int ) {
            return ( 1 );
        }

    };

    std::tr1::shared_ptr< Descriptor > the_ptr;

public:

    ~SharedResource ( void ) {
        assert( ! is_open() );
    }

    enum mode { ok, busy, error_1, error_2 };

    // you can only open a resource if you don't have one
    mode open ( A a ) {
        if ( ! is_open() ) {
            try {
                Descriptor * ptr = new Descriptor ( a );
                the_ptr.reset( ptr ); // hopefully strongly safe
            }
            catch( ... ) {
                return ( error_1 ); // or whatever is needed
            }
            return ( ok );
        }
        return ( busy );
    }

    // only the last owner can close the resource
    bool close ( void ) {
        if ( the_ptr.unique() ) {
            the_ptr->close(); // may throw
            the_ptr.reset();
            return ( true );
        }
        return ( false );
    }

    // the last owner cannot disown:
    bool disown ( void ) {
        if ( is_shared() ) {
            the_ptr.reset();
            return ( true );
        }
        return ( false );
    }

    bool is_open ( void ) const {
        return ( the_ptr.get() != 0 );
    }

    bool is_shared ( void ) const {
        return ( is_open() && ( ! the_ptr.unique() ) );
    }


    // forward interface
    // =================

    int some_method ( int arg ) {
        assert( is_open() );
        return ( the_ptr->some_method( arg ) );
    }

};

Note that the shared_ptr does not show in the interface.




On the other hand, there are sometimes reasons to use a
smart pointer directly, although I always feel a little
uneasy. Example:

#include <memory>

class UserCommand {
public:

    // ...

    virtual
    ~UserCommand ( void ) {}

};

// magic source of commands.
// in real life, this is polymorphic.
UserCommand * getUserInput ( void ) {
    return new UserCommand;
}


class MessageHandler {
public:

    // ...

    virtual
    void operator() ( UserCommand const & cmd,
                      bool & do_terminate );
    // UserCommand is polymorphic

    virtual
    ~MessageHandler ( void ) {}

};


struct MyException {};

void event_loop ( MessageHandler & the_handler ) {
    // MessageHandler is polymorphic
    try {
        bool last_run = false;
        do {
            // need for a pointer because of polymorphism:
            UserCommand * the_command = getUserInput();
            the_handler( *the_command, last_run );
        } while ( ! last_run );
    }
    catch ( MyException const & e ) {
        if ( false ) {
            // placeholder for things that can be
            // handled here
        } else {
            throw( e );
        }
    }
}

The event_loop as written is buggy. If the line

the_handler( *the_command, last_run );

throws, we leak memory. One can use std::auto_ptr<> to
rectify things:

void event_loop ( MessageHandler & the_handler ) {
    // MessageHandler is polymorphic
    try {
        bool last_run = false;
        do {
            // need for a pointer because of polymorphism:
            std::auto_ptr< UserCommand > the_command ( getUserInput() );
            the_handler ( *the_command, last_run );
        } while ( ! last_run );
    }
    catch ( MyException const & e ) {
        if ( false ) {
            // placeholder for things that can be
            // handled here
        } else {
            throw( e );
        }
    }
}



Best

Kai-Uwe Bux
 
Roland Pibinger

Pointers in C++ are heavily overloaded. They offer two main
features at once, namely (a) decoupling life-time from scope
and (b) support for polymorphism. If you want the latter, you
are inviting upon you the perils of the former.

Pointers in C++ (and C) have only 2 features: referencing and
de-referencing. Pointers per se have nothing to do with "life-time"
management. OTOH, 'smart pointers' intermingle ultra-lightweight
pointer functionality with heavy, unrelated tasks (like allocating
reference counters).
 
Alp Mestan

Roland said:
Pointers in C++ (and C) have only 2 features: referencing and
de-referencing. Pointers per se have nothing to do with "life-time"
management. OTOH, 'smart pointers' intermingle ultra-lightweight
pointer functionality with heavy, unrelated tasks (like allocating
reference counters).

So what about RAII? It's really an important and good C++ technique.
And smart pointers allow you not to take care of memory management,
adding only a few additional allocations. IMHO, smart pointers are worth
the price, except _maybe_ for real-time programs.
 
Kai-Uwe Bux

Roland said:
Pointers in C++ (and C) have only 2 features: referencing and
de-referencing. Pointers per se have nothing to do with "life-time"
management.

That depends on what you mean by "per se". What I meant is that pointers are
the core language feature that _creates the possibility_ of explicit
life-time management. By writing

T * p = new T ( some args );

I create a T-object whose life-time is independent of the scope in which the
creating statement occurs and whose life-time will rather be ended by the
corresponding delete statement. Which other mechanism for doing explicit
life-time management (not involving pointers) is there in the core
language?


Best

Kai-Uwe Bux
 
Boris

Does that mean that they are convinced that one should never use
them, or that they are convinced that they aren't a silver
bullet, and should only be used where appropriate (which isn't
really all that often)?

They don't want to use them at all. Their argument is that performance is
priority #1, ignoring other goals like stability and maintainability or
resource constraints like project time and budget.
[...]
I've been explaining the advantages of smart pointers
endlessly (which are currently used in all our C++ software;
we use the Boost smart pointers) as I'm seriously concerned
that there is a shift to raw pointers.

That's interesting. I've always found that trying to use them
systematically (rather than just in specific cases) caused a lot
of extra effort for no gain.

I admit that I've been using them rather systematically. The less time I
have to spend looking around for memory leaks, the more time I can
concentrate on other things or the earlier I can finish a project. Given
that the projects I'm involved in typically suffer under heavy resource
constraints I find it a luxury to evaluate the pointer type every time I
need to use a pointer. As I'm also not always happy with the code quality
of the projects I'm responsible for I assume that making everyone use
smart pointers contributes more to the code quality than giving them a
choice (if you don't trust someone to make a correct choice you don't
trust him either to be able to write correct code with raw pointers). Thus
I wonder if the problems smart pointers might cause (like a possible
performance impact) are not actually negligible compared to the problems
they solve as developers have more time to concentrate on other things
(the ones who write the code and the others who have to test and maintain
the code later).
Not to use them in the occasional cases where they are
appropriate, none. But that's true for a lot of things.

Absolutely. But I wonder how this should work in practice as in the
projects I've been involved in there are typically more problems to solve
and details to take care of than resources available. And I don't think my
situation is exceptional. Your situation might be different (looking at
your sig I wonder if you are a consultant :). That said my experience so
far is that in practice smart pointers are a tool you automatically use
all the time if you aim for high code quality under heavy resource
constraints. As I might have been misled I asked for opinions in this
newsgroup for a reality check (thanks for all your opinions so far!).

Boris
 
Roland Pibinger

So what about RAII ? It's really an important and good C++ technique.

Yes, but RAII doesn't include the creation of pointer-like classes by
overloading operator* and operator->. RAII means that "allocation and
deallocation disappear from the surface level of your code"
(http://www.artima.com/intv/modern3.html).
And smart pointers allow you not to take care of memory management,
adding only a few additional allocations.

The only potential use case for 'smart pointers' is when a pointer to
a dynamically allocated object shall be returned from a function. If
you avoid returning dynamically allocated objects (for design
considerations) the desire for 'smart pointers' vanishes.
 
