Raw pointers not evil after all?

mike3

Hi.

I saw this:

http://stackoverflow.com/questions/...ything-and-forget-about-classic-normal-pointe

--
"You should use smart pointers careful. They are not the silver bullet
when considering memory management. Circular references are still an
issue.

When making the class design, always think who has the ownership of an
object (has the responsibility to destroy that object). Complement
that with smart pointers, if necessary, but don't forget about the
ownership."
--

And another thing that once seemed clear is now not so clear. What is
"careful" -- what is one to watch out for?

And does this mean that in many cases (it says "smart pointers, *if
necessary*", suggesting there's a good chance they aren't
"necessary"), the "owning object" should use a raw pointer? Yet I've
heard raw pointers are "evil".

So what is it?

And then I see this:
http://stackoverflow.com/questions/417481/pointers-smart-pointers-or-shared-pointers

--
"Although "never" isn't entirely accurate. If you're implementing a
smart pointer you need to use raw pointers. Outside of that special
case you're much better off using the smart versions." (a comment on
one of the answers)
--

"much better off using the smart versions"... whereas the "use them
carefully" to me implies "sparingly". Or am I wrong, and they should
be used frequently, but the care is in _how_ you use them, not _how
many_ or _how often_?

Consider this illustration snippet:

class Owner {
private:
    std::vector<Owned *> owns;
public:
    ...
    ~Owner();
    void addSomething(Owned *);
};

....

Owner::~Owner()
{
    for (std::vector<Owned *>::iterator it(owns.begin()); it != owns.end(); ++it)
        delete (*it);
}

void Owner::addSomething(Owned *smth)
{
    owns.push_back(smth);
}

void f()
{
    Owner oneOwner, twoOwners;
    Owned *smthToOwn = new Owned;

    oneOwner.addSomething(smthToOwn); // BUT WAIT! The pointer to smthToOwn is still in f()!
    twoOwners.addSomething(smthToOwn); // FATAL!
}

What I notice is this:
1. Only one Owner can own a given Owned. Yet with a raw pointer,
there's nothing to stop us from putting it into two Owners, causing a
fatal crash when both are destroyed. Now, one could say "well then
just don't do that", but shouldn't that be enforced by code and not
just the user's understanding?

2. This concern suggests to use auto_ptr or something like it. But
auto_ptr is attacked on the same website. And technically, the
auto_ptr owns the object, not the Owner. It seems the only way Owner
can directly own the object is with an _EVIL_ raw pointer.

So what to do?
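
Editorial sketch: one way the "only one Owner" rule from point 1 could be enforced by the compiler rather than by convention, assuming a C++11 (or later) compiler where std::unique_ptr takes the place of auto_ptr. Owner, Owned and addSomething come from the snippet above; the empty Owned struct, size() and handOver() are placeholders invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Owned {};  // placeholder for the real owned type

class Owner {
public:
    // Taking the unique_ptr by value forces the caller to give up ownership.
    void addSomething(std::unique_ptr<Owned> smth) {
        assert(smth && "Owner does not accept null pointers");
        owns.push_back(std::move(smth));
    }
    std::size_t size() const { return owns.size(); }
    // No destructor needed: each unique_ptr deletes its Owned exactly once.
private:
    std::vector<std::unique_ptr<Owned>> owns;
};

// Hands a freshly allocated Owned to `owner` and reports whether the
// local pointer is empty afterwards.
bool handOver(Owner& owner) {
    std::unique_ptr<Owned> smthToOwn(new Owned);
    owner.addSomething(std::move(smthToOwn));
    // owner.addSomething(smthToOwn);  // would not compile: unique_ptr cannot be copied
    return smthToOwn == nullptr;       // a moved-from unique_ptr is guaranteed to be empty
}
```

With this interface the double insertion in f() is no longer a run-time crash: the second addSomething call either fails to compile (copy) or is handed an empty pointer (second move), which the assert catches in a debug build.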
 
Rui Maciel

mike3 said:
What I notice is this:
1. Only one Owner can own a given Owned. Yet with a raw pointer,
there's nothing to stop us

That's a good thing. The programmer is free to do what he wishes to do, and
is responsible for its outcome.

from putting it into two Owners, causing a
fatal crash when both are destroyed. Now, one could say "well then
just don't do that", but shouldn't that be enforced by code and not
just the user's understanding?

Not necessarily. Some people might prefer to ride a bike with side-wheels
on, others might find them unnecessary or even a hindrance. Of course, this
might lead to a few bumps and bruises along the way, but the side-wheels
also don't avoid that.

A good thing to keep in mind is that there is a significant amount of
software written in C, and with C the programmer is required to rely on his
understanding for essentially everything. And yet, software written in C
manages to work, and even work well.

2. This concern suggests to use auto_ptr or something like it. But
auto_ptr is attacked on the same website. And technically, the
auto_ptr owns the object, not the Owner. It seems the only way Owner
can directly own the object is with an _EVIL_ raw pointer.

For each technical opinion, you will get twice as much criticising directed
at it.

But that doesn't matter at all. The only thing that matters is that someone
knows what he's doing and is able to justify the technical decisions made
along the way. While others write critical comments on a public forum,
stuff is getting done elsewhere.


Rui Maciel
 
mike3

That's a good thing. The programmer is free to do what he wishes to do, and
is responsible for its outcome.


Not necessarily. Some people might prefer to ride a bike with side-wheels
on, others might find them unnecessary or even a hindrance. Of course, this
might lead to a few bumps and bruises along the way, but the side-wheels
also don't avoid that.

A good thing to keep in mind is that there is a significant amount of
software written in C, and with C the programmer is required to rely on his
understanding for essentially everything. And yet, software written in C
manages to work, and even work well.

So then you _can_ use a raw pointer. But which one would be _better_ in this
circumstance? This is not C, after all, where you _have to_ use raw pointers
simply because you have no other choice.

For each technical opinion, you will get twice as much criticising directed
at it.

But that doesn't matter at all. The only thing that matters is that someone
knows what he's doing and is able to justify the technical decisions made
along the way. While others write critical comments on a public forum,
stuff is getting done elsewhere.

Good points.
 
Balog Pal

A good thing to keep in mind is that there is a significant amount of
software written in C, and with C the programmer is required to rely on his
understanding for essentially everything. And yet, software written in C
manages to work, and even work well.

<sigh> yes, for some very diluted meaning of "well" that allows
thousands of bug reports in the tracker and an order of magnitude more unreported...

I thought we're well beyond the point where arguments for "trust the
programmer" are accepted. And all systems with any accepted quality use
a big deal of checking, including language builtins and external tools.


My main problem with C is that it is IME impossible to make the code
correct AND readable at the same time, so those choosing the "working"
version are terrible to read.

For each technical opinion, you will get twice as much criticising directed
at it.

With too many words instead of cutting it short with a general link to
"no silver bullets".
 
Rui Maciel

Balog said:
<sigh> yes, for some very diluted meaning of "well" that allows
thousands of bug reports in the tracker and an order of magnitude more unreported...

If you remove all aspects of a programming language and all programming
techniques which might be involved in introducing bugs, you will end up with
no language at all.

I thought we're well beyond the point where arguments for "trust the
programmer" are accepted. And all systems with any accepted quality use
a big deal of checking, including language builtins and external tools.

Easy on the strawman, there. Somehow, you've become so confused that
you've mistaken accepting the idea that pointers might be used in some
circumstances for not performing any checks, even with external tools.

my main problem with C is that it is IME impossible to make the code
correct AND readable at the same time, so those choosing the "working"
version are terrible to read.

So, you have a hard time reading code. Why should that stop others from
using pointers?

With too many words instead of cutting it short with a general link to
"no silver bullets".

Not quite. The "no silver bullet" term refers to a solution which is both
straight-forward and completely effective. I was referring to solutions
which aren't guaranteed to be either straight-forward or completely
effective. While some may spend their time arguing over the degree of
academic purity of some techniques, it is always better to actually get
stuff to work in the real world.


Rui Maciel
 
Balog Pal

Easy on the strawman, there. Somehow, you've become so confused that
you've mistaken accepting the idea that pointers might be used in some
circumstances for not performing any checks, even with external tools.

I meant that as a completely general point, not tied to pointers, smart
or dumb.

So, you have a hard time reading code. Why should that stop others from
using pointers?

Sorry for the confusion. :) I didn't dive into the pointer-related
discussion, as it seemed to skip the very first step and to assume some
dichotomy. I only reflected on a part of the comment that looked general
and meant to apply in general.

Not quite. The "no silver bullet" term refers to a solution which is both
straight-forward and completely effective. I was referring to solutions
which aren't guaranteed to be either straight-forward or completely
effective. While some may spend their time arguing over the degree of
academic purity of some techniques, it is always better to actually get
stuff to work in the real world.

Yeah. And IRL, thinking in abstract good/bad terms never worked well --
instead, a good programmer thinks of alternatives and compares them against
each other, drops some, and picks something from the remaining set
(that so often looks like rock/paper/scissors). For best results you do
the selection at the latest moment, keeping all the viable candidates --
that is why "no dumb pointers" or similar generalisations hurt.

OTOH we can observe many cases of bad decisions, such as using dumb pointers
for no good reason, and hand-crafting a ton of code that is identical to
what the smart pointer would do, plus the new errors injected. Or the
other kind: picking a smart pointer from some existing source without
bothering with its requirements (like std::auto_ptr with an incomplete
type) or with its misfit for the original purpose (many uses of
shared_ptr would qualify).

The moral is, as usual, that generalized guidelines will never replace
actual thinking and engineering to solve a particular RL problem.
 
mike3

On 5/1/2013 7:36 PM, Rui Maciel wrote:

Yeah. And IRL, thinking in abstract good/bad terms never worked well --
instead, a good programmer thinks of alternatives and compares them against
each other, drops some, and picks something from the remaining set
(that so often looks like rock/paper/scissors). For best results you do
the selection at the latest moment, keeping all the viable candidates --
that is why "no dumb pointers" or similar generalisations hurt.

OTOH we can observe many cases of bad decisions, such as using dumb pointers
for no good reason, and hand-crafting a ton of code that is identical to
what the smart pointer would do, plus the new errors injected. Or the
other kind: picking a smart pointer from some existing source without
bothering with its requirements (like std::auto_ptr with an incomplete
type) or with its misfit for the original purpose (many uses of
shared_ptr would qualify).

The moral is, as usual, that generalized guidelines will never replace
actual thinking and engineering to solve a particular RL problem.

So what kind of factors should one consider when making the decision to
use whichever kind of pointer? What should one weigh?

Also, what do you think about this with smart pointers?:

// f should not own whatever smth points to
void f(T *smth) // f() takes a simple pointer
{
    ...
}

void g() // or could be a member function of some class
{
    smart_ptr<T> smartPointer(...); // owns an object of type T
                                    // could instead be a class member

    ...

    f(smartPointer.get()); // doesn't transfer ownership to f

    ...
}

Is this OK? Or does it violate the intended usage of the smart pointer?
Like if smart_ptr is an auto_ptr or C++11 "unique_ptr" or something else
that works like those. Isn't it supposed to maintain a _unique_ reference
to the object it points to? So then you make and pass a raw pointer...
now there are two pointers to that object! Yet we don't want f to own the
object.
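
Editorial sketch of the pattern asked about above, assuming C++11 and std::unique_ptr in the role of smart_ptr. What unique_ptr keeps unique is ownership (who calls delete), not the number of pointers that may refer to the object, so lending a raw pointer to a callee that neither stores nor deletes it is within its intended use. The struct T and its value member are placeholders for illustration.

```cpp
#include <cassert>
#include <memory>

struct T { int value = 0; };  // placeholder type

// f observes and modifies the object but never deletes or stores the pointer.
void f(T* smth) {
    assert(smth != nullptr);
    smth->value += 1;
}

int g() {
    std::unique_ptr<T> smartPointer(new T);  // g owns the object
    f(smartPointer.get());                   // lends the object; ownership stays in g
    f(smartPointer.get());
    return smartPointer->value;              // the object is still alive here
}   // smartPointer deletes the T when g returns
```

The pattern only becomes risky if f keeps the pointer beyond the call, because the object dies with smartPointer.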
 
Christopher Pisz

Be a good programmer and use what is appropriate for the task rather
than trying to form a general rule and expect magical results. I use raw
pointers, shared_ptrs, weak_ptrs, and auto_ptrs in my code.

My experience has been that teams that use raw pointers entirely have
memory leak bugs, and teams that use smart pointers entirely have memory
leaks due to cyclical references. Many software houses have adopted a
very ignorant philosophy of "If we make everything a shared pointer,
then nothing will leak!" and it is completely false.

Use RAII. Try your best to make sure anything that is allocated is
"owned". Something can indeed be owned AND shared. You should never find
yourself debugging code and left wondering "Who allocated this?", "Who is
releasing it?", or "When is it getting released?" It should be obvious from
the design, not something you have to spend a day reverse engineering.

None of these are "evil"; the programmer who misuses them is evil.


I won't go too in depth with examples, but I will clarify auto_ptr, in
that the situation where I use auto_ptr is when I have a method that
_must_ allocate its result to be returned to the caller. Such a
situation is where I have a method

ISomeAbstractInterface DoSomething(x,y,z);

You have no choice, AFAIK, but to new MyConcreteInterface and return it.
You don't want to return a raw pointer, because no one will own it and
it isn't obvious to the caller that you allocated something that needs
to be released. Thus you return an auto_ptr, which forces the caller to
take ownership, because the returned object is going to be released when
it goes out of scope.
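
Editorial sketch of the factory pattern described above, written with std::unique_ptr, the C++11 successor to auto_ptr (auto_ptr was deprecated in C++11 and removed in C++17). ISomeAbstractInterface, MyConcreteInterface and DoSomething are the names from the post; the Name() member, the parameterless signature and UseIt() are assumptions made only so the example is complete.

```cpp
#include <cassert>
#include <memory>
#include <string>

class ISomeAbstractInterface {
public:
    virtual ~ISomeAbstractInterface() {}   // virtual, so deleting through the base is safe
    virtual std::string Name() const = 0;  // hypothetical member for the demo
};

class MyConcreteInterface : public ISomeAbstractInterface {
public:
    std::string Name() const override { return "concrete"; }
};

// The return type itself tells the caller that it now owns the result.
std::unique_ptr<ISomeAbstractInterface> DoSomething() {
    return std::unique_ptr<ISomeAbstractInterface>(new MyConcreteInterface);
}

std::string UseIt() {
    std::unique_ptr<ISomeAbstractInterface> result = DoSomething();
    assert(result);
    return result->Name();
}   // result goes out of scope here and deletes the object
```

If the caller ignores the return value entirely, the temporary unique_ptr still deletes the object, so nothing leaks.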
 
Christopher Pisz

P.S
In your example above void f() allocated the object and therefore should
own it. Design problem. Write Owner in such a manner that it allocates
the object and owns it.

Owner::Add(params){...//allocates and stores}

const Owned & Owner::Get(params){...//returns a const reference or reference to the stored object}

Better yet, since you are just wrapping a std::vector, implement
Owner::iterator and Owner::const_iterator. Look at what the vector
itself does. Does it return a pointer? Nope. Does it own its elements?
Yep. Can users get and modify the contained object? Yep. How does it do
that? :)
 
mike3

On 5/1/2013 2:19 AM, mike3 wrote:
Be a good programmer and use what is appropriate for the task rather
than trying to form a general rule and expect magical results. I use raw
pointers, shared_ptrs, weak_ptrs, and auto_ptrs in my code.

So how should one determine which kind of pointer is appropriate for which
kind of task? For what kinds of tasks are raw pointers appropriate?
shared_ptr? weak_ptr? auto_ptr?
 
mike3

On 5/1/2013 3:29 PM, Christopher Pisz wrote:
P.S
In your example above void f() allocated the object and therefore should
own it. Design problem. Write Owner in such a manner that it allocates
the object and owns it.

So then the "owner" should be the same as the thing that allocates it?

Owner::Add(params){...//allocates and stores}

const Owned & Owner::Get(params){...//returns a const reference or
reference to the stored object}

But what should Owner's internal vector contain? Raw pointers or some
kind of smart pointer?

Better yet, since you are just wrapping a std::vector, implement
Owner::iterator and Owner::const_iterator. Look at what the vector
itself does. Does it return a pointer? Nope. Does it own its elements?
Yep. Can users get and modify the contained object? Yep. How does it do
that? :)

But with a real life object which might be more than just a "wrapped
vector", might such an interface be inappropriate? I was just using
that as an example...
 
Christopher Pisz

So then the "owner" should be the same as the thing that allocates it?


But what should Owner's internal vector contain? Raw pointers or some
kind of smart pointer?


But with a real life object which might be more than just a "wrapped
vector", might such an interface be inappropriate? I was just using
that as an example...


In my opinion, yes. The owner should be the class that allocated the
object. This isn't always possible, but more often than not, it is
possible and should be designed that way. Google "Resource Acquisition Is
Initialization" if you haven't already. What you will find is related.

In your example, you don't really describe what kinds of things users of
Owner would like to do. Assuming that they just want to look at the
object contained and read or modify its values, I would personally
implement that vector to contain raw pointers to allocated Owned
objects, which get allocated on a call to Add. Make sure that the vector
is emptied and each object is released as it is emptied in the
destructor, and make sure you handle release in a Remove method if you
implement one. You may implement a Get that returns a reference to the
allocated object by dereferencing the pointer. In this manner, you are
using raw pointers, but no one outside your class is going to use or see
a pointer at all. It is all encapsulated within that class. This is all
assuming of course that you are not considering thread safety, but let's
keep it simple. I've done exactly this in production code all the time.

That is _if_ you expect the cost of copy constructing Owned and passing
it around by value to be greater than allocating and releasing it. Owned
might be so simple that you can forget using pointers and allocation at
all, just pass it around by value. Of course performance measurement and
microoptimization is another topic altogether! :)
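
Editorial sketch of the encapsulated design described above: raw pointers inside the class, references outside. It is an illustration under stated assumptions, not the poster's production code. The int payload in Owned, the index-based Add/Get signatures and Size() are invented for the example, and copying is disabled (C++11 syntax) because a copied Owner would delete the same objects twice.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Owned {
    explicit Owned(int v) : value(v) {}
    int value;  // placeholder payload
};

class Owner {
public:
    Owner() {}
    Owner(const Owner&) = delete;             // copying would cause double deletes
    Owner& operator=(const Owner&) = delete;
    ~Owner() {
        for (std::size_t i = 0; i < owns.size(); ++i) delete owns[i];
    }
    // Allocation happens here, so no caller ever holds an owning pointer.
    std::size_t Add(int value) {
        owns.reserve(owns.size() + 1);    // may throw, but nothing is allocated yet
        owns.push_back(new Owned(value)); // capacity is reserved, so this cannot leak
        return owns.size() - 1;
    }
    // Callers only ever see references, never pointers.
    Owned& Get(std::size_t index) {
        assert(index < owns.size());
        return *owns[index];
    }
    const Owned& Get(std::size_t index) const {
        assert(index < owns.size());
        return *owns[index];
    }
    std::size_t Size() const { return owns.size(); }
private:
    std::vector<Owned*> owns;
};
```

Because the vector stores pointers, references returned by Get stay valid when later Add calls make the vector grow.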

-----
shared pointer is harder to explain and someone else might do it better,
but generally I've used it in cases where I want to make sure something
can be shared outside of my class. Shared is different than looked at. I
suppose shared might mean another class might actually use that shared
pointer as part of its state, or in other words, needs it for part or
all of its lifetime. If it just needs to get looked at, I use a
reference if possible.

I do not want to give someone a raw pointer outside my class if I can
help it. That would allow them to delete it or I might throw an
exception and the pointer is bad because I destroyed the object when I
got destroyed. In the other example we used raw pointers _inside_ the class.

When you do this though, you do indeed have to be very careful and look
at everything's lifetime, make sure you do not create cyclical
references, or if you can't help it, then use a weak_ptr where appropriate.

Perhaps a DatabaseConnectionPool hands out a connection that is to be
reused, to different classes that actually query for data for example.
They need a connection in order to perform queries perhaps for their
entire lifetime. The pool is responsible for actually allocating and
cleaning up the connections (even if that means forcing close on them at
destruction), while the other classes just use it to do their queries.

This is all my personal opinion. I don't think anyone has actually
called me out on when to use which option before.
 
Stefan Ram

Christopher Pisz said:
The owner should be the class that allocated the object.

One can have several different instances of the same class
at run-time each of which can own /different/ entities.

This shows that the entities which can own at run-time are
instances, not classes. Classes are the source-code
(blueprint) for instances.

Unless, of course, one does procedural non-OOP programming
where classes are used just with static members. But the
use of the word »object« above indicates that this was
related to OOP - although, in C++, an object actually is
just a region of memory, such as an int object, which is
not an instance of a class.

When one hears some people talk about OOP, one could be led
to believe that this was »class-oriented programming«! All
they see in the source code are classes. There are no
objects in the source code! So they start to believe that it
was all about classes. They still need to develop a correct
mental model of the run-time world of an OOP program, which
differs from the source-code world.
 
Nobody

So how should one determine which kind of pointer is appropriate for
which kind of task? For what kinds of tasks are raw pointers appropriate?
shared_ptr? weak_ptr? auto_ptr?

Raw pointers for transient references.

auto_ptr or unique_ptr where ownership is clear-cut.

shared_ptr where ownership is not clear-cut, but there can be no circular
references.

weak_ptr where you need to avoid circular references and the pointer is
not essential.

In all cases, you need to be aware of which operations can result in
destruction (blindly using shared_ptr for even the most transient
reference isn't a practical solution).

For anything which doesn't fit the above, you need garbage collection.
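
Editorial sketch of the shared_ptr / weak_ptr split in the list above, assuming C++11. A parent owns its child through a shared_ptr, while the child's back link is a weak_ptr, so the two nodes do not keep each other alive. Node, liveNodes and buildAndDrop are names invented for the illustration.

```cpp
#include <cassert>
#include <memory>

static int liveNodes = 0;  // counts live Node objects so the demo can check for leaks

struct Node {
    Node() { ++liveNodes; }
    ~Node() { --liveNodes; }
    std::shared_ptr<Node> child;   // owning link: the parent keeps the child alive
    std::weak_ptr<Node> parent;    // non-owning back link: does not keep the parent alive
};

// Builds a parent/child pair that refer to each other, then drops both handles.
// Returns the number of Node objects still alive afterwards.
int buildAndDrop() {
    {
        std::shared_ptr<Node> parent = std::make_shared<Node>();
        std::shared_ptr<Node> child = std::make_shared<Node>();
        parent->child = child;
        child->parent = parent;
        // lock() yields a temporary shared_ptr, or an empty one if the parent is gone.
        assert(child->parent.lock() == parent);
    }
    return liveNodes;
}
```

If Node::parent were a shared_ptr instead, the two reference counts would never reach zero and both nodes would outlive the block.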
 
James Kanze

On Thursday, 2 May 2013 02:03:00 UTC+1, Christopher Pisz wrote:

[...]
The owner should be the class that allocated the
object.

If the owner should be the class which allocated it, why bother
with dynamic allocation? There are some exceptions, when
a class implements a dynamically sized structure (like
`std::vector`), but most of the time, the reason you're
allocating dynamically is that the object has an arbitrary
lifetime of its own, which may exceed that of the object which
creates it.

And why this insistence on ownership? In many applications, the
most important objects aren't (or shouldn't be) owned by anyone.
The object itself manages its lifetime, in response to whatever
external events it waits on. (Who "owns" the window object in
a GUI application?) Or the objects are created by some sort of
factory (which certainly doesn't "own" them).
 
James Kanze

So how should one determine which kind of pointer is
appropriate for which kind of task? For what kinds of tasks
are raw pointers appropriate? shared_ptr? weak_ptr?
auto_ptr?

Raw pointers are the default. They probably represent 90% or
more of the pointers in a normal application. The various smart
pointers fulfil special needs: a typical example is when an
object is logically constructed in several steps: you hold it in
an auto_ptr until the construction is finished, and it is ready
to be handled however fully constructed objects are handled.
If it's an entity object responding to external events, for
example, you would hold it in an auto_ptr until it has been able
to register for those events (and if it can register for those
events in the constructor, you probably don't need any smart
pointer for it ever).
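
Editorial sketch of the "hold it until it is registered" idiom described above, using std::unique_ptr where the post uses auto_ptr (both offer release()). Entity, registry, registerForEvents, onCloseEvent and createEntity are hypothetical names; the std::set stands in for whatever event dispatcher a real application would have.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <stdexcept>

class Entity;
static std::set<Entity*> registry;  // hypothetical stand-in for an event dispatcher

class Entity {
public:
    void registerForEvents(bool fail) {
        if (fail) throw std::runtime_error("registration failed");
        registry.insert(this);
    }
    // The entity ends its own lifetime in response to an external event.
    void onCloseEvent() {
        assert(registry.count(this) == 1);
        registry.erase(this);
        delete this;
    }
};

Entity* createEntity(bool fail) {
    std::unique_ptr<Entity> guard(new Entity);  // temporary ownership
    guard->registerForEvents(fail);             // if this throws, guard deletes the entity
    return guard.release();                     // from here on the entity manages itself
}
```

The smart pointer exists only for the window in which an exception could otherwise leak the half-set-up object; after release() the raw pointer is the normal way to refer to the entity.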

I might add that if you need weak_ptr, you're using shared_ptr
in a context where it isn't appropriate.
 
Stefan Ram

James Kanze said:
And why this insistence on ownership?

I think this is about reference-counting: The object counts
its owners and disposes itself when there is no owner left.

Ownership is created by holding a reference to the object
and intending to possibly dereference this in the future.
»Reference« not in the sense of the C++ reference types,
but in the sense of a C reference.

Thus, when a reference is copied, there is one more owner,
++refcount, when an ownership is moved, the number of owners
does not change, when an owner does not want to use his
reference any more, it is decremented: --refcount, and
dispose self if refcount < 1 now.
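
Editorial sketch: the counting rules above can be observed with std::shared_ptr (C++11) through use_count(). Note one difference from the description: shared_ptr keeps the count in a separate control block rather than inside the object itself. Counts and observeCounts are names invented for the illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Owner counts observed after each step of a copy / move / reset sequence.
struct Counts { long afterCopy, afterMove, afterReset; };

Counts observeCounts() {
    Counts c;
    std::shared_ptr<int> first = std::make_shared<int>(42);  // one owner
    std::shared_ptr<int> second = first;                     // copy: ++refcount
    c.afterCopy = first.use_count();
    std::shared_ptr<int> third = std::move(second);          // move: count unchanged
    c.afterMove = first.use_count();
    third.reset();                                           // owner gives up: --refcount
    c.afterReset = first.use_count();
    assert(first && *first == 42);  // the object is still alive: one owner remains
    return c;
}   // the last owner goes away here and the int is destroyed
```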
 
James Kanze

Raw pointers for transient references.
auto_ptr or unique_ptr where ownership is clear-cut.

Most of my uses of auto_ptr are for "temporary" ownership. The
code ends with a call of release on the auto_ptr. Except, of
course, if there is an exception (which would prevent the code
from executing to its term, at which point, the object is
managed otherwise, often by itself).

Another case where auto_ptr is appropriate is when you don't
want to be able to access the object once you've passed it on.
I use them in the interface of my interthread queues: once the
object has been posted to another thread, it cannot be accessed
by the initial thread.
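
Editorial sketch of such a queue interface, assuming C++11 and std::unique_ptr in place of auto_ptr. It is not the poster's actual queue: Message, MessageQueue, post, take and senderLosesAccess are hypothetical names, and a real interthread queue would also need a condition variable so that the receiver can block while the queue is empty.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

struct Message { std::string text; };

class MessageQueue {
public:
    // Taking the pointer by value means the sender must hand over ownership.
    void post(std::unique_ptr<Message> msg) {
        assert(msg);
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(msg));
    }
    // Returns an empty pointer if nothing is queued.
    std::unique_ptr<Message> take() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty()) return nullptr;
        std::unique_ptr<Message> msg = std::move(queue_.front());
        queue_.pop_front();
        return msg;
    }
private:
    std::mutex mutex_;
    std::deque<std::unique_ptr<Message>> queue_;
};

// Posts a message and reports whether the sender's pointer was emptied.
bool senderLosesAccess(MessageQueue& q) {
    std::unique_ptr<Message> msg(new Message);
    msg->text = "hello";
    q.post(std::move(msg));
    return msg == nullptr;  // a moved-from unique_ptr is guaranteed to be empty
}
```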

shared_ptr where ownership is not clear-cut, but there can be no circular
references.
weak_ptr where you need to avoid circular references and the pointer is
not essential.

In which case, shared_ptr probably isn't the appropriate
solution.

In all cases, you need to be aware of which operations can result in
destruction (blindly using shared_ptr for even the most transient
reference isn't a practical solution).
For anything which doesn't fit the above, you need garbage collection.

Garbage collection doesn't really solve the same problems
(although some people do try to use shared_ptr as a garbage
collector). Garbage collection doesn't address object lifetime
issues, and if programmers never made mistakes, it probably
wouldn't be necessary in a C++ context (where value semantics
rule, and the only dynamically allocated objects are those with
application-determined lifetimes). Since programmers aren't
perfect, however, robust applications do need garbage
collection, in order to be able to detect use of "dead"
objects: if you delete immediately, and the memory is recycled,
any use of a remaining stray pointer to the object will result
in who knows what. Garbage collection ensures that the memory
won't be recycled as long as it is reachable, so you can set an
identifiable state, check for it, and handle the error
gracefully (which may mean an assertion failure, but cannot
result in the program doing other damage).
 
Stefan Ram

James Kanze said:
collector). Garbage collection doesn't address object lifetime
issues, and if programmers never made mistakes, it probably
wouldn't be necessary in a C++ context (where value semantics

One might implement a graph editor in C++, where there is a
person (the user of the software) who can add or remove
other persons he likes. He also might add like-relations
between two arbitrary persons (»A likes B«, »B likes A«, »B
likes C«), but when there is no »likes-path« anymore from
the user to a person, the person can be deleted from the graph.
Due to cycles, reference counting is difficult here, so one
effectively needs to write a subroutine to find whether
after a modification to the graph a person is still reachable
from the user via a path of likes, i.e., a garbage collector.
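
Editorial sketch of the reachability subroutine described above, i.e. the mark phase of a simple garbage collector. Representing persons as indices into an adjacency list is an assumption made for brevity; LikesGraph and reachableFrom are names invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Persons are identified by index; likes[a] lists everyone that person a likes.
typedef std::vector<std::vector<std::size_t> > LikesGraph;

// Returns, for each person, whether a likes-path leads to them from `user`.
std::vector<bool> reachableFrom(const LikesGraph& likes, std::size_t user) {
    assert(user < likes.size());
    std::vector<bool> marked(likes.size(), false);
    std::vector<std::size_t> pending(1, user);  // work list of persons still to expand
    marked[user] = true;
    while (!pending.empty()) {
        std::size_t current = pending.back();
        pending.pop_back();
        for (std::size_t i = 0; i < likes[current].size(); ++i) {
            std::size_t liked = likes[current][i];
            if (!marked[liked]) {   // cycles are harmless: each person is expanded once
                marked[liked] = true;
                pending.push_back(liked);
            }
        }
    }
    return marked;  // unmarked persons can be removed from the graph
}
```

Two persons who like each other but are no longer liked by anyone on a path from the user come back unmarked, which is exactly the case a plain reference count would miss.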
 
Larry Evans

One might implement a graph editor in C++, where there is a
person (the user of the software) who can add or remove
other persons he likes. He also might add like-relations
between two arbitrary persons (»A likes B«, »B likes A«, »B
likes C«), but when there is no »likes-path« anymore from
the user to a person, the person can be deleted from the graph.
Due to cycles, reference counting is difficult here, so one
effectively needs to write a subroutine to find whether
after a modification to the graph a person is still reachable
from the user via a path of likes, i.e., a garbage collector.

Also, in this use case, I don't think the combination of
shared_ptr and weak_ptr would solve the problem because
there's no easy way to detect when a cycle is created.
I guess you could, whenever a smart pointer is created,
traverse the graph to detect whether it creates a cycle,
and if it does, then create a weak_ptr, if not, create
a shared_ptr. However, that seems pretty expensive.

-regards,
Larry
 
