Questionable advice


woodbrian77

Sorry, ignore those results. I measured the wrong thing.
I'm retesting and plan to post again in 12 or so hours.


unique_ptr results:

user .259s
sys 2.688s

user .315s
sys 2.688s

user .266s
sys 2.728s

Adding user + sys for each run and dividing by 3 yields 2.98s.

raw pointer:

user .259s
sys 2.713s

user .303s
sys 2.689s

user .292s
sys 2.722s

Adding and dividing by 3 yields 2.99s.

2.99/2.98 is 1.0033. So the raw pointer approach
is 1.0033 times slower than the unique_ptr approach.
This is again on a machine that is over 98% idle when
the tests aren't being run.
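The test code itself isn't posted here; purely as an illustrative
sketch (the Request type and loop count are hypothetical), an
in-process timing of raw new/delete against unique_ptr might look
like this:

#include <chrono>
#include <cstdio>
#include <memory>

struct Request { char payload[256]; };

int main() {
    const int N = 1000000;
    typedef std::chrono::steady_clock Clock;
    typedef std::chrono::milliseconds Ms;

    Clock::time_point t0 = Clock::now();
    for (int i = 0; i < N; ++i) {
        Request* r = new Request;               // raw-pointer ownership
        r->payload[0] = char(i);                // touch it so the loop isn't elided
        delete r;
    }
    Clock::time_point t1 = Clock::now();
    for (int i = 0; i < N; ++i) {
        std::unique_ptr<Request> r(new Request);   // unique_ptr ownership
        r->payload[0] = char(i);
    }
    Clock::time_point t2 = Clock::now();

    std::printf("raw: %lld ms   unique_ptr: %lld ms\n",
                (long long)std::chrono::duration_cast<Ms>(t1 - t0).count(),
                (long long)std::chrono::duration_cast<Ms>(t2 - t1).count());
    return 0;
}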
 

Öö Tiib

So the raw pointer approach
is 1.0033 times slower than the unique_ptr approach.

That is 0.3%. Even 5% is irrelevant since none of your
clients care if it is 1 minute or 57 seconds.
 

woodbrian77

That is 0.3%. Even 5% is irrelevant since none of your
clients care if it is 1 minute or 57 seconds.

I have this now:

auto request=pendingTransactions.front();

...

::delete request;
pendingTransactions.pop_front();

Some years ago I remember there was discussion here about
undefined behavior related to something like that.
Am wondering how this compares:

auto request=pendingTransactions.front();
...

pendingTransactions.pop_front();
::delete request;
 

Ian Collins

I have this now:

auto request=pendingTransactions.front();

...

::delete request;
pendingTransactions.pop_front();

Some years ago I remember there was discussion here about
undefined behavior related to something like that.
Am wondering how this compares:

auto request=pendingTransactions.front();
...

pendingTransactions.pop_front();
::delete request;

If you use a managed pointer, you don't have to worry. You also avoid
the questionable "::delete".
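Purely as an illustrative sketch (the Request type and processFront
function are hypothetical; only pendingTransactions comes from the
snippet above), the managed-pointer version of that pattern might look
like this, and the pop/delete ordering question disappears:

#include <deque>
#include <memory>
#include <utility>

struct Request { /* ... */ };

std::deque<std::unique_ptr<Request>> pendingTransactions;

void processFront() {                       // assumes the deque is not empty
    std::unique_ptr<Request> request = std::move(pendingTransactions.front());
    pendingTransactions.pop_front();
    // ... use *request ...
}   // the Request is deleted here, after it has already left the container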
 

Öö Tiib

"Never use raw pointers for ownership" as a hard rule would actually
create a vicious cycle, if interpreted literally, because the smart
pointer *is* using a raw pointer for ownership (which in itself would
have to use another smart pointer, which would have to use another
smart pointer, ad infinitum.)

The point probably was that the current set of smart pointers is good
enough to represent all ownership needs, so mixing raw and smart
is like mixing 'char*' and 'std::string'.

There can be other sets of smart pointers with a slightly different
ideology, but then mixing them in is like using 'std::wstring'
together with 'QString', 'CString', 'wxString', and whatever else
there is.
 

woodbrian77

If you are using 'new', then I don't think you are going to be
instantiating the class millions of times per second, so zeroing
a buffer isn't going to add much to the already-existing overhead.
(Well, unless you expect the buffer to be several megabytes in size.)


The instances of the classes that do the new's stay around
for a long time, but the data in them gets flushed (erased
when I was using a vector) and then the buffer is used again
with more resizes. At first I tried sticking with vector and
using reserve rather than resize, but I couldn't get away
with it. If you use reserve, copy some data into the buffer,
and then later use insert to add new data "at the end", the
vector overwrites the first data you put into it, because
reserve doesn't change the size, so end() still points at the
start of the buffer.
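A minimal sketch of that reserve-vs-resize pitfall (hypothetical
buffer, not the actual SendBuffer code):

#include <cassert>
#include <cstring>
#include <vector>

int main() {
    std::vector<char> buf;
    buf.reserve(64);                       // capacity grows, but size() stays 0

    std::memcpy(buf.data(), "first", 5);   // bytes land in the buffer, but the vector
                                           // doesn't know about them (this is the
                                           // dubious step that causes the trouble)

    const char more[] = "second";
    buf.insert(buf.end(), more, more + 6); // end() == begin() because size() == 0,
                                           // so this lands on top of "first"
    assert(buf.size() == 6);               // only "second" is in the vector

    // buf.resize(5), or push_back/insert for the first chunk, would have
    // preserved the earlier data.
    return 0;
}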

Others have made some similar decisions:
https://groups.google.com/forum/#!topic/comp.lang.c++.moderated/eOudo37vc98
 

red floyd

The point probably was that the current set of smart pointers is good
enough to represent all ownership needs, so mixing raw and smart
is like mixing 'char*' and 'std::string'.

Do current (C++11) smart pointers handle inheritance hierarchies
properly?

i.e.:

class Base { public: virtual ~Base() {} };
class Derived : public Base { };

shared_ptr<Base> bp;
shared_ptr<Derived> dp;

void f(shared_ptr<Base> p);
....

f(bp);
f(dp); // is this OK?
 

Ike Naar

I watched some of the videos from the C++ conference at
Microsoft. Some of the advice given was to not use raw
pointers for ownership. I think that's going too far.

http://webEbenezer.net/misc/SendBuffer.hh
http://webEbenezer.net/misc/SendBuffer.cc

Someone may say that that doesn't scale if we need to
add another pointer to the class. I disagree. Classes
can have at least one pointer that isn't wrapped with
a smart pointer.

http://webEbenezer.net/misc/SendBufferCompressed.hh
http://webEbenezer.net/misc/SendBufferCompressed.cc

Finally, at the conference STL pointed out that by using a
delegating constructor you can safely have multiple raw
pointers in a class:

http://webEbenezer.net/misc/Compressed.hh
http://webEbenezer.net/misc/Compressed.cc
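As an illustration of that point (hypothetical Holder class, not the
Compressed code behind the links above): once the delegated-to
constructor completes, the object counts as fully constructed, so the
destructor runs and releases the first pointer even if a later
allocation in the delegating constructor's body throws.

class Holder {
    int* first_;
    int* second_;

    Holder(int) : first_(nullptr), second_(nullptr) {}  // delegated-to constructor:
                                                        // establishes the invariants
public:
    Holder() : Holder(0) {           // delegate first, then acquire
        first_  = new int(1);
        second_ = new int(2);        // if this throws, ~Holder() still runs
                                     // and frees first_
    }
    ~Holder() { delete second_; delete first_; }

    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
};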

Previously I was using unique_ptr more than I do now, but I
have found in each of these cases that the resulting code is
a little smaller without it. I've seen a lot of overuse of
shared_ptr over the years and now think it's also possible to
overuse unique_ptr.

The simplified advice came from Bjarne. I think for
student programs it doesn't make much difference, but
he may be focussing too much on beginners these days.
[...]
Ebenezer Enterprises - In G-d we trust.

Don't trust us.
Ask G minus d.
 

woodbrian77

Who cares about the size of the executable? What matters is, firstly,
its safety, and secondly, its efficiency. File size is mostly
inconsequential.

Jim Radigan and I do. :)
http://channel9.msdn.com/Events/GoingNative/2013/Compilerpp-

He talks about code size, safety, stack packing and says it's
all about cache lines.

I've said in the past that I don't have as strong a
datacenter as a lot of companies do. My datacenter is
orders of magnitude less expensive than a lot of other
datacenters. Experimentation is probably more important
to me than it is to other companies.

(Also, are you sure you are compiling with full optimizations and
stripping debug info? Because if you are compiling in debug mode
then comparing file sizes is absolutely moot.)

I have -O3 in the CXXFLAGS for all of the builds and use -s
also in the makefile to strip executables. I agree that
comparing file sizes would be pointless if that weren't
the case.
 

woodbrian77

Don't trust us.
Ask G minus d.

I want to "chew the meat and spit the bones."

BEN ZOMA SAYS: Who is wise? – He who learns from every man.

Maybe I'm a little like the mathematician Mobius. I think
he's been described as a strong integrator -- pulling
together ideas from different people to do something
interesting.

"However Möbius did not receive quick promotion to full
professor. It would appear that he was not a particularly
good lecturer and this made his life difficult since he did
not attract fee paying students to his lectures. He was
forced to advertise his lecture courses as being free of
charge before students thought his courses worth taking."

That sounds familiar too.
 

Tobias Müller

I have -O3 in the CXXFLAGS for all of the builds and use -s
also in the makefile to strip executables. I agree that
comparing file sizes would be pointless if that weren't
the case.

That doesn't make sense. If you care so much about file size you should not
use O3, since O3 explicitly trades file size for speed.

From the GCC homepage
(http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-O2: [...] GCC performs nearly all supported optimizations that do not
involve a space-speed tradeoff [...]

-O3: Optimize yet more.

-Os: [...] enables all -O2 optimizations that do not typically increase
code size. It also performs further optimizations designed to reduce code
size.

So, if you care mostly about file size, you should use -Os. If you care
about speed _and_ file size, you should probably choose -O2.

Tobi
 

Jorgen Grahn

I have -O3 in the CXXFLAGS for all of the builds and use -s
also in the makefile to strip executables. I agree that
comparing file sizes would be pointless if that weren't
the case.

That doesn't make sense. If you care so much about file size you should not
use O3, since O3 explicitly trades file size for speed.

From the GCC homepage
(http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-O2: [...] GCC performs nearly all supported optimizations that do not
involve a space-speed tradeoff [...]

-O3: Optimize yet more.

-Os: [...] enables all -O2 optimizations that do not typically increase
code size. It also performs further optimizations designed to reduce code
size.

So, if you care mostly about file size, you should use -Os. If you care
about speed _and_ file size, you should probably choose -O2.

I think the best rule of thumb is what I've read here in the past:

If you're not willing to spend time comparing and measuring, just use
-O2. Otherwise, try -O2, -O3 and -Os and decide which result you like
best.

I've almost never seen spectacular speed differences.

/Jorgen
 

Victor Bazarov

A problem is that while users are pretty good at picking out problems
or potential improvements in their current workflow, which will supply
you with many possible incremental improvements, they're pretty
bad at coming up with major new features.

OK. And how is this relevant?

Implementation details (which usually come down to trade-offs between
performance, executable size, etc.) are hardly ever _features_. It's a topic for
comp.software-eng, of course, but the two extremes in approaches to
software improvements are (a) to add more [shiny] features at cost to
overall usability and performance of the system and (b) make changes to
program performance, size, etc., without any change to [already
accepted] functionality; and both are common mistakes, IMNSHO. Balance
between the two is not easy to maintain, but those who can, gain the
market share (provided there is competition in the field of application).

Also, let me repeat, the slippery slope is in judging (or pretending to)
what it is that customers "need" and making it stand above what users
actually want (which is often expressed). Ignoring the customer
requests for the sake of providing them with what they "need" leads to
isolation and loss of market share, and there are plenty of examples.

Talk to customers and offer them the choice between adding more features
[to existing workflows], or changing the paradigm, or squeezing another
5% out of the text segment or 0.5% of execution time.

"True wisdom is less presuming than folly. The wise man doubts often,
and changes his mind; the fool is obstinate, and doubts not; he knows
all things but his own ignorance." ;-)

V
 

woodbrian77

That doesn't make sense. If you care so much about file size you should not
use O3, since O3 explicitly trades file size for speed.

From the GCC homepage

(http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-O2: [...] GCC performs nearly all supported optimizations that do not
involve a space-speed tradeoff [...]

-O3: Optimize yet more.

-Os: [...] enables all -O2 optimizations that do not typically increase
code size. It also performs further optimizations designed to reduce code
size.

So, if you care mostly about file size, you should use -Os. If you care
about speed _and_ file size, you should probably choose -O2.

I did 5 tests comparing -O2 and -O3 on gcc. In 4 of them the
executable built with -O2 beat the one built with -O3. In
the other test they tied.

  -O2          -O3
 2.963s  <   2.98s
 2.99s   <   3.009s
 2.99s   <   3.022s
 3.02s   <   3.029s
 2.969s  ==  2.969s

These seem like unimpressive figures for O3 especially
given that the machine is over 98% idle when the tests
aren't running.

I'm going to switch back to O2.
 

Ian Collins

That doesn't make sense. If you care so much about file size you should not
use O3, since O3 explicitly trades file size for speed.

From the GCC homepage

(http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html):

-O2: [...] GCC performs nearly all supported optimizations that do not
involve a space-speed tradeoff [...]

-O3: Optimize yet more.

-Os: [...] enables all -O2 optimizations that do not typically increase
code size. It also performs further optimizations designed to reduce code
size.

So, if you care mostly about file size, you should use -Os. If you care
about speed _and_ file size, you should probably choose -O2.

I did 5 tests comparing -O2 and -O3 on gcc. In 4 of them the
executable built with -O2 beat the one built with -O3. In
the other test they tied.

  -O2          -O3
 2.963s  <   2.98s
 2.99s   <   3.009s
 2.99s   <   3.022s
 3.02s   <   3.029s
 2.969s  ==  2.969s

These seem like unimpressive figures for O3 especially
given that the machine is over 98% idle when the tests
aren't running.

The load on the system is largely irrelevant. The law of diminishing
returns comes into play with higher level optimisations. Some
algorithms will benefit from extra inlining and unrolling, others will
suffer. That's why profile feedback is a valuable tool when evaluating
optimisations.
 

woodbrian77

Also, let me repeat, the slippery slope is in judging (or pretending to)
what it is that customers "need" and making it stand above what users
actually want (which is often expressed). Ignoring the customer
requests for the sake of providing them with what they "need" leads to
isolation and loss of market share, and there are plenty of examples.

OK, but on the other hand I don't think there are online
alternatives to what I'm doing, and if there were, they
would be hard pressed to make them available for free.

Talk to customers and offer them the choice between adding more features
[to existing workflows], or changing the paradigm, or squeezing another
5% out of the text segment or 0.5% of execution time.

My effort for 15 years assumes a paradigm change. I can't
change that now. Making it available for free is the best
I can do.
 

Öö Tiib

Do current (C++11) smart pointers handle inheritance hierarchies
properly?

i.e.:

class Base { public: virtual ~Base() {} };
class Derived : public Base { };

shared_ptr<Base> bp;
shared_ptr<Derived> dp;

void f(shared_ptr<Base> p);
...

f(bp);
f(dp); // is this OK?

Yes, that conversion is implicit AFAIK. Additionally there are (in
the non-member interface of 'shared_ptr') 'static_pointer_cast',
'dynamic_pointer_cast' and 'const_pointer_cast' for the cases where
the underlying 'sp.get()' pointer types can be cast.
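A small sketch of both the implicit upcast and the cast helpers,
based on the example above:

#include <memory>

class Base { public: virtual ~Base() {} };
class Derived : public Base { };

void f(std::shared_ptr<Base> p) { (void)p; }

int main() {
    std::shared_ptr<Derived> dp = std::make_shared<Derived>();
    f(dp);                                  // OK: implicit Derived -> Base conversion

    std::shared_ptr<Base> bp = dp;          // upcast, shares ownership with dp
    std::shared_ptr<Derived> back =
        std::dynamic_pointer_cast<Derived>(bp);  // downcast; empty if it fails
    std::shared_ptr<Derived> same =
        std::static_pointer_cast<Derived>(bp);   // when the static type is known
    (void)back; (void)same;
    return 0;
}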
 

Öö Tiib

The problem with C++'s generic smart pointers is that they are either
limited or very inefficient.

std::unique_ptr is very efficient, but limited in usage (because it
can't be used to share, or even deep-copy, the object among several
owners.)

Yes, you have to write the deep copy yourself, but the same applies to
a raw pointer. Also, a generic solution is possible as a lightweight
adapter around unique_ptr. For a shared non-owning pointer, 'weak_ptr'
is meant, but I have to admit that it takes care (again adapters,
'auto' and 'typedef') to use 'weak_ptr' comfortably.
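One possible shape of such an adapter (a hypothetical clone_ptr, not
a standard or Boost class):

#include <memory>

template <class T>
class clone_ptr {
    std::unique_ptr<T> p_;
public:
    clone_ptr() {}
    explicit clone_ptr(T* p) : p_(p) {}

    clone_ptr(const clone_ptr& other)        // deep copy on copy construction
        : p_(other.p_ ? new T(*other.p_) : nullptr) {}
    clone_ptr& operator=(const clone_ptr& other) {
        p_.reset(other.p_ ? new T(*other.p_) : nullptr);  // the new T is made before
        return *this;                                     // reset, so self-assignment
    }                                                     // is safe
    clone_ptr(clone_ptr&&) = default;
    clone_ptr& operator=(clone_ptr&&) = default;

    T* get() const { return p_.get(); }
    T& operator*() const { return *p_; }
    T* operator->() const { return p_.get(); }
};
// Note: this copies the static type T; a class hierarchy would need a virtual clone().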

std::shared_ptr can be used to share the object, but is very inefficient.
(In cases where you just need a few of them that doesn't matter. It starts
to matter if you need millions of them, or you need to create and destroy
millions of them in short timespans.)

When it takes millions of them to affect performance, then it is not
exactly "very inefficient". There is usually only one type out of many
that we have in the millions.

In cases with millions of elements I may use 'vector', 'deque' or
'array' instead of allocating each element separately, and/or have
intrusive reference counts, and/or use 'boost::intrusive' containers,
and/or use database engines. Raw pointers are not the only solution.

Also, there currently exists no ready-made solution for copy-on-write,
which is sometimes the most efficient solution to some problems.

Yes. Whoever wants lockless copy-on-write has to write it. A pointer
to an atomic refcount feels like the simplest candidate. If one were
proposed to Boost, it would probably eventually land in C++.
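A single-thread-only sketch of copy-on-write on top of shared_ptr (a
lockless multi-threaded version would need the atomic refcount
mentioned above):

#include <memory>
#include <string>
#include <utility>

template <class T>
class cow {
    std::shared_ptr<T> p_;
public:
    explicit cow(T value) : p_(std::make_shared<T>(std::move(value))) {}

    const T& read() const { return *p_; }     // cheap shared read

    T& write() {                              // detach before mutating
        if (p_.use_count() > 1)
            p_ = std::make_shared<T>(*p_);
        return *p_;
    }
};

int main() {
    cow<std::string> a("hello");
    cow<std::string> b = a;                   // b shares a's string
    b.write() += " world";                    // the copy happens only here
    // a.read() is still "hello"; b.read() is "hello world"
    return 0;
}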

What I personally like is a modification of smart pointers that can
always be dereferenced. When the pointer is 'nullptr' it dereferences
to a fixed static object of a "missing" type that is logically
immutable. Adapters around existing smart pointers are fine enough.
The efficiency difference is minimal and the usage code is somewhat
nicer (checks like 'if(p!=nullptr)' or 'if(!p)' are rarely needed). It
takes a bit of designing of the pointed-at type to have values that
can be "missing", but it seems to be worth it.
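A hypothetical sketch of such an always-dereferenceable adapter over
shared_ptr, with a static, logically immutable "missing" object as the
fallback (the Lock/MissingLock types are made up for the example):

#include <memory>
#include <utility>

struct Lock {                                  // ordinary lock
    virtual bool engage() { return true; }
    virtual ~Lock() {}
};

struct MissingLock : Lock {                    // logically immutable "missing" lock:
    bool engage() { return false; }            // mutators simply report failure
};

template <class T, class Missing>
class maybe_ptr {                              // adapter over an existing smart pointer
    std::shared_ptr<T> p_;
    static T& fallback() { static Missing m; return m; }
public:
    maybe_ptr() {}
    explicit maybe_ptr(std::shared_ptr<T> p) : p_(std::move(p)) {}

    T& operator*() const  { return p_ ? *p_ : fallback(); }
    T* operator->() const { return p_ ? p_.get() : &fallback(); }
};

int main() {
    maybe_ptr<Lock, MissingLock> lock;         // no Lock attached
    bool locked = lock->engage();              // safe: no null check, reports failure
    (void)locked;
    return 0;
}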
 

Öö Tiib

In my experience this practice tends to hide logic errors - a wrong (dummy)
object gets used incorrectly, but this might not be readily apparent.

There can be logic errors regardless of whether something that can be
missing is missing or not. When it is not normal for something to be
missing, then using such a type with a missing value is itself a logic
error.

Instead, my smartpointers throw hard exceptions in case of "NULL
dereference".

That is a valid strategy when the lack of an object is extremely rare
but possible. That is not so in the general case. For example, we may
have a chest that can have a lock or not, and both kinds are common.

Whatever key ring we try to use for unlocking a chest without a lock
results in "ok, but it is already unlocked". Whatever key ring we try
to use for locking that chest results in "locking failed, lock
missing". Whether that is signaled by an exception or a success value
depends on the interface contract of such a lockable object.

Static objects have a lot of nasty problems anyway, plus most of my
smartpointers and objects are single-thread-only and so would require
setting up thread-specific static dummy objects. Too much hassle for
questionable gain.

Static immutable objects? On the contrary: totally thread-safe and
totally robust. I said "logically immutable" in the sense that the
mutators in the interface of the "missing" object do not cause its
state (which may be nonexistent) to change. It behaves as the lack of
an object would behave. Everybody may own that "nothing" and no
conflicts occur. ;)
 

Öö Tiib

I was not comparing C++'s smart pointers to raw pointers. I was comparing
them to a custom class that manages some resource.

I was only agreeing that neither does a deep copy, so whichever we use
(raw or unique), we have to write the deep copy ourselves. How does it
matter whether that is in the context of a custom class or some custom
helper generic?

My intention was not to say that it takes millions for the difference
to show. My intention was to say that there's a big difference, and
it shows up especially when you need to have lots of them. If you have
literally millions of them, we can probably talk about an order of
magnitude of time difference (compared to a more efficient solution.)
This doesn't mean there's no difference with smaller amounts as well.

Are you sure that it is possible to get orders of magnitude slower
performance of the program? I have observed over 2 times slower
performance in a cruelly constructed case where 'std::make_shared<T>()'
was not used. Usually we do use it. When it is used, it is rather hard
to get major differences. Might it be that platforms and
implementations differ so heavily?
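For reference, the difference being discussed is between the two
allocation patterns below; make_shared puts the object and its control
block in one allocation, the other form needs two:

#include <memory>

struct Node { int value; };

int main() {
    std::shared_ptr<Node> a(new Node());                 // two allocations
    std::shared_ptr<Node> b = std::make_shared<Node>();  // one allocation
    (void)a; (void)b;
    return 0;
}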

It's not like any of the standard library containers are very
thread-safe either...

The standard containers can be used as owned by the work and
passed/moved between threads. IOW, there are usages without any need
for synchronization. With copy-on-write it is also possible to avoid
spreading the (not yet made) copies between threads, but that is
logically harder to maintain, so it is likely better to build the
synchronization in.
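A small sketch of the "owned by the work" idea: the container is moved
into the worker thread, so only one thread ever touches it and no
locking is needed (consume and the data are made up for the example):

#include <numeric>
#include <thread>
#include <utility>
#include <vector>

void consume(std::vector<int> d) {             // the worker owns the vector by value
    long sum = std::accumulate(d.begin(), d.end(), 0L);
    (void)sum;
}

int main() {
    std::vector<int> data(1000, 1);
    std::thread worker(consume, std::move(data));  // moved, not shared
    worker.join();                             // 'data' is left moved-from here
    return 0;
}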
 
