Repeated instantiation of a variable / performance?

  • Thread starter Robert Sturzenegger

Robert Sturzenegger

// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does the
repeated instantiation of k in sequence A carry a cost compared with the
single instantiation in sequence B?
What about if k were of a class type?
Thanks, Robert
 

John Harrison

// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does the
repeated instantiation of k in sequence A carry a cost compared with the
single instantiation in sequence B?
What about if k were of a class type?
Thanks, Robert

Who can say? One compiler could be different from the next. If you are
really interested then write the program and time it (or look at the
generated machine code). Personally I would be surprised to see any
difference, but then I've never really looked into it.

As for the class version then it depends upon the class (and on the
compiler). You are comparing the cost of assignment with the cost of copy
construction. Which is more efficient depends entirely on how the class is
written. There is no a priori reason to expect either to be more efficient.

john
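One way to follow the "write the program and time it" suggestion is sketched below. It is a rough harness, not a rigorous benchmark. The something() here is a made-up stand-in that returns a 64-character std::string, and sequence_a, sequence_b, time_it, compare_timings and sink are hypothetical names that do not come from the thread. Any numbers it produces only say something about one compiler, one set of flags and one machine.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the something() in the question.
std::string something(int i) {
    return std::string(64, static_cast<char>('a' + i % 26));
}

// Written to after each run so the loops are less likely to be optimized away.
volatile std::size_t sink = 0;

// Sequence A: k is constructed and destroyed on every iteration.
std::size_t sequence_a(int iterations) {
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        std::string k = something(i);
        total += k.size();  // stands in for "some more code which uses k"
    }
    return total;
}

// Sequence B: k is constructed once and assigned on every iteration.
std::size_t sequence_b(int iterations) {
    std::size_t total = 0;
    std::string k;
    for (int i = 0; i < iterations; ++i) {
        k = something(i);
        total += k.size();
    }
    return total;
}

// Runs body once and returns the elapsed time on a monotonic clock.
template <typename F>
std::chrono::nanoseconds time_it(F body) {
    const auto start = std::chrono::steady_clock::now();
    body();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
}

struct Timings {
    std::chrono::nanoseconds a;
    std::chrono::nanoseconds b;
};

Timings compare_timings(int iterations) {
    // Both loops must do the same work, otherwise the comparison is meaningless.
    assert(sequence_a(iterations) == sequence_b(iterations));
    return Timings{
        time_it([iterations] { sink = sequence_a(iterations); }),
        time_it([iterations] { sink = sequence_b(iterations); })};
}
```

Calling compare_timings(1000000) several times and comparing the a and b fields gives a first impression. Looking at the generated machine code, as suggested above, is the more reliable check.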
 

Niels Dybdahl

// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does the
repeated instantiation of k in sequence A carry a cost compared with the
single instantiation in sequence B?

The compilers I have used (Borland, Microsoft) on Windows allocate the
space for k when the function starts. There is actually an x86 assembly
instruction for that, so you can assume that all well-written compilers do
the same. So for simple types there will be no difference.

What about if k were of a class type?

As John mentions the first is a construction and the second an assignment.
Some classes will do the same in those cases. Others will do some more
initialization in the construction case. A few might do more work in the
assignment case.
I would prefer the construction case because of clarity caused by the
limited scope, unless I know for sure that the construction of the object is
much more expensive than an assignment.

Niels Dybdahl
 

John Harrison

As John mentions the first is a construction and the second an assignment.
Some classes will do the same in those cases. Others will do some more
initialization in the construction case. A few might do more work in the
assignment case.

Yes, I should have said the OP is comparing the cost of repeated copy
construction and destruction, with the cost of repeated assignment. It's
unlikely that assignment will be less efficient. But does return value
optimization play a role here? It seems to me that the compiler might be
able to do away with a temporary in the copy construction case. If so that
would swing things back in favour of copy construction.

I would prefer the construction case because of clarity caused by the
limited scope, unless I know for sure that the construction of the object is
much more expensive than an assignment.

Absolutely, in general clarity of code is the most important efficiency
saving of all.

john
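The question about return value optimization can be checked with a small counting sketch. Probe, its copies counter and copies_in_sequence_a are hypothetical names, and something() is again a stand-in. Since C++17, initializing k directly from the returned prvalue must not call the copy constructor at all. Before C++17 compilers were allowed, but not obliged, to elide the copy, so the count on an older compiler or with elision disabled may differ.

```cpp
#include <cassert>

// Hypothetical class that counts how often its copy constructor runs.
struct Probe {
    static int copies;
    int value;

    explicit Probe(int v) : value(v) {}
    Probe(const Probe& other) : value(other.value) { ++copies; }
};

int Probe::copies = 0;

// Stand-in for something(): returns a temporary (prvalue) Probe.
Probe something(int i) {
    return Probe(i);
}

// Runs sequence A and reports how many copy constructions it needed.
int copies_in_sequence_a() {
    Probe::copies = 0;
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        Probe k = something(i);  // candidate for copy elision
        sum += k.value;
    }
    assert(sum == 45);  // 0 + 1 + ... + 9, confirms the loop really ran
    return Probe::copies;
}
```

If copies_in_sequence_a() reports zero, the temporary has been done away with, which is the effect speculated about above.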
 

JKop

Robert Sturzenegger posted:
// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does
the repeated instantiation of k in sequence A carry a cost compared
with the single instantiation in sequence B?
What about if k were of a class type?
Thanks, Robert

I can guarantee that if one of them was faster, it would be B.

-JKop
 

Michiel Salters

Robert Sturzenegger said:
// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does the
repeated instantiation of k in sequence A carry a cost compared with the
single instantiation in sequence B?

With current compilers, exactly the same. In both cases compilers simply
reserve sizeof(int) bytes on the stack. This happens at compile time.

What about if k were of a class type?

In this case, it's unpredictable. The first case has 10 ctor calls and
10 dtor calls. The second case has one ctor call, one dtor call, and 10
assignments. The naive assumption would be that the first is more
expensive, but exception-safe assignments are usually implemented as
{ create temporary, swap contents, destroy temporary }, which means
each assignment includes both a ctor and a dtor call. In that case the
second is more expensive.

Other cases may be even more complex. E.g. if k has type std::string,
the relative performance depends critically on the length of the
10 strings returned by something().

Regards,
Michiel Salters
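The { create temporary, swap contents, destroy temporary } pattern can be made visible with a counting sketch. Widget, Counts, g_counts, run_sequence_a and run_sequence_b are hypothetical names, and something() is a stand-in. The assignment operator takes its argument by value, which is one common way to write copy-and-swap. The tallies include the object returned by something(), and the exact numbers depend on how much copy elision the compiler performs.

```cpp
#include <cassert>
#include <utility>

// Tally of special member function calls (hypothetical helper).
struct Counts {
    int ctor = 0;    // ordinary constructions
    int copy = 0;    // copy constructions
    int dtor = 0;    // destructions
    int assign = 0;  // assignments
};

Counts g_counts;

class Widget {
public:
    explicit Widget(int v = 0) : value_(v) { ++g_counts.ctor; }
    Widget(const Widget& other) : value_(other.value_) { ++g_counts.copy; }
    ~Widget() { ++g_counts.dtor; }

    // Copy-and-swap: the by-value parameter is the temporary, which is
    // swapped into *this and destroyed when the operator returns.
    Widget& operator=(Widget other) {
        ++g_counts.assign;
        std::swap(value_, other.value_);
        return *this;
    }

    int value() const { return value_; }

private:
    int value_;
};

// Stand-in for something().
Widget something(int i) { return Widget(i); }

// Sequence A: k is constructed inside the loop.
Counts run_sequence_a() {
    g_counts = Counts{};
    int sum = 0;
    for (int i = 0; i < 10; ++i) {
        Widget k = something(i);
        sum += k.value();
    }
    assert(sum == 45);
    return g_counts;
}

// Sequence B: k is constructed once and assigned inside the loop.
Counts run_sequence_b() {
    g_counts = Counts{};
    int sum = 0;
    {
        Widget k;
        for (int i = 0; i < 10; ++i) {
            k = something(i);
            sum += k.value();
        }
    }  // k is destroyed here, so its destructor is included in the tally
    assert(sum == 45);
    return g_counts;
}
```

With this particular assignment operator, sequence B performs ten assignments on top of at least as many constructions and destructions as sequence A. A class whose assignment reuses existing storage could tip the balance the other way, which is the point about std::string above.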
 

Jerry Coffin

[ ... ]
I can guarantee that if one of them was faster, it would be B.

It's sad that somebody like you seems to feel obliged to spend
inordinate amounts of time dreaming up wrong answers to give to even
ridiculously simple questions.

Fortunately, there is one good point: if you weren't dreaming up wrong
answers to give here, you'd probably be using one of your stolen
compilers to write code. Based on what you post here, anything that
prevents you from writing code HAS to be a good thing, even if it
means that beginners have to start out with a trial by fire (so to
speak) and quickly learn whose answers to ignore at nearly any cost.
 

Mark A. Gibbs

Michiel said:
With current compilers, exactly the same. In both cases compilers simply
reserve sizeof(int) bytes on the stack. This happens at compile time.

that makes no sense. stack allocations cannot happen at compile time.
think about it.

conceptually, sequence A would allocate stack space on each iteration
and clean it up at the end of the loop, where B would allocate the space
before the loop. i doubt that's what actually happens though, most
likely the compiler would adjust the stack before the loop. of course,
that's implementation dependent (hell, the compiler might even optimize
k away into a register in both sequences), but if it is the case, then
there should be no difference between the two.

mark
 

JKop

Jerry Coffin posted:
[ ... ]
I can guarantee that if one of them was faster, it would be B.

It's sad that somebody like you seems to feel obliged to spend
inordinate amounts of time dreaming up wrong answers to give to even
ridiculously simple questions.

Then take some anti-depressants and get over it.

Fortunately, there is one good point: if you weren't dreaming up wrong
answers to give here, you'd probably be using one of your stolen
compilers to write code.

I'll be doing that anyway.

Based on what you post here, anything that prevents you from writing
code HAS to be a good thing, even if it means that beginners have to
start out with a trial by fire (so to speak) and quickly learn whose
answers to ignore at nearly any cost.

It's sad that somebody like you seems to feel obliged to spend inordinate
amounts of time responding to people you don't like.


-JKop
 

Howard

JKop said:
Robert Sturzenegger posted:


I can guarantee that if one of them was faster, it would be B.

How can you guarantee that? (And do I get my money back if you're wrong?
:))

I'll make a guess that maybe you're talking about the integer case only. In
which case, there's *probably* no difference in any modern compiler.

If you're including the object case as well, I'd say that guarantee ain't
worth the paper it ain't written on. (So to speak. :)) As mentioned
elsewhere in this conversation, in the object case, B *might* actually be
*slower*, due to temporaries being constructed/destructed in addition to the
assignment.

In any case, it's not something that the standard defines, but rather is
implementation dependent.

-Howard
 

JKop

Howard posted:
If you're including the object case as well, I'd say that guarantee
ain't worth the paper it ain't written on. (So to speak. :)) As
mentioned elsewhere in this conversation, in the object case, B *might*
actually be *slower*, due to temporaries being constructed/destructed
in addition to the assignment.


Now I see what you're getting at.


A will be calling the copy constructor.

B will be calling the assignment operator.


With A, there will be an object copy-constructed upon each iteration of the
loop, which may be copied from a temporary.

With B, there will be one initial object constructed. Then, with each
iteration of the loop, the assignment operator will be called, which may be
called with a temporary.


Given that the copy constructor and the assignment operator are near
enough identical in speed, the only extra cost in A is allocating and
deallocating memory on each iteration of the loop.


From that, I would guarantee that if either were faster, it would be
B... except in the circumstance that the defined assignment operator is a
lot slower than the defined copy constructor.


Any thoughts?


-JKop
 

JKop

There isn't enough code supplied to determine if binding a reference to a
temporary would be preferable at all.

-JKop
 

Howard

JKop said:
Howard posted:



Now I see what you're getting at.


A will be calling the copy constructor.

B will be calling the assignment operator.


With A, there will be an object copy-constructed upon each iteration of the
loop, which may be copied from a temporary.

With B, there will be one initial object constructed. Then, with each
iteration of the loop, the assignment operator will be called, which may be
called with a temporary.


Given that the copy constructor and the assignment operator are near
enough identical in speed, the only extra cost in A is allocating and
deallocating memory on each iteration of the loop.


From that, I would guarantee that if either were faster, it would be
B... except in the circumstance that the defined assignment operator is a
lot slower than the defined copy constructor.


Any thoughts?

I actually wasn't referring to the speed of the assignment versus the copy
constructor, but rather to the fact that in case B, the compiler may choose
to construct a temporary variable, then do the assignment operator, and
then destruct the temporary variable. Whereas with A, it might
copy-construct and later destruct the variable. If that's how the compiler
implemented the two, you can see the extra work being done in case B now,
right? But as I said, we can't guarantee how any implementation writer
might *choose* to implement that (since the standard doesn't specify), so we
don't know the answer except by testing on specific compilers/platforms.

-Howard
 

JKop

Howard posted:

I actually wasn't referring to the speed of the assignment versus the
copy constructor, but rather to the fact that in case B, the compiler
may choose to construct a temporary variable, then do the assignment
operator, and then destruct the temporary variable. Whereas with A, it
might copy-construct and later destruct the variable. If that's how
the compiler implemented the two, you can see the extra work being done
in case B now, right? But as I said, we can't guarantee how any
implementation writer might *choose* to implement that (since the
standard doesn't specify), so we don't know the answer except by
testing on specific compilers/platforms.

Why don't you think that with A, it might copy-construct from a temporary,
just as with B, it might do an assignment from a temporary?

Anyway, there are more considerations. For instance, if we make k const,
then we can turn k into a reference and bind it to the temporary returned
from something(). But then again maybe the programmer wants to modify k;
we don't have the complete code, so we don't know.

Anyway, I'm off to play with getting G++ to spit out some assembly for me!


-JKop
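The idea of binding a const reference to the temporary can be sketched as follows. The something() here is a stand-in returning a std::string and total_length is a hypothetical name. Binding the returned temporary to a const reference extends its lifetime to the end of the reference's scope, so k stays valid for the rest of the loop body. Whether this saves anything over sequence A depends on the compiler, since the copy in sequence A is usually elided anyway.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-in for something(): returns a fresh string on every call.
std::string something(int i) {
    return "item " + std::to_string(i);
}

// Variant of sequence A in which k is a const reference to the temporary.
std::size_t total_length() {
    std::size_t total = 0;
    for (int i = 0; i < 10; ++i) {
        // The temporary lives until the end of this iteration's scope.
        const std::string& k = something(i);
        assert(!k.empty());  // k is still usable here
        total += k.size();   // read-only use; k cannot be modified
    }
    return total;
}
```

This only works while the rest of the loop body treats k as read-only, which is the caveat raised above.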
 

Idriz Smaili

Robert said:
// Code sequence A
for (int i = 0; i < 10; ++i) {
int k = something();
// some more code which uses k
}

// Code sequence B
int k;
for (int i = 0; i < 10; ++i) {
k = something();
// some more code which uses k
}

Are the two sequences exactly the same in terms of performance, or does the
repeated instantiation of k in sequence A carry a cost compared with the
single instantiation in sequence B?
What about if k were of a class type?
Thanks, Robert

Suppose that k is not a primitive C data type, as it is here, but a
user-defined type, i.e. a class, and that something() returns an object
of the same type as k. In that case you will have:
SCENARIO I (k declared before the loop, as in sequence B):
1. one cstor and one dstor for k;
2. for each iteration, one cstor and one dstor for the temporary object
generated by something(), plus one call to the assignment operator
(which should always be implemented) to assign that temporary to k
before it is destroyed.

SCENARIO II (k declared inside the loop, as in sequence A):
In the second scenario you will have the following calls for each
iteration:
1. one cstor + dstor for the object k;
2. same as SCENARIO I, point 2.

As shown above, for non-primitive data types the second scenario
involves many more invocations of the cstor and dstor for the object k.
However, this should of course not be the deciding factor when choosing
between the two scenarios. It depends on the goal of the "algorithm"
in which the code is used.

Idriz Smaili
 

Richard Herring

Mark said:
that makes no sense. stack allocations cannot happen at compile time.

Why not? The compiler knows all about the auto variables in each scope -
hence, it can calculate where on the stack each one will be, relative to
the base of the stack frame. It doesn't necessarily have to use 'push'
and 'pop' instructions to put the data there. Some hardware calling
conventions even provide a separate register for the frame base.
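A rough way to observe the fixed frame offset described above is to record the address of k on each iteration of sequence A. This is a sketch only: distinct_addresses_of_k is a hypothetical name, something() is a stand-in, and nothing in the language guarantees that the same address is reused, so the result is an observation about one implementation rather than a rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Stand-in for something().
int something(int i) { return i * i; }

// Counts how many different addresses k occupies across the ten iterations.
std::size_t distinct_addresses_of_k() {
    std::set<std::uintptr_t> seen;
    for (int i = 0; i < 10; ++i) {
        int k = something(i);
        // Taking the address forces k to live in memory rather than a register.
        seen.insert(reinterpret_cast<std::uintptr_t>(&k));
        assert(k == i * i);
    }
    return seen.size();
}
```

If the compiler has reserved a single slot for k in the stack frame, the function reports one address for all ten iterations, and there is no per-iteration allocation to pay for.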
 
