unexpected abstraction penalty in C++

alex goldman · May 29, 2005

class c {
    int x;
public:
    inline c() : x(0) {}
    inline c(int i) { x = i; }
    inline operator const int& () const { return x; }
    inline operator int& () { return x; }
};

If I use objects of type `c' as if they were int's, I see about 1.5x
slow-down compared to using regular int's (timed using GCC-3.3.4 -O3). Is
it reasonable to expect modern compilers to optimize this, so that there is
no abstraction penalty?

Alf P. Steinbach · May 29, 2005

* alex goldman:

class c {
    int x;
public:
    inline c() : x(0) {}
    inline c(int i) { x = i; }
    inline operator const int& () const { return x; }
    inline operator int& () { return x; }
};

'inline' is superfluous here, because the functions are defined in-class,
which makes them automatically 'inline'.

Second, 'inline' is at best only a hint.

To direct your compiler to optimize for speed, use your compiler's options
and/or non-standard language extensions.

If I use objects of type `c' as if they were int's, I see about 1.5x
slow-down compared to using regular int's (timed using GCC-3.3.4 -O3). Is
it reasonable to expect modern compilers to optimize this, so that there is
no abstraction penalty?

Don't know.

With the following main program:

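#include <cstdlib>      // std::atoi
#include <iostream>

int main( int nArgs, char* arg[] )
{
    // Value taken from the command line so the compiler cannot precompute it.
    int forceCode = std::atoi( arg[1] );

    int x = 0x1234 + forceCode;
    x += 0x1111;
    std::cout << x << std::endl;

    c y = 0x1234 + forceCode;
    y += 0x1111;
    std::cout << y << std::endl;
}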

Visual C++ 7.1, optimize for speed or full optimization, emits a
single instruction for the initialization of and addition to 'x',

; int x = 0x1234 + forceCode;
; x += 0x1111;
lea edx,[esi+2345h]

in contrast to two instructions for the code using your class, 'y',

; c y = 0x1234 + forceCode;
lea eax,[esi+1234h]
; y += 0x1111;
add eax,1111h

I don't understand why the compiler can see the optimization for 'x'
but not for 'y', having gone so far as to represent them identically.

Chris Theis · May 29, 2005

Alf said:
I don't understand why the compiler can see the optimization for 'x'
but not for 'y', having gone so far as to represent them identically.

IMHO it's related to the point in time at which the compiler's constant
folding kicks in. I'd expect it to operate primarily on PODs, leaving out
objects (even though they might act the way the OP implements them). I
suppose extending it to objects could impose overhead in the dataflow
analysis which probably won't pay off in most cases.

Cheers
Chris

Ron Natalie · May 29, 2005

alex said:
class c {
    int x;
public:
    inline c() : x(0) {}
    inline c(int i) { x = i; }
    inline operator const int& () const { return x; }
    inline operator int& () { return x; }
};

If I use objects of type `c' as if they were int's, I see about 1.5x
slow-down compared to using regular int's (timed using GCC-3.3.4 -O3). Is
it reasonable to expect modern compilers to optimize this, so that there is
no abstraction penalty?

Well, you don't show us your code that actually uses the class, but there
are a few differences between
int a[100];
and
c b[100];

The first is that with an array of ints (or any POD) the default
initialization is either static or omitted. In your case,
a dynamic initialization always occurs.

Donovan Rebbechi · May 29, 2005

class c {
    int x;
public:
    inline c() : x(0) {}
    inline c(int i) { x = i; }
    inline operator const int& () const { return x; }
    inline operator int& () { return x; }
};

If I use objects of type `c' as if they were int's, I see about 1.5x
slow-down compared to using regular int's (timed using GCC-3.3.4 -O3). Is
it reasonable to expect modern compilers to optimize this, so that there is
no abstraction penalty?

Does it make any difference if you change that to operator int() ?

Cheers,

alex goldman · May 29, 2005

Ron said:
In your case, a dynamic initialization always occurs.

That's a very good point. I removed the initialization just to be sure, but
it didn't make a difference for speed. It's still 2:3 for int vs the class.

Here's one of the benchmarks I ran:

#include <iostream>

class c {
    int x;
public:
    c() {}
    c(int i) { x = i; }
    operator const int& () const { return x; }
    operator int& () { return x; }
};

#define LOOP1(i, n) for((i) = -(n); (i) <= (n); ++(i))
#define LOOP(i, j, k, n, a) \
    a = 0; \
    LOOP1(i, n) LOOP1(j, n) LOOP1(k, n) \
        a += k + j; return a

int f_i(int n) {
    int i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

int f_c(c n) {
    c i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

int main() {
    // std::cout << f_i(1000) << '\n';
    std::cout << f_c(1000) << '\n';
}

alex goldman · May 29, 2005

Donovan said:
Does it make any difference if you change that to operator int() ?

I can't change the non-const operator to int (), but changing the const one
makes no difference.

Ron Natalie · May 30, 2005

alex said:
That's a very good point. I removed the initialization just to be sure, but
it didn't make a difference for speed. It's still 2:3 for int vs the class.

You might try adding operator+= and operator++ to see what difference that
makes, or alternatively an operator=(int).

alex goldman · May 31, 2005

I wrote a more comprehensive test of various abstraction penalties in C++.
Here's what I get on P4 with GCC-3.3.4 -O3 (aggressive optimization &
inlining):

$ time ./a.out

    f_int(1000) :   9.240s
 f_class1(1000) :  13.890s
 f_class2(1000) :  19.510s
 f_method(1000) :  13.850s
  f_macro(1000) :   9.320s
   f<int>(1000) :   9.240s
     f<c>(1000) :  19.490s
   f_get1(1000) :  13.850s
   f_get2(1000) :  32.660s

real 2m21.092s
user 2m20.928s
sys 0m0.133s

Lessons learned:
* regular accessors (getters & setters) didn't help
* very minor things can confuse the optimizer (class1 vs class2)

I'd be curious to know how other CPUs/compilers do. Program text follows.

#include <iostream>
#include <ctime>
#include <iomanip>

using namespace std;

double time() { return double(clock()) / CLOCKS_PER_SEC; }

#define TIME_INC(e, res) { \
    double t1 = time(); \
    (res) += (e); \
    double t2 = time(); \
    cout << setw(15) << #e << " : " \
         << setw(7) << fixed << setprecision(3) \
         << t2 - t1 << "s" << endl; \
}

class c {
public:
    int x;
    c() {}
    c(int i) : x(i) {}
    operator const int& () const { return x; }
    operator int& () { return x; }
    const int& i() const { return x; }
    int& i() { return x; }
    int get() const { return x; }
    void set(int i) { x = i; }
};

#define LOOP1(i, n) for((i) = -(n); (i) <= (n); ++(i))
#define LOOP(i, j, k, n, a) \
    a = 0; \
    LOOP1(i, n) LOOP1(j, n) LOOP1(k, n) \
        a += k + j; return a

int f_int(int n) {
    int i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

int f_class1(c n) {
    c i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

// the return type is different from the above

c f_class2(c n) {
    c i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

int f_method(c n) {
    c i, j, k, acc;
    LOOP(i.i(), j.i(), k.i(), n.i(), acc.i());
}

// very similar, but 1.5x faster!

#define I(e) (e).x
int f_macro(c n) {
    c i, j, k, acc;
    LOOP(I(i), I(j), I(k), I(n), I(acc));
}

template<class T>
T f(T n) {
    T i, j, k, acc;
    LOOP(i, j, k, n, acc);
}

int f_get1(c n) {
    c i, j, k, acc = 0;
    for(i.set(-n.get()); i.get() <= n.get(); i.set(i.get() + 1))
        for(j.set(-n.get()); j.get() <= n.get(); j.set(j.get() + 1))
            for(k.set(-n.get()); k.get() <= n.get(); k.set(k.get() + 1))
                acc.set(acc.get() + k.get() + j.get());
    return acc;
}

// the return type makes a big difference:

c f_get2(c n) {
    c i, j, k, acc = 0;
    for(i.set(-n.get()); i.get() <= n.get(); i.set(i.get() + 1))
        for(j.set(-n.get()); j.get() <= n.get(); j.set(j.get() + 1))
            for(k.set(-n.get()); k.get() <= n.get(); k.set(k.get() + 1))
                acc.set(acc.get() + k.get() + j.get());
    return acc;
}

#define TIME(e) TIME_INC(e(1000), dummy)

int main() {
    int dummy = 0;
    TIME(f_int);
    TIME(f_class1);
    TIME(f_class2);
    TIME(f_method);
    TIME(f_macro);
    TIME(f<int>);
    TIME(f<c>);
    TIME(f_get1);
    TIME(f_get2);
    return dummy;
}
