std::copy, copy_backward and fill implementations

Diego Martins

I want to get rid of memcpy, memmove and memset in my sources and
replace them with the corresponding algorithms.

But I do not want to compromise performance when, for example, using
arrays or vectors of standard types.

How can I be sure that my compiler (with optimizations turned on, of
course) will map these algorithms gracefully to the fast microprocessor
string opcodes (MOVS, STOS...)?

Any tips? Do you know of a document that describes the STL
implementations of the best-known compilers (VC++, gcc, mingw...)?

Diego Martins
HP
 
ldh

Diego said:
I want to get rid of memcpy, memmove and memset in my sources and
replace them with the corresponding algorithms.

But I do not want to compromise performance when, for example, using
arrays or vectors of standard types.

How can I be sure that my compiler (with optimizations turned on, of
course) will map these algorithms gracefully to the fast microprocessor
string opcodes (MOVS, STOS...)?

Any tips? Do you know of a document that describes the STL
implementations of the best-known compilers (VC++, gcc, mingw...)?

Diego Martins
HP

I think your best bet is just to make the change and trust the
compilers. The STL is used so heavily these days that the compiler
writers do the right thing. You can be pretty sure you won't notice a
performance difference at all. If you want to make sure, just look at
the headers that ship with the compiler. Since the STL is a template
library, the code has to be visible to you. For instance, in the gcc
implementation of copy, there are a large number of auxiliary functions
that dispatch the copy as appropriate for the type being copied. Here
is one of them:

template<typename _Tp>
inline _Tp*
__copy_trivial(const _Tp* __first, const _Tp* __last, _Tp* __result)
{
    std::memmove(__result, __first, sizeof(_Tp) * (__last - __first));
    return __result + (__last - __first);
}

So you can see that under the right conditions, the copy() algorithm
will turn into an inline call to memmove.
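
To make the dispatch idea concrete, here is a minimal sketch of how such a selection could be written with standard type traits. It is not gcc's actual code: the names copy_sketch and copy_impl are made up for illustration, and it assumes C++11 (for std::is_trivially_copyable) and plain pointers rather than arbitrary iterators.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

// Hypothetical helper: bulk path, chosen when T is trivially copyable.
template <typename T>
T* copy_impl(const T* first, const T* last, T* result, std::true_type)
{
    const std::size_t n = static_cast<std::size_t>(last - first);
    if (n != 0)
        std::memmove(result, first, n * sizeof(T));
    return result + n;
}

// Hypothetical helper: element-wise path for every other type.
template <typename T>
T* copy_impl(const T* first, const T* last, T* result, std::false_type)
{
    for (; first != last; ++first, ++result)
        *result = *first;   // uses T's copy assignment operator
    return result;
}

// Illustrative front end: the trait picks one of the overloads above
// at compile time, so there is no run-time branch.
template <typename T>
T* copy_sketch(const T* first, const T* last, T* result)
{
    assert(first <= last);  // the range must be valid
    return copy_impl(first, last, result,
                     typename std::is_trivially_copyable<T>::type());
}
```

With an int array, copy_sketch(a, a + 4, b) goes through the memmove overload; with an array of std::string it falls back to the element-wise loop.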

If you are just using memcpy on POD types, the only difference after
the switch will be the use of memmove instead of memcpy. That is
necessary because copy() allows the input and output ranges to overlap,
as long as the start of the output range does not fall inside the
input range.
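
As a rough guide to the replacement itself, the sketch below shows one way the three C functions could map onto the algorithms for a plain int buffer. The function names are hypothetical. The overlap rule it relies on is the standard one: std::copy needs the start of the destination to lie outside the source range, and std::copy_backward covers the opposite direction.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Stands in for memcpy(dst, src, n * sizeof(int)) on separate buffers.
void copy_disjoint(const int* src, std::size_t n, int* dst)
{
    std::copy(src, src + n, dst);
}

// Stands in for memmove(buf, buf + k, (n - k) * sizeof(int)).
// The destination starts before the source, so std::copy is allowed.
void shift_left(int* buf, std::size_t n, std::size_t k)
{
    assert(k <= n);
    if (k == 0) return;     // nothing to move, and avoids dst == src
    std::copy(buf + k, buf + n, buf);
}

// Stands in for memmove(buf + k, buf, (n - k) * sizeof(int)).
// The destination starts inside the source, so copy from the back.
void shift_right(int* buf, std::size_t n, std::size_t k)
{
    assert(k <= n);
    if (k == 0) return;
    std::copy_backward(buf, buf + (n - k), buf + n);
}

// Stands in for memset(buf, 0, n * sizeof(int)).
void clear_values(int* buf, std::size_t n)
{
    std::fill(buf, buf + n, 0);
}
```

Unlike memset, std::fill works on element values rather than bytes, so it can also fill with a value such as 7 or -1.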

-Lewis
 
Ole Nielsby

ldh said:
I think your best bet is just to make the change and trust the
compilers. The STL is used so heavily these days that the compiler
writers do the right thing.

And, compilers are used so heavily (and have been for a while)
that chip makers do the right thing - the processors are optimized
for the simple instructions compilers tend to produce. The string
opcodes aren't as fast as you might think.

I learnt this from assembler programming: I used the string
instructions a lot until I found out my code was faster without
them.

(I wrote a binary search for a sorted pointer array, and supplemented
it with a simple linear search using SCASD, which I figured would
be faster for small arrays - but I was wrong. The binary search
was faster even at two elements, at one element it was a close
tie. It seems beasts like SCASD are more or less considered
deprecated by the chip designers, especially at Intel. So don't
worry if your C++ compiler doesn't generate them.)
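
For anyone who wants to repeat that kind of comparison in C++ rather than assembler, the two searches could look roughly like the sketch below. The function names are made up, the table is assumed to hold pointers into one array sorted by address, and the code makes no claim about which version is faster. That has to be timed on the target hardware.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>

// Linear scan over the table, the job SCASD would do in assembler.
// Returns the index of key, or -1 if it is not present.
std::ptrdiff_t find_linear(const int* const* table, std::size_t n,
                           const int* key)
{
    const int* const* it = std::find(table, table + n, key);
    return it == table + n ? -1 : it - table;
}

// Binary search over the same table, which must be sorted by address.
// Returns the index of key, or -1 if it is not present.
std::ptrdiff_t find_binary(const int* const* table, std::size_t n,
                           const int* key)
{
    const std::less<const int*> before;
    assert(std::is_sorted(table, table + n, before));
    const int* const* it = std::lower_bound(table, table + n, key, before);
    return (it != table + n && *it == key) ? it - table : -1;
}
```

Both functions return the same index for the same input, so either can be dropped into a timing loop without changing the surrounding code.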
 
