Bo Persson
Frederick said:
Bo Persson posted:
A std::string implementation can use the small string optimization,
to avoid dynamic allocation for some strings. If the string is
shorter than a certain size, and the size is known at compile time,
std::string can take advantage of that.
I've never heard of that... how would it be implemented? Something
like:
#include <cstddef>
template<std::size_t i>
std::string::string(char const (&str)[i])
{
    /* Something Funky... ? */
}
Something Funky.
Actually a combination of inlining, compiler intrinsics, and an
aggressive optimizer. Visual C++ Express does this!
The exact code looks like this (inside basic_string):
__forceinline
basic_string(const value_type* _String,
             const allocator_type& _Allocator = allocator_type() )
    : _Parent(_Allocator)
{
    const size_type _StringSize = traits_type::length(_String);
    if (_MySmallStringCapacity < _StringSize)
    {
        _Construct(_String, _StringSize);
    }
    else
    {
        traits_type::copy(_MySmallString._Buffer, _String,
                          _StringSize);
        _SetSmallStringCapacity();
        _SetSize(_StringSize);
    }
}
Here traits_type::length contains a call to std::strlen, which is a
compiler intrinsic. If _String is a literal, this call is evaluated at
compile time.
If so, the condition in the if-statement is also a constant
expression, evaluating to false, so the else part is selected.
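To show the folding step in portable terms, here is a small sketch. It is not
the library's code: the recursive constexpr function literal_length is a
hypothetical stand-in for the strlen intrinsic, and kSmallCapacity = 15 is an
assumed value for the small buffer capacity. Visual C++ gets the same effect
from the intrinsic plus the optimizer rather than from constexpr.

```cpp
#include <cstddef>

// Hypothetical stand-in for the strlen intrinsic. Written recursively so
// that it is a valid constexpr function from C++11 on.
constexpr std::size_t literal_length(const char* s)
{
    return *s == '\0' ? 0 : 1 + literal_length(s + 1);
}

// Assumed small-buffer capacity; the real value is implementation specific.
constexpr std::size_t kSmallCapacity = 15;

// True when the allocating branch (_Construct above) would be taken.
constexpr bool needs_heap(const char* s)
{
    return kSmallCapacity < literal_length(s);
}

// Both the length and the branch condition are known at compile time.
static_assert(literal_length("abcd") == 4, "length is a constant");
static_assert(!needs_heap("abcd"), "short literal selects the else part");
static_assert(needs_heap("a literal longer than fifteen chars"),
              "long literal selects the allocating part");
```

With the condition known to be false for "abcd", the compiler can drop the
allocating branch entirely.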
The traits_type::copy contains a call to std::memcpy, which is also an
intrinsic if all parameters are constant. It is inlined as one or more
mov instructions.
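For readers who have not seen the small string optimization itself, a minimal
sketch follows. TinyString and all of its members are hypothetical names, the
capacity of 15 is an assumption, and the layout is much simpler than what any
real standard library uses. It only shows the two paths: short strings live in
a buffer inside the object, long ones go to the heap.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical TinyString: illustrates the idea only, not a real library layout.
class TinyString
{
public:
    static const std::size_t small_capacity = 15;  // assumed value

    TinyString(const char* s)
        : size_(std::strlen(s)), heap_(nullptr)
    {
        if (small_capacity < size_)
        {
            // Long string: allocate and copy, terminator included.
            heap_ = new char[size_ + 1];
            std::memcpy(heap_, s, size_ + 1);
        }
        else
        {
            // Short string: copy into the in-object buffer, no allocation.
            std::memcpy(small_, s, size_ + 1);
        }
    }

    TinyString(const TinyString& other)
        : size_(other.size_), heap_(nullptr)
    {
        if (other.heap_)
        {
            heap_ = new char[size_ + 1];
            std::memcpy(heap_, other.heap_, size_ + 1);
        }
        else
        {
            std::memcpy(small_, other.small_, size_ + 1);
        }
    }

    // Assignment is left out to keep the sketch short.
    TinyString& operator=(const TinyString&) = delete;

    ~TinyString() { delete[] heap_; }

    const char* c_str() const { return heap_ ? heap_ : small_; }
    std::size_t size() const { return size_; }
    bool is_small() const { return heap_ == nullptr; }

private:
    std::size_t size_;
    char* heap_;                      // null while the small buffer is in use
    char small_[small_capacity + 1];  // one extra byte for the terminator
};
```

Constructing a TinyString from "abcd" and then copying it never touches the
heap in this sketch, which is the property the real constructor exploits.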
Here is an example from a test program with this constructor, followed
by a copy construction to a second string:
std::string whatever = "abcd";
std::string whatever2 = whatever;
The compiler also takes advantage of the fact that register BL is
zero, and that EBP already contains the string length.
; 530 :
; 531 : std::string whatever = "abcd";
0080d a1 00 00 00 00 mov eax, DWORD PTR ??_C@_04EHKALCEN@abcd?$AA@
00 00 mov DWORD PTR _whatever$[esp+1792], eax
00 00 mov BYTE PTR _whatever$[esp+1819], bl
00 00 mov DWORD PTR _whatever$[esp+1820], ebp
00 00 mov BYTE PTR _whatever$[esp+1796], bl
; 532 :
; 533 : std::string whatever2 = whatever;
00 00 mov DWORD PTR _whatever2$[esp+1792], eax
00 00 mov BYTE PTR _whatever2$[esp+1819], bl
00 00 mov DWORD PTR _whatever2$[esp+1820], ebp
00 00 mov BYTE PTR _whatever2$[esp+1796], bl
This is of course a selected best case, but rather good
(an understatement).
I have used this example before, when arguing that well-tuned C++
library code not only defies the alleged template code bloat, but
can actually be both smaller and faster than portable C code. Not to
mention easier to use correctly than some combination of
strlen/malloc/free/strcpy/strcat.
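As a rough illustration of the ease-of-use point, here is the same
concatenation written both ways. The function names join_c and join_cpp are
made up for this example, and it says nothing about relative speed; it only
shows how much bookkeeping the C version leaves to the caller.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// C style: the caller computes sizes, allocates, checks, and must free.
char* join_c(const char* a, const char* b)
{
    const std::size_t len_a = std::strlen(a);
    const std::size_t len_b = std::strlen(b);
    char* result = static_cast<char*>(std::malloc(len_a + len_b + 1));
    if (result == nullptr)
        return nullptr;          // allocation failed, nothing to free
    std::strcpy(result, a);
    std::strcat(result, b);
    return result;               // caller owns this and must call std::free
}

// C++ style: std::string manages its own storage.
std::string join_cpp(const std::string& a, const std::string& b)
{
    return a + b;
}
```

Forgetting the + 1, the null check, or the free in the first version are all
easy mistakes; the second version has none of those to get wrong.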
Bo Persson