Is there a systematic performance comparison of std::string and C-style strings?

Y

yu_kuo

Is there any comparison data on the performance difference between
std::string and C-style strings? Or is there source code that could be
used to measure it on different compilers and platforms in a
systematic way?
 
G

Guest

Is there any comparison data on the performance difference between
std::string and C-style strings? Or is there source code that could be
used to measure it on different compilers and platforms in a
systematic way?

Most certainly there is; Google is your friend. I believe that if you include
the word "rope" in the search you'll find some. Remember, though, that all
the benchmarks in the world don't mean shit if they don't measure
what you need, so instead of looking for other people's benchmarks,
run your own with the code you need to run.
 
Y

yu_kuo

Most certainly there is; Google is your friend. I believe that if you include
the word "rope" in the search you'll find some. Remember, though, that all
the benchmarks in the world don't mean shit if they don't measure
what you need, so instead of looking for other people's benchmarks,
run your own with the code you need to run.

Thanks for your reply and suggestion. Actually I myself am pretty
convinced that std::string is preferable, but that view is not common among
my colleagues. That's why I'm searching for hard evidence to convince
people. I did google for some time, but didn't find anything I can
use directly. Anyway, I could write some code to compare the functionality
we are interested in, just as you have suggested.

Regards,
Kevin
 
E

Erik Wikström

Thanks for your reply and suggestion. Actually I myself am pretty
convinced that std::string is preferable, but that view is not common among
my colleagues. That's why I'm searching for hard evidence to convince
people. I did google for some time, but didn't find anything I can
use directly. Anyway, I could write some code to compare the functionality
we are interested in, just as you have suggested.

I might have misunderstood your original question a bit. I thought that
you wanted to know which was the best performer for a certain kind of
use (such as really large numbers of strings or really large strings), in
which case there might sometimes be some benefit to using C-strings.

But if you mean usage of std::string vs. C-strings in general, then I'm
very hard pressed to come up with any argument in favour of C-strings,
but it's quite easy to find arguments for std::string (ease of use, no
risk of overflows, no need to allocate memory manually, etc.). In fact,
one very good argument against using C-strings is that most buffer
overflow attacks are caused by improper usage of C-strings, and if
std::string had been used instead the code would have been much simpler
and safer. As for speed, I'd say that std::string is fast enough for
most uses, and if your application is an exception you'd probably know
it from profiling and benchmarks you've already done.
 
J

Jim Langston

Is there any comparison data on the performance difference between
std::string and C-style strings? Or is there source code that could be
used to measure it on different compilers and platforms in a
systematic way?

In my own testing the overhead of std::string vs. C-style strings was
measured in microseconds, i.e. negligible.
 
J

Jim Langston

Jim Langston said:
In my own testing the overhead of std::string vs. C-style strings was
measured in microseconds, i.e. negligible.

Wait, not micro, the one that is smaller than nano. Let's see, milli, micro,
nano, ... umm... dang.
 
J

Jim Langston

Victor Bazarov said:
Pico? Atto? Femto?

Pico, that's it. It took about 4 picoseconds longer to allocate a
std::string than to use a C-style array in my testing. Negligible for any
application.
 
J

James Kanze

In my own testing the overhead of std::string vs. C-style strings was
measured in microseconds, i.e. negligible.

My own testing found several orders of magnitude. Developing
something using C style strings might take a week, where with
std::string, it would be a couple of hours.
 
J

James Kanze

Pico, that's it. It took about 4 picoseconds longer to
allocate a std::string than to use a C-style array in my
testing. Negligible for any application.

Picosecond differences are probably less than the resolution of
your measurement system; it would be more accurate to say that
you found no measurable difference. But that still doesn't tell
us anything, because we don't know what you were measuring.
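
To make the resolution point concrete, here is a minimal sketch (not the test discussed in this thread) that reports the nominal tick of std::chrono::steady_clock and times an operation by averaging over many calls. The helper names clock_tick_picoseconds and average_nanoseconds are made up for illustration, and the usable resolution of a real clock may well be coarser than its nominal tick.

```cpp
#include <chrono>
#include <cstddef>

// Nominal tick of steady_clock in picoseconds (real resolution may be coarser).
inline double clock_tick_picoseconds() {
    using period = std::chrono::steady_clock::period;
    return 1e12 * static_cast<double>(period::num) /
           static_cast<double>(period::den);
}

// Runs op() `iterations` times and returns the average cost per call
// in nanoseconds. A single call is usually far below the clock tick,
// so only the total over many calls is actually measured.
template <typename Op>
double average_nanoseconds(Op op, std::size_t iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        op();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iterations);
}
```

If the averaged difference between two variants comes out smaller than the tick divided by the iteration count, "no measurable difference" is the honest reading. An optimizer may also remove work whose result is never used, so the operation should feed its result into something observable.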

Note too that for any given activity, the implementation of
std::string can make a significant difference. For some things,
the implementation in g++ is significantly faster than that in
VC++, for others, the reverse is true. (G++ uses reference
counting; VC++ deep copy with the small string optimization. If
you don't copy much, and most of your strings are short, VC++
will be faster; if you copy long strings a lot, g++.)
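
One way to check what a given implementation does with short strings is to test whether the character buffer lives inside the string object itself. The sketch below assumes nothing beyond the standard library; the function name stored_inline is made up here, and the result is a property of whichever library you compile against. (The g++ / VC++ split described above reflects the implementations of the time; as far as I know, libstdc++ moved from reference counting to a small-string layout with the new ABI introduced in GCC 5.)

```cpp
#include <cstddef>
#include <functional>
#include <string>

// Returns true if the character buffer of `s` lies inside the std::string
// object itself, which is what the small string optimization does.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less / std::greater_equal give a well-defined order
    // even for pointers into unrelated objects.
    return std::greater_equal<const char*>()(data, obj_begin) &&
           std::less<const char*>()(data, obj_end);
}
```

Calling it on strings of increasing length shows where a particular library stops storing characters inline; a string far larger than sizeof(std::string) can never be inline.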
 
Y

yu_kuo

I might have misunderstood your original question a bit. I thought that
you wanted to know which was the best performer for a certain kind of
use (such as really large numbers of strings or really large strings), in
which case there might sometimes be some benefit to using C-strings.

But if you mean usage of std::string vs. C-strings in general, then I'm
very hard pressed to come up with any argument in favour of C-strings,
but it's quite easy to find arguments for std::string (ease of use, no
risk of overflows, no need to allocate memory manually, etc.). In fact,
one very good argument against using C-strings is that most buffer
overflow attacks are caused by improper usage of C-strings, and if
std::string had been used instead the code would have been much simpler
and safer. As for speed, I'd say that std::string is fast enough for
most uses, and if your application is an exception you'd probably know
it from profiling and benchmarks you've already done.

Actually we are dealing with telecommunication protocols, like SIP and
Diameter. Most content is now text based, so we are indeed dealing
with a large number of strings, and sometimes a string can be large
(up to megabytes in size). Most operations on strings are copy, find
and concatenation, with very little modification or replacement.

And std::string hasn't been widely used here yet, so I can't just change
it overnight. That's why I have to do some work outside our
application, and the analysis should cover as many operations as possible
on strings of different lengths. That's not simple work;
I'm just lazy and wonder whether somebody has already done it.
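
As a starting point for such a comparison, here is a minimal sketch of a harness for the three operations mentioned (copy, find, concatenation). All names in it (work_std_string, work_c_string, seconds_per_call) are hypothetical, and the C variant allocates its buffer on the heap, which is only one of several ways C-style code might be written.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>

// Copy + find + concatenate with std::string. The return value depends on
// the results so the compiler cannot discard the work.
inline std::size_t work_std_string(const std::string& text,
                                   const std::string& needle) {
    std::string copy = text;                    // copy
    const std::size_t pos = copy.find(needle);  // find
    copy += needle;                             // concatenate
    return copy.size() + (pos == std::string::npos ? 0 : pos);
}

// The same three operations on NUL-terminated buffers.
inline std::size_t work_c_string(const char* text, const char* needle) {
    const std::size_t text_len = std::strlen(text);
    const std::size_t needle_len = std::strlen(needle);
    // Room for both strings plus the terminating NUL.
    std::unique_ptr<char[]> buf(new char[text_len + needle_len + 1]);
    std::memcpy(buf.get(), text, text_len + 1);           // copy, incl. NUL
    const char* hit = std::strstr(buf.get(), needle);     // find
    const std::size_t pos =
        hit ? static_cast<std::size_t>(hit - buf.get()) : 0;
    std::memcpy(buf.get() + text_len, needle, needle_len + 1);  // concatenate
    return std::strlen(buf.get()) + pos;
}

// Average seconds per call of f() over `runs` calls; results go into `sink`.
template <typename F>
double seconds_per_call(F f, std::size_t runs, std::size_t& sink) {
    const auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < runs; ++i) {
        sink += f();
    }
    const std::chrono::duration<double> dt =
        std::chrono::steady_clock::now() - t0;
    return dt.count() / static_cast<double>(runs);
}
```

Running both variants through seconds_per_call over a range of string lengths, on each compiler and platform of interest, gives numbers that apply to your own workload. Comparing the two return values first confirms that both variants did the same work.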

Regards,
Kevin
 
T

tragomaskhalos

Note too that for any given activity, the implementation of
std::string can make a significant difference. For some things,
the implementation in g++ is significantly faster than that in
VC++, for others, the reverse is true. (G++ uses reference
counting; VC++ deep copy with the small string optimization. If
you don't copy much, and most of your strings are short, VC++
will be faster; if you copy long strings a lot, g++.)

I've also noticed a lot of code out there where people
do this:
void foo(std::string s) { ... }

where they could be doing this instead:
void foo(const std::string& s) { ... }

Presumably this is a Java / C# influence, but applied
systematically across a codebase with a "copying
std::string" library, this is going to involve a lot
of extra copying for zero benefit, which I suspect
cannot be optimised away in the general case. I
wonder how many "performance problems" reported
with std::string could be eliminated by correcting
this usage.

I think it's "More Exceptional C++" by Herb Sutter
that has an appendix comparing different types of
string implementation strategy, including problems
with some implementations of reference counting.
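
The difference between the two signatures can be observed directly. In this small sketch (function names invented for the example), the by-value parameter is a separate std::string object constructed from the caller's string, while the const reference refers to the caller's object itself.

```cpp
#include <string>

// By value: the parameter is a distinct std::string object,
// copy-constructed from the caller's lvalue.
inline bool by_value_is_same_object(std::string s,
                                    const std::string* original) {
    return &s == original;
}

// By const reference: no new object is created.
inline bool by_ref_is_same_object(const std::string& s,
                                  const std::string* original) {
    return &s == original;
}
```

With a deep-copying implementation, the by-value call allocates and copies the whole buffer for any string too long for the small string optimization. A reference-counted implementation makes that copy cheaper, but it still constructs and destroys an object on every call.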
 
F

Frank Birbacher

Hi!

Actually we are dealing with telecommunication protocols, like SIP and
Diameter. Most content is now text based, so we are indeed dealing
with a large number of strings, and sometimes a string can be large
(up to megabytes in size). Most operations on strings are copy, find
and concatenation, with very little modification or replacement.

Try using std::ostringstream for concatenation of multiple strings,
because std::string::operator+ will not allocate more memory than
is necessary to hold just the two operands. That is, compare:
s1 + s2 + s3 + " WHERE " + s4 + s5
with:
stream << s1 << s2 << s3 << " WHERE " << s4 << s5;
which may be faster. (AFAIK, Java is doing such a conversion from
operator + to streams (StringBuffer in Java) automatically when compiling.)
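
The two forms above can be written out as complete functions so they can be timed against each other. This is only a sketch with invented function names; it shows that both produce the same string and makes no claim about which is faster on a given library.

```cpp
#include <sstream>
#include <string>

// Chained operator+: each + yields a temporary holding the result so far.
inline std::string join_with_plus(const std::string& s1,
                                  const std::string& s2,
                                  const std::string& s3) {
    return s1 + s2 + " WHERE " + s3;
}

// The same concatenation through the stream's internal buffer.
inline std::string join_with_stream(const std::string& s1,
                                    const std::string& s2,
                                    const std::string& s3) {
    std::ostringstream stream;
    stream << s1 << s2 << " WHERE " << s3;
    return stream.str();
}
```

Constructing the stream has a cost of its own, so for a handful of short operands the operator+ form may well win. Only a measurement on the target platform can settle it.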

Frank
 
F

Frank Birbacher

Hi!

James said:
My own testing found several orders of magnitude. Developing
something using C style strings might take a week, where with
std::string, it would be a couple of hours.

Good point! I like that. :D

Frank
 
G

Glyn Davies

Frank said:
Hi!



Good point! I like that. :D

Frank

A better suggestion is to develop with std::string initially, then
optimise where necessary. (Rational Quantify is a great tool!)

On one project, by switching to stack-based C strings in some inner
loops, I managed to knock 50% off the application startup time.

Another thing to do is reserve() at least a good approximation of the
final string length. The biggest speed difference between C strings and
std::string is down to memory allocation and freeing.
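
The effect of reserve() can be made visible by counting how often capacity() changes while pieces are appended. The function below is a hypothetical helper for illustration; a capacity change is used as a rough proxy for a reallocation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Appends all pieces to one string and counts how often capacity() changed.
// If `reserve_first` is set, the total length is computed up front and
// reserved before any appending happens.
inline std::size_t count_capacity_changes(
        const std::vector<std::string>& pieces, bool reserve_first) {
    std::string out;
    if (reserve_first) {
        std::size_t total = 0;
        for (const std::string& p : pieces) {
            total += p.size();
        }
        out.reserve(total);
    }
    std::size_t changes = 0;
    std::size_t last = out.capacity();
    for (const std::string& p : pieces) {
        out += p;
        if (out.capacity() != last) {
            ++changes;
            last = out.capacity();
        }
    }
    return changes;
}
```

With the length reserved up front the count stays at zero. Without it, the number of growth steps depends on the library's growth policy.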

Just my 2p

Glyn
 
J

James Kanze

Glyn Davies wrote:
A better suggestion is to develop with std::string initially, then
optimise where necessary. (Rational Quantify is a great tool!)

I'm tempted to say: that has nothing to do with std::string. In
general, write clean, understandable code, with rigorous
encapsulation. Then, if it's not fast enough, use the profiler
to see where the problem is, and correct only that.

Note that the most important single aspect for performance
critical code is encapsulation. Because without good
encapsulation, trying to change anything, once you've found the
problem, can be hell.
On one project by switching to stack based 'C' strings in some
inner loops I managed to knock 50% off the application
startup.

Possibly changing to a different std::string implementation
could have had a similar effect; a lot depends on what you are
doing. Or, if you have a fixed upper limit to the length you
need, create your own fixed length strings. (A lot of
applications, like those I currently work on, write their data
to a data base. If the field in the data base is varchar(20),
then a fixed length string of length 20 is more appropriate than
std::string. And in fact, our current implementation does use
FixedString... a template on the actual length.)
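
For readers who have not seen the idea, a fixed-capacity string templated on its length might look roughly like the sketch below. This is a hypothetical illustration, not the FixedString class mentioned in the post, whose interface is not shown in this thread.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical fixed-capacity string: at most N characters, stored inline,
// so it never allocates dynamically.
template <std::size_t N>
class FixedString {
public:
    FixedString() : size_(0) { data_[0] = '\0'; }

    explicit FixedString(const char* s) : size_(0) {
        data_[0] = '\0';
        append(s, std::strlen(s));
    }

    // Appends `len` characters; throws instead of overflowing the buffer.
    void append(const char* s, std::size_t len) {
        if (len > N - size_) {
            throw std::length_error("FixedString overflow");
        }
        std::memcpy(data_ + size_, s, len);
        size_ += len;
        data_[size_] = '\0';
    }

    std::size_t size() const { return size_; }
    static constexpr std::size_t capacity() { return N; }
    const char* c_str() const { return data_; }

private:
    char data_[N + 1];  // +1 for the terminating NUL
    std::size_t size_;
};
```

A FixedString<20> would then match a varchar(20) column: it holds up to 20 characters, and an attempt to exceed that is reported as an error rather than silently truncated or overflowed.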
Another thing to do is reserve() at least a good approximation of the
final string length. The biggest speed difference between C strings and
std::string is down to memory allocation and freeing.

Again, it depends. If most of your strings are short, and the
implementation you are using uses the small string optimization,
you may never have a dynamic allocation/free. Typically, I
suspect that you're right in a lot of cases. But I wouldn't
assume so until I'd actually profiled it.
 
G

Glyn Davies

James said:
I'm tempted to say: that has nothing to do with std::string. In
general, write clean, understandable code, with rigorous
encapsulation. Then, if it's not fast enough, use the profiler
to see where the problem is, and correct only that.

Probably - I guess I've strayed away from the string argument to more
general territory. My only excuse is that the app we are talking about,
and the app I was working on back then did a lot of string manipulation
(many megabytes of XML munging)

Note that the most important single aspect for performance
critical code is encapsulation. Because without good
encapsulation, trying to change anything, once you've found the
problem, can be hell.

I'd go with Keep It Simple Stupid, and having a good overall design.
Encapsulation can help, but as with all these things there is no
panacea.
Possibly changing to a different std::string implementation
could have had a similar effect; a lot depends on what you are
doing. Or, if you have a fixed upper limit to the length you
need, create your own fixed length strings. (A lot of
applications, like those I currently work on, write their data
to a data base. If the field in the data base is varchar(20),
then a fixed length string of length 20 is more appropriate than
std::string. And in fact, our current implementation does use
FixedString... a template on the actual length.)

Yep, agreed. A good plan.
Again, it depends. If most of your strings are short, and the
implementation you are using uses the small string optimization,
you may never have a dynamic allocation/free. Typically, I
suspect that you're right in a lot of cases. But I wouldn't
assume so until I'd actually profiled it.

What we found (only after profiling) was that the small string
functionality was just too small. We could have looked at other STL
implementations (I think we did briefly.) But this was all cross
platform anyway, so a simple cross platform solution was a winner.

Personally I always just use std::string, with judicious use of & and
const. I'll only go back to it if there is a performance issue that
needs addressing.

The reserve issue I mentioned above came in where there was a lot of
string concatenation going on. A quick check of lengths + a reserve
stopped a lot of churn, and improved performance no end.

Cheers,

Glyn
 
F

Frank Birbacher

Hi!

Glyn said:
The reserve issue I mentioned above came in where there was a lot of
string concatenation going on. A quick check of lengths + a reserve
stopped a lot of churn, and improved performance no end.

I'm using std::ostringstream in case of concatenation. It usually
performs better than std::string::operator+ when you can't do a resize
because the length is unknown. And you can also do formatting. I'm
actually using this to construct SQL statements with filled in values
like ints.
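
As an illustration of the formatting point, a stream can mix text and integers in a single expression. The table and column names below are made up for the example, as is the function name make_query.

```cpp
#include <sstream>
#include <string>

// Builds a statement with integer values formatted by the stream.
// Table and column names are invented for this example.
inline std::string make_query(int user_id, int limit) {
    std::ostringstream sql;
    sql << "SELECT name FROM users WHERE id = " << user_id
        << " LIMIT " << limit;
    return sql.str();
}
```

This is reasonable for integers. For text that comes from outside the program, bound parameters are the safer way to fill in values.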

Frank
 
J

Jorgen Grahn

I've also noticed a lot of code out there where people
do this:
void foo(std::string s) { ... }

where they could be doing this instead:
void foo(const std::string& s) { ... }

Presumably this is a Java / C# influence

Oh yes. I have battled one Java developer who did this on a regular
basis (in C++).

It's ironic, since those languages are all about passing references ...
but applied
systematically across a codebase with a "copying
std::string" library, this is going to involve a lot
of extra copying for zero benefit, which I suspect
cannot be optimised away in the general case. I
wonder how many "performance problems" reported
with std::string could be eliminated by correcting
this usage.

If they do it to std::string, they probably do it to all kinds of
objects, and strings become their smallest problem ...

/Jorgen
 
J

Jim Langston

Pico, that's it. It took about 4 picoseconds longer to
allocate a std::string than to use a C-style array in my
testing. Negligible for any application.

Picosecond differences are probably less than the resolution of
your measurement system; it would be more accurate to say that
you found no measurable difference. But that still doesn't tell
us anything, because we don't know what you were measuring.

Note too that for any given activity, the implementation of
std::string can make a significant difference. For some things,
the implementation in g++ is significantly faster than that in
VC++, for others, the reverse is true. (G++ uses reference
counting; VC++ deep copy with the small string optimization. If
you don't copy much, and most of your strings are short, VC++
will be faster; if you copy long strings a lot, g++.)

=========

Yes, I had to repeat the measurement a few million times to tell the difference
between a C-style string and std::string. And results can be thrown off
depending on how the compiler decides to optimize the code. But when I
averaged the results I got a difference of around 4 picoseconds between
std::string and a C-style string. And I found it negligible.

I did this to show someone who insisted on using C-style strings rather than
std::string because he thought std::string was slower. After I showed him the
code and results, he agreed that std::string was usable.
 
