benchmarks of char* ops vs. std::string

roberts.noah

Are there any decent benchmarks of the difference between using C
strings and std::string out there? Google isn't being friendly about
it. Obviously this would depend on the implementation, but
I don't care. I would be happy to find ANY comparison at this point if
it was valid and semi-scientifically done.
 
mlimber

> Are there any decent benchmarks of the difference between using C
> strings and std::string out there? Google isn't being friendly about
> it. Obviously this would depend on the implementation, but
> I don't care. I would be happy to find ANY comparison at this point if
> it was valid and semi-scientifically done.

I don't have an answer for you, but I can tell you that most in the C++
community think the benefits of std::string are well worth the cost
(which, as you say, varies between implementations but which is,
pragmatically speaking, usually "good enough"). In fact, the FAQ for
this group maintains that arrays are evil. Compare:

http://www.parashift.com/c++-faq-lite/exceptions.html#faq-17.5

Cheers! --M
 
roberts.noah

mlimber said:
> I don't have an answer for you, but I can tell you that most in the C++
> community think the benefits of std::string are well worth the cost
> (which, as you say, varies between implementations but which is,
> pragmatically speaking, usually "good enough"). In fact, the FAQ for
> this group maintains that arrays are evil. Compare:
>
> http://www.parashift.com/c++-faq-lite/exceptions.html#faq-17.5

_I_ know that, but as before I am trying to convince someone who is a
*really* die-hard char* fan. Yes, I spent 8 hours yesterday chasing
down buffer overflows caused by char*, but every time I use std::string
I get hassled about performance issues. I've shown repeatedly that
std::string isn't showing up in the profiles I do, but it isn't having
the effect I would like. I need a benchmark to be at all convincing.
 
peter steiner

> Are there any decent benchmarks of the difference between using C
> strings and std::string out there? Google isn't being friendly about
> it. Obviously this would depend on the implementation, but
> I don't care. I would be happy to find ANY comparison at this point if
> it was valid and semi-scientifically done.

i don't know of any concrete performance benchmarks, which probably
wouldn't make much sense anyway. you can be sure that any decent STL
implementation provides string and allocation operations that are
optimized well enough to be hard to match in a custom
implementation.

the c++ performance report gives hard figures of the (practically
non-existent) performance hit for using non-virtual classes. (see
www.open-std.org/jtc1/sc22/wg21/docs/papers/2002/n1396.pdf)

if you have very special requirements for your string operations that
differ much from the common use case you might investigate implementing
a custom allocator or even string class, but that should not be
necessary in general.

-- peter
 
Victor Bazarov

[..]
> _I_ know that, but as before I am trying to convince someone that is a
> *really* die hard char* fan. [...]
> I need a benchmark to be at all convincing.

No, you don't. Leave it be. Were you paid for the time it took you to
chase that overflow bug? Document that and move on. When your boss asks
you to justify the expense and to suggest improvements, present your
std::string argument then, and to him/her, not to char* fans. It is not
worth our time to perpetuate arguments like this. For any benchmark
proving your point there will be another disproving it. And if there is
no easily available one, any die-hard char* fan will feel challenged to
create such a benchmark. That would only result in them wasting their
time on that, after you've already wasted yours.

V
 
Matteo

> ... I've shown a lot that
> std::string isn't showing up in the profiles

That should really be enough for him if he's actually concerned about
performance, rather than some odd macho thing. Perhaps char*s are
faster in some circumstances (say, if your application computes strlen
and does nothing else), but since std::strings are much easier to use
safely (w.r.t. the buffer overflows) and have negligible impact on
your code's performance, they are easily justifiable.
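
To make the safety point concrete, here is a minimal sketch of the same job done both ways. The function names joinPathC and joinPathString are made up for illustration; nothing here comes from the original poster's code base. The C-style version is only safe because it uses a bounded snprintf and reports truncation, which is exactly the bookkeeping that tends to get forgotten.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical C-style helper: writes "dir/file" into a caller-supplied
// buffer. Returns false when the result did not fit and was truncated.
bool joinPathC(char* out, std::size_t outSize, const char* dir, const char* file) {
    // snprintf never writes more than outSize bytes and returns the
    // length the full result would have needed (excluding the '\0').
    const int needed = std::snprintf(out, outSize, "%s/%s", dir, file);
    return needed >= 0 && static_cast<std::size_t>(needed) < outSize;
}

// Hypothetical std::string version: the string grows as needed, so there
// is no buffer size to get wrong.
std::string joinPathString(const std::string& dir, const std::string& file) {
    return dir + "/" + file;
}
```

With an 8-byte buffer, joinPathC(buf, sizeof buf, "/usr/local", "bin") reports failure instead of overflowing; the std::string version has no such failure mode to check.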

I would say that in deciding which string to use,
the burden of proof lies on him. Ask him to justify the increased
programmer time and reduced code security for a 0.01% or less speedup
(or whatever the profiler's minimum resolution is). If this is a
decision that needs to be made for some project, I would hope that the
technical lead would find the above argument compelling.

-matt
 
Roland Pibinger

> I am trying to convince someone who is a
> *really* die-hard char* fan. Yes, I spent 8 hours yesterday chasing
> down buffer overflows caused by char*, but every time I use std::string
> I get hassled about performance issues. I've shown repeatedly that
> std::string isn't showing up in the profiles I do, but it isn't having
> the effect I would like. I need a benchmark to be at all convincing.

The performance characteristics of std::string are not specified
('std::string' means the std::basic_string template).
Thus, e.g., the current Microsoft/Dinkumware SSO/LSP implementation
(Small-String-Optimization, Long-String-Pessimization) has entirely
different performance characteristics from a reference-counted COW
(copy-on-write) implementation. (Sorry for the TLAs.)
Performance-aware C++ programmers must adapt their programming style
to their std::string implementation.

Best wishes,
Roland Pibinger
 
Calum Grant

> Are there any decent benchmarks of the difference between using C
> strings and std::string out there? Google isn't being friendly about
> it. Obviously this would depend on the implementation, but
> I don't care. I would be happy to find ANY comparison at this point if
> it was valid and semi-scientifically done.

The time you're really going to notice the difference is when you
allocate a string, e.g.

char str1[1000]; // Why not 1002?
vs
std::string str1;

For the rest, e.g. iterating over a string or passing a string around by
reference, there will be no difference. Functions like strlen(), which
has to scan for the terminating '\0', will be slower than
std::string::size(), which typically just reads a stored length.
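
For anyone who wants to measure the strlen() vs. size() claim on their own implementation, a rough timing harness might look like the sketch below. The helper names (time_ns, total_strlen, total_size) are invented for this example, and it assumes a C++11 compiler. Treat any numbers it produces with suspicion: an optimizer is free to hoist or fold either call out of the loop, so this is a starting point, not a scientific benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: runs fn() `iterations` times and returns the
// elapsed wall-clock time in nanoseconds.
template <typename Fn>
long long time_ns(Fn fn, int iterations) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        fn();
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// Sums strlen(s) over `iterations` calls. Each call walks the buffer
// until it finds the terminating '\0' (unless the optimizer hoists it).
std::size_t total_strlen(const char* s, int iterations) {
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        total += std::strlen(s);
    }
    return total;
}

// Sums s.size() over `iterations` calls. No scan of the characters is needed.
std::size_t total_size(const std::string& s, int iterations) {
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        total += s.size();
    }
    return total;
}
```

A caller would build one long std::string, then wrap total_strlen(s.c_str(), n) and total_size(s, n) in lambdas passed to time_ns and compare the two results, ideally with the returned totals printed so the work cannot be discarded.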

You should be aware of std::string::reserve() if you want to improve the
performance of strings. Avoid passing strings by value. e.g.

std::string toUpper(std::string in);

is bad because it has 2 unnecessary copies. Prefer

void toUpper(const std::string &in, std::string &out);
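
A possible body for that second signature, as a sketch only: it assumes `in` and `out` are different objects (clearing `out` would otherwise destroy the input) and uses reserve() so that the copy needs at most one allocation.

```cpp
#include <cctype>
#include <string>

// Writes an upper-cased copy of `in` into `out`.
// Assumption: `in` and `out` do not refer to the same string.
void toUpper(const std::string& in, std::string& out) {
    out.clear();             // drop old contents; capacity is usually kept
    out.reserve(in.size());  // grow once up front instead of during the loop
    for (unsigned char c : in) {
        // toupper() expects a value representable as unsigned char
        out.push_back(static_cast<char>(std::toupper(c)));
    }
}
```

Calling it repeatedly with the same `out` lets later calls reuse the buffer from earlier ones, which is the point of the out-parameter form.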

Finally, what is your application? Generally, any GUI, database or
network-bound application isn't going to benefit one iota from premature
optimization. This is why huge websites work fine in Perl.

Algorithms and architectures are far more important in terms of
performance. No application is perfect, but concentrating your efforts
where they will make the least impact is a waste of everyone's time. As a
programmer I want to think more about the problem, and less about the
language.

Calum
 
Roland Pibinger

Google is not helping me on this one.

It's just the opposite of SSO ;-)
In VC++ > 7.0 "long" strings are copied as a 'deep copy' (new memory is
dynamically allocated and the string contents are copied). A short string
contains <= 15 char or <= 7(!) wchar_t. For "long" strings, of course,
a deep copy is much slower than the assignment of a pointer and the
increment of a counter in a COW implementation. Again, using
std::string in a performance-critical implementation means programming
against an implementation, not an interface. E.g.,

instead of a function like

std::string getCurrentDirectory(); // ok for COW, not for SSO/LSP

you should use

std::string& getCurrentDirectory (std::string& out);
// current dir is copied into 'out' and returned by reference
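
A minimal sketch of the out-parameter form follows. queryCurrentDirectory() is a made-up stand-in for whatever platform call actually supplies the path, and it returns a fixed string purely so the example is self-contained.

```cpp
#include <string>

// Hypothetical stand-in for a platform call; returns a fixed path
// for illustration only.
const char* queryCurrentDirectory() {
    return "/home/user/projects/string-benchmark";
}

// Out-parameter form: copies the path into the caller's string and
// returns that same string by reference.
std::string& getCurrentDirectory(std::string& out) {
    out.assign(queryCurrentDirectory());  // can reuse out's existing buffer
    return out;
}
```

A caller that keeps one std::string alive across many calls gives assign() the chance to reuse the existing buffer rather than allocate a fresh one each time.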

Best wishes,
Roland Pibinger
 
Victor Bazarov

Thomas said:
> LSP is the literal opposite to SSO.

Huh... I can see how "Long" is opposite to "Short" and how
"Pessimization" is opposite to "Optimization", but to be literal,
shouldn't SSO result in something like Long-Point-Pessimization
(or whatever is the literal opposite of "String")? :)
 
Thomas J. Gritzan

Victor said:
> Huh... I can see how "Long" is opposite to "Short" and how
> "Pessimization" is opposite to "Optimization", but to be literal,
> shouldn't SSO result in something like Long-Point-Pessimization
> (or whatever is the literal opposite of "String")? :)

Maybe "array", so: Long-Array-Pessimization. :)

Thomas
 
meagar

You do not need any kind of benchmark here. You are working in C++?
Do it the C++ way; you are arguing in favor of the default. If your
char* fanatic wants to do it the old way, the onus is on him to prove it is
better, not on you to prove it is worse.

The std::string class is probably among the most fundamentally obvious
and beneficial improvements the STL offers over C. Preferring char*
isn't "reinventing the wheel", it is ignoring the wheel entirely in
favor of running everywhere. Uphill.
 
Michiel.Salters

> Are there any decent benchmarks of the difference between using C
> strings and std::string out there? Google isn't being friendly about
> it. Obviously this would depend on the implementation, but
> I don't care. I would be happy to find ANY comparison at this point if
> it was valid and semi-scientifically done.

Depends a lot on the std::string implementation, but you probably want
to google for Herb Sutter's articles on this. Personally, I've seen an
order of magnitude (better for std::string). Still, my own string class
did even better in that case. But the real win was that the change from
std::string to my::string was a one-minute fix. Going from char* to
std::string took a complete rewrite. That's the real profit of
std::string: if you don't like the implementation, it's a lot easier to
replace. Replacing char* in an existing program is a bit tricky. It
probably involves changing the compiler ;)

HTH,
Michiel Salters
 
Michiel.Salters

Roland said:
> (Small-String-Optimization, Long-String-Pessimization)

Wrong. SSO can also be a LSO.

One reason is that SSO avoids trips to the memory allocator for small
strings. This reduces the load on the memory allocator. I'm not aware of
any allocator which performs worse without these many small allocations.
However, there are many that do better if you drop all those small
allocations. For instance, some allocators have issues with
fragmentation. The deallocation of small strings leaves holes that have
to be considered when allocating large strings. When these small holes
don't exist due to SSO, the large string allocation is faster.
Multithreaded environments are another: less contention means faster
allocators.
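
One way to check whether a given std::basic_string implementation really skips the allocator for short strings is to plug in a counting allocator. The sketch below assumes a C++11 library that accepts minimal allocators; CountingAllocator, CountedString, g_allocations and allocationsFor are names invented for this example. What it reports for short strings depends entirely on the implementation, so it is a probe, not a statement about any particular library.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Bumped on every allocate() call made through CountingAllocator.
static std::size_t g_allocations = 0;

// Hypothetical minimal allocator that forwards to std::allocator and
// counts how often memory is requested.
template <typename T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <typename U>
    CountingAllocator(const CountingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        ++g_allocations;
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) {
        std::allocator<T>().deallocate(p, n);
    }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) { return true; }
template <typename T, typename U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) { return false; }

using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Returns how many allocator calls constructing a string of `length`
// characters triggered.
std::size_t allocationsFor(std::size_t length) {
    const std::size_t before = g_allocations;
    CountedString s(length, 'x');
    return g_allocations - before;
}
```

Comparing allocationsFor(8) with allocationsFor(1000) shows whether the short string stayed inside the string object; on an implementation with SSO the first is expected to be lower.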

HTH,
Michiel Salters
 
