Converting strings to numbers - a question of speed

zexpe

I have an extremely cpu/data intensive piece of code that makes heavy
use of the following function:

#include <cstdlib>
#include <string>

void convertToDouble(const std::string& in, double& out)
{
    out = std::atof(in.c_str());
}

I would really like to get away from using any old C-style functions.
So, I modified the above function to make it follow the C++ convention:

#include <sstream>
#include <string>

void convertToDouble(const std::string& in, double& out)
{
    std::stringstream ss(in);
    ss >> out;
}

However, my test code, which previously took 30s to run, now takes 45s
(Linux, gcc 4.0.2 with the -O3 option). That's a heavy price to pay for
adopting the C++ convention. So, my question is... Is there an
*efficient* way to convert a string to a number (e.g. double) without
requiring the use of old C libraries?
 
John Harrison

zexpe said:
I have an extremely cpu/data intensive piece of code that makes heavy
use of the following function:

void convertToDouble(const std::string& in, double& out)
{
out = atof(in.c_str());
}

I would really like to get away from using any old C-style functions.
So, I modified the above function to make it follow the C++ convention:

void convertToDouble(const std::string& in, double& out)
{
std::stringstream ss(in);
ss >> out;
}

However, my test code, which previously took 30s to run, now takes 45s
(Linux, gcc 4.0.2 with the -O3 option). That's a heavy price to pay for
adopting the C++ convention. So, my question is... Is there an
*efficient* way to convert a string to a number (e.g. double) without
requiring the use of old C libraries?

C is legal C++ (mostly). If the C version works better, why not use it?
C libraries aren't old, they're part of modern C++.

john
 
Puppet_Sock

zexpe said:
I have an extremely cpu/data intensive piece of code that makes heavy
use of the following function:

void convertToDouble(const std::string& in, double& out)
{
out = atof(in.c_str());
}

I would really like to get away from using any old C-style functions.
So, I modified the above function to make it follow the C++ convention:

void convertToDouble(const std::string& in, double& out)
{
std::stringstream ss(in);
ss >> out;
}

However, my test code, which previously took 30s to run, now takes 45s
(Linux, gcc 4.0.2 with the -O3 option). That's a heavy price to pay for
adopting the C++ convention. So, my question is... Is there an
*efficient* way to convert a string to a number (e.g. double) without
requiring the use of old C libraries?

Hey, you could roll your own. <evil grin>

But seriously, what was the objection to using atof()? It's part of the
language for a reason. stringstream does plenty of stuff other than
just what atof does, so you get to pay for the overhead.

Also, you might be able to speed up some stuff about your stringstream
use, depending on exactly what your compiler is doing. For example,
what about making your variable ss static? You might check a few
examples of ways of doing that. Maybe the ctor call for stringstream is
where a lot of the extra time is going. But measure it before you
decide.
You also need to be more aware of initializing it as required on each
call instead of depending on the ctor to prepare it for you.
Socks
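
A minimal sketch of the static-stream idea described above, assuming a C++11 compiler (for thread_local). The name convertToDoubleReuse is made up for illustration, and whether this is actually faster on a given compiler is something only a measurement can tell:

```cpp
#include <sstream>
#include <string>

// Hypothetical variant of convertToDouble that reuses one stream object.
// Returns true if a double could be extracted from `in`.
bool convertToDoubleReuse(const std::string& in, double& out)
{
    // Constructed once per thread instead of once per call.
    static thread_local std::istringstream ss;

    ss.clear();   // reset eofbit/failbit left over from the previous call
    ss.str(in);   // replace the buffer contents and rewind to the start
    return static_cast<bool>(ss >> out);
}
```

The clear() and str() calls are the "initializing it as required on each call" part: without clear(), a stream that hit end-of-input or failed once would refuse every later extraction.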
 
deane_gavin

John said:
C is legal C++ (mostly). If the C version works better, why not use it?
C libraries aren't old, they're part of modern C++.

I may be wrong, but doesn't atof suffer from the same problem as atoi,
namely that it's impossible to distinguish correctly converting zero
from an error case? I believe strtod would be a better C library
solution.

Gavin Deane
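
To illustrate the point about telling a genuine zero from a failed conversion, here is a rough sketch built on strtod's end pointer and errno. The name convertToDoubleChecked and the choice to reject trailing characters are assumptions made for this example, not something from the original code; nullptr assumes C++11.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical checked variant: reports failure instead of silently
// producing 0. On failure `out` is left untouched.
bool convertToDoubleChecked(const std::string& in, double& out)
{
    const char* begin = in.c_str();
    char* end = nullptr;

    errno = 0;
    const double value = std::strtod(begin, &end);

    if (end == begin)    return false;  // nothing was converted
    if (*end != '\0')    return false;  // trailing characters after the number
    if (errno == ERANGE) return false;  // result out of range for double
    out = value;
    return true;
}
```

With this, "0" succeeds with a result of zero, while "abc" and the empty string are reported as errors, which atof cannot express.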
 
Greg

Puppet_Sock said:
Hey, you could roll your own. <evil grin>

But seriously, what was the objection to using atof()? It's part of the
language for a reason. stringstream does plenty of stuff other than
just what atof does, so you get to pay for the overhead.

atof() has been "deprecated" which means that it is on the way out of
the standard and should not be used in new code. It is also neither
thread-safe nor async canceable on most systems.

strtod() is the recommended replacement. Instead of calling atof like
this

#include <cstdlib>
char * nptr; // pointer to string of the number
...

double num = std::atof(nptr);

just replace it with a call to strtod():

double num = std::strtod(nptr, NULL);

Greg
 
zexpe

Puppet_Sock said:
Also, you might be able to speed up some stuff about your stringstream
use, depending on exactly what your compiler is doing. For example,
what about making your variable ss static? You might check a few
examples of ways of doing that. Maybe the ctor call for stringstream is
where a lot of the extra time is going. But measure it before you
decide.
You also need to be more aware of initializing it as required on each
call instead of depending on the ctor to prepare it for you.

Yep. If ss were static, you'd need to clear the stream each time. Can't
see it helping.

Thanks,
Ross
 
zexpe

Greg said:
just replace it with a call to strtod():

double num = std::strtod(nptr, NULL);

Thanks for the tip. I'll give it a go.

My main objection to the C way is having to use the old <cstdlib>
library, when I'm beginning to view C++ as a complete replacement for
C. I'd rather work with C++ objects, e.g. strings, and never have to
worry about pointers, char strings etc. forever more! However, I'm
beginning to see that C++ alternatives are not always as fast as the
original C versions!

Ross
 
Walter Bright

zexpe said:
So, my question is... Is there an
*efficient* way to convert a string to a number (e.g. double) without
requiring the use of old C libraries?

Doing string to double conversion is a non-trivial, tricky process. A lot of
effort has typically gone into making the C versions (atof, strtod, ecvt,
etc.) fast and accurate. All the C++ conversions I've seen eventually wound
up being shells around the C code.

The non-C features of C++ bring nothing to the table as far as implementing
a fast, accurate string to double conversion routine, so there's no reason
to reimplement the C versions. Even the D programming language relies on
shelling the C float conversion routines.

That said, if you or anyone else can figure out a significantly better way
to do it, I'm certainly interested.

-Walter Bright
www.digitalmars.com C, C++, D programming language compilers
 
wittempj

Well, it uses std::istringstream (specialised for reading) and the
function is inline. I haven't measured it myself, but these two
differences are supposed to improve performance. Still, the stringstream
approach will carry some overhead compared to a function like those
mentioned in the other posts.
 
John Harrison

zexpe said:
Thanks for the tip. I'll give it a go.

My main objection to the C way is having to use the old <cstdlib>
library, when I'm beginning to view C++ as a complete replacement for
C. I'd rather work with C++ objects, e.g. strings, and never have to
worry about pointers, char strings etc. forever more! However, I'm
beginning to see that C++ alternatives are not always as fast as the
original C versions!

Ross

It's very likely that the C++ library (strings, vectors etc.) uses the C
library internally. There's nothing wrong with you doing the same. Use
pointers but hide them in classes where they can't do any damage to the
rest of your program.

john
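
As a rough illustration of hiding the pointers inside a class, the sketch below wraps strtod in a small reader that hands out doubles from a std::string. The class name DoubleReader and its interface are invented for this example, and it assumes C++11 (nullptr, deleted copy operations).

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: reads whitespace-separated doubles from a string.
// The raw char pointers never leave the class.
class DoubleReader {
public:
    explicit DoubleReader(const std::string& text)
        : text_(text), pos_(text_.c_str()) {}

    // Copying would leave pos_ pointing into the other object's buffer.
    DoubleReader(const DoubleReader&) = delete;
    DoubleReader& operator=(const DoubleReader&) = delete;

    // Extracts the next number; returns false when nothing more converts.
    bool next(double& out)
    {
        char* end = nullptr;
        const double value = std::strtod(pos_, &end);
        if (end == pos_) return false;  // strtod consumed nothing
        pos_ = end;                     // continue after the parsed number
        out = value;
        return true;
    }

private:
    std::string text_;  // private copy, so pos_ stays valid
    const char* pos_;   // current read position inside text_
};
```

Callers only ever see std::string and double; the C function and its char pointers are an implementation detail.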
 
Pete Becker

Greg said:
atof() has been "deprecated" which means that it is on the way out of
the standard and should not be used in new code. It is also neither
thread-safe nor async canceable on most systems.

atof has not been deprecated by the standards committee. One compiler
vendor has decided to discourage the use of atof, among several others.
The merits of this decision are unclear.
 
Greg Comeau

zexpe said:
I have an extremely cpu/data intensive piece of code that makes heavy
use of the following function:

void convertToDouble(const std::string& in, double& out)
{
out = atof(in.c_str());
}

I would really like to get away from using any old C-style functions.
So, I modified the above function to make it follow the C++ convention:

void convertToDouble(const std::string& in, double& out)
{
std::stringstream ss(in);
ss >> out;
}

However, my test code, which previously took 30s to run, now takes 45s
(Linux, gcc 4.0.2 with the -O3 option). That's a heavy price to pay for
adopting the C++ convention. So, my question is... Is there an
*efficient* way to convert a string to a number (e.g. double) without
requiring the use of old C libraries?

It may be that you really do need to make this task faster,
but I for one would first be concerned about whether it is safe,
or even works at all. For instance, atof() is not the
preferred function. Anyway, have a look at some of the choices at

http://www.comeaucomputing.com/techtalk/#atoi

I have not done any timing comparisons, but some combination may
be able to meet safety and efficiency satisfactorily.
 
zexpe

wittempj said:
Well, it uses std::istringstream (specialised for reading) and the
function is inline. I haven't measured it myself, but these two
differences are supposed to improve performance. Still, the stringstream
approach will carry some overhead compared to a function like those
mentioned in the other posts.

Thanks for your suggestion. Explicitly declaring the function inline
(in my implementation it was already implicitly inline) and using
std::istringstream did improve the performance, but only by a mere
0.3s of the 15s I lost by switching to a stringstream implementation.

I guess the point I'm trying to make is that it is a pity many C++
guides, such as the C++ FAQ, boldly encourage us to use C++
methodologies in favour of the traditional C styles, without always
mentioning the performance penalty. I understand this penalty is
negligible in the majority of scenarios, but for the few such as mine,
it would be helpful if the performance penalties were more clearly
highlighted. Especially in those guides that also make a point of
claiming that C++ can still be as efficient as C! ;-)

Thanks,
Ross
 
zexpe

Greg Comeau said:
It may be that you really do need to make this task faster,
but I for one would first be concerned about whether it is safe,
or even works at all. For instance, atof() is not the
preferred function. Anyway, have a look at some of the choices at

http://www.comeaucomputing.com/techtalk/#atoi

I have not done any timing comparisons, but some combination may
be able to meet safety and efficiency satisfactorily.

Thanks for the pointer. That's a very thorough discussion, albeit with
no mention of performance differences. My code typically takes 6 hours
to execute, so the difference between 6 and 9 hours execution time is
enough to force me away from the stringstream method!

I'll be using strtod() from now on. It's just as fast, and safer. For
the record here's the relative computation times of each method for my
test case, scaled to the atof() benchmark:

Method          Relative time
atof()          1.00
strtod()        1.00
istringstream   1.54
stringstream    1.56

Cheers,
Ross
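
For anyone wanting to reproduce this kind of comparison, a timing harness might look roughly like the sketch below. It is not the test code used for the figures above; the function names are invented, and it assumes C++11 for <chrono>. Dividing one elapsed time by another gives a relative figure in the same spirit as the table.

```cpp
#include <chrono>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Candidate 1: C library conversion.
double viaStrtod(const std::string& in)
{
    return std::strtod(in.c_str(), nullptr);
}

// Candidate 2: a fresh istringstream per call.
double viaIstringstream(const std::string& in)
{
    std::istringstream ss(in);
    double out = 0.0;
    ss >> out;
    return out;
}

// Runs `convert` over every input `repeats` times and returns the elapsed
// seconds. The running sum is written to `checksum` so the optimiser
// cannot throw the conversions away.
template <typename Convert>
double timeConversions(Convert convert,
                       const std::vector<std::string>& inputs,
                       int repeats, double& checksum)
{
    const auto start = std::chrono::steady_clock::now();
    double sum = 0.0;
    for (int r = 0; r < repeats; ++r)
        for (std::size_t i = 0; i < inputs.size(); ++i)
            sum += convert(inputs[i]);
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    checksum = sum;
    return elapsed.count();
}
```

Comparing the two checksums is a cheap sanity check that both methods parsed the inputs to the same values before any conclusions are drawn from the timings.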
 
zexpe

zexpe said:
atof() 1.00
strtod() 1.00
istringstream 1.54
stringstream 1.56

Just tried the boost::lexical_cast<double>() method too, and it's even
slower:

boost::lexical_cast<double>() 1.79

There's a price to be paid for simplicity!

Ross
 
Greg Comeau

zexpe said:
Just tried the boost::lexical_cast<double>() method too, and it's even
slower:

boost::lexical_cast<double>() 1.79

There's a price to be paid for simplicity!

I'm not convinced of that, but then I'm not unconvinced either :)

BTW, since this one operation is so important to you (why again?),
I'm surprised you are not pursuing tweaks of the different ways
and/or even trying to get strtod faster (if that is possible).
I mean, assuming 1.0 truly is the current best, since it matters
so much, why settle for it?
 
Roland Pibinger

Walter Bright said:
Doing string to double conversion is a non-trivial, tricky process. A lot of
effort has typically gone into making the C versions (atof, strtod, ecvt,
etc.) fast and accurate. All the C++ conversions I've seen eventually wound
up being shells around the C code.

The non-C features of C++ bring nothing to the table as far as implementing
a fast, accurate string to double conversion routine, so there's no reason
to reimplement the C versions. Even the D programming language relies on
shelling the C float conversion routines.

Which interface would you prefer for a C++ 'shell' str2long which is:
- convenient
- does appropriate error handling?

a function interface? E.g.

long str2long (const std::string& str) throw (std::runtime_error); // (1)
long str2long (const std::string& str) throw(); // (2)
bool str2long (const std::string& str, long& out) throw(); // (3)
std::pair<bool,long> str2long (const std::string& str) throw(); // (4)
....?

(1) throws an exception in case of a conversion error
(2) returns a default value (== 0) in case of error
(3) returns true on success and false on error, with the result in the out-parameter
(4) returns true/false in .first and the result in .second

(2) cannot distinguish whether a return value of 0 indicates an error.
Is an additional isLong (const std::string& str) function therefore
necessary (a la IsNumeric())?

Or do you prefer a class interface? E.g.

class Converter {
public:
long str2long (const std::string& str) throw();
bool isOk() throw(); // (5)
// ...
};

(5) true if previous conversion has been ok

Best regards,
Roland Pibinger
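
For comparison, interface (3) could be implemented as a thin shell over strtol roughly as follows, with the throwing interface (1) layered on top of it. This is only a sketch: the name str2longOrThrow, the base-10 assumption, and the decision to reject trailing characters are choices made for the example, the exception specifications from the declarations above are left out, and nullptr assumes C++11.

```cpp
#include <cerrno>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Interface (3): true on success, result in the out-parameter.
// On failure `out` is left untouched.
bool str2long(const std::string& str, long& out)
{
    const char* begin = str.c_str();
    char* end = nullptr;

    errno = 0;
    const long value = std::strtol(begin, &end, 10);

    if (end == begin || *end != '\0') return false;  // empty or trailing junk
    if (errno == ERANGE) return false;               // does not fit in a long
    out = value;
    return true;
}

// Interface (1), built on (3): throws instead of returning a flag.
// Renamed here because overloads cannot differ only in return type.
long str2longOrThrow(const std::string& str)
{
    long value = 0;
    if (!str2long(str, value))
        throw std::runtime_error("str2long: cannot convert \"" + str + "\"");
    return value;
}
```

Starting from the bool-returning form keeps the error check in one place, and the throwing or pair-returning variants become one-line adaptors.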
 
Roland Pibinger

zexpe said:
I'll be using strtod() from now on. It's just as fast, and safer. For
the record here's the relative computation times of each method for my
test case, scaled to the atof() benchmark:

atof() 1.00
strtod() 1.00
istringstream 1.54
stringstream 1.56

The stringstream times are very fast. Do you reuse the same
stringstream object all the time? Or do you use a Standard library
which implements SSO for std::string?

Best wishes,
Roland Pibinger
 
