Convert an integer to a string? Plan B?

  • Thread starter Steven T. Hatton

andy

Roland said:
Here's a proposal for a simple, efficient, and safe conversion
function:

#include <stdio.h>
#include <string>

inline std::string& itostr (int i, std::string& out) {
std::string::value_type buf[128];
int len = snprintf(buf, sizeof(buf), "%d", i);

if (len > 0 && size_t (len) < sizeof (buf)) {
out.assign(buf, std::string::size_type (len));
} else {
out.clear();
}
return out;
}

Seems good to me. I reckon you could maybe improve on the buffer size,
as below? Whether the function should assert or throw an exception on
failure is another debate too, of course, as is whether to check for
the terminator, etc.:

#include <cstdio>
#include <limits>
#include <string>
#include <stdexcept>

struct itostr_error : public std::exception{
itostr_error(){}
// const and throw() are needed to override std::exception::what()
const char* what() const throw(){ return "bad itostr";}
};
inline
std::string& itostr (int i, std::string& out)
{
std::string::value_type buf[std::numeric_limits<int>::digits10 +
3];
int len = std::sprintf(buf, "%d", i); //slightly quicker
if (len > 0 && size_t (len) < sizeof (buf)
&& (buf[std::numeric_limits<int>::digits10 + 2]=='\0')) {
out.assign(buf, std::string::size_type (len));
return out;
} else {
throw itostr_error();
}
}

#include <climits> // for INT_MIN
#include <iostream>
int main()
{
try{
std::string str;
itostr(INT_MIN,str);
std::cout << str <<'\n';
}
catch( itostr_error & e){
std::cout << e.what() <<'\n';
}
}

regards
Andy Little
 

Pete Becker

Whether the function should assert or throw an
exception on failure is another debate too

No, it's not. <g> It's a matter of looking at the design of the
application it's going to be used in.

But in general, low level functions should report errors in-channel,
because once you've added the overhead of throwing exceptions you can't
get rid of it. Applications that don't use exceptions shouldn't have to
pay for them. So this sort of thing ought to be written in two layers,
one to do the formatting and report errors, and one that calls it and
translates errors into exceptions.

char *itoc(char *buf, size_t size_if_you_insist, int val)
{
format val into buf
if successful, return next position in buf, otherwise return 0
}

string itostr(int val)
{
char buf[whatever];
if (itoc(buf, whatever, val))
return string(buf);
throw error;
}
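
The two layers described above can be sketched as follows. This is only one possible reading of the pseudocode: the use of snprintf, the buffer size taken from numeric_limits, and the choice of std::runtime_error are assumptions, not something stated in the post.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <stdexcept>
#include <string>

// Low-level layer: formats val into buf and reports failure in-channel.
// Returns a pointer to the terminating '\0' (the "next position"),
// or a null pointer if the text did not fit.
inline char* itoc(char* buf, std::size_t size, int val)
{
    assert(buf != 0 && size > 0);
    int len = std::snprintf(buf, size, "%d", val);
    if (len < 0 || static_cast<std::size_t>(len) >= size)
        return 0;
    return buf + len;
}

// High-level layer: translates the in-channel error into an exception.
inline std::string itostr(int val)
{
    // Assumed size: all digits, an optional sign, and the terminator.
    char buf[std::numeric_limits<int>::digits10 + 3];
    char* end = itoc(buf, sizeof buf, val);
    if (end == 0)
        throw std::runtime_error("itostr: buffer too small");
    return std::string(buf, end);
}
```

Code that does not use exceptions can call itoc directly and test the returned pointer; only callers of itostr pay for the throw.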
 

andy

(e-mail address removed) wrote:

[...]
struct itostr_error : public std::exception{
itostr_error(){}
const char* what() const throw(){ return "bad itostr";}
};
inline
std::string& itostr (int i, std::string& out)
{
std::string::value_type buf[std::numeric_limits<int>::digits10 +
3];
int len = std::sprintf(buf, "%d", i); //slightly quicker
if (len > 0 && size_t (len) < sizeof (buf)
&& (buf[std::numeric_limits<int>::digits10 + 2]=='\0')) {

BTW, the last line above now looks decidedly dodgy, but I don't know the
spec well enough to know if it's a potential error, though slightly
surprisingly it doesn't seem to throw an exception on input of e.g. 1 on
two systems. Might be best to change it to: &&(buf[len]=='\0')){

regards
Andy Little
 

andy

Pete said:
No, it's not. <g> It's a matter of looking at the design of the
application it's going to be used in.

But in general, low level functions should report errors in-channel,
because once you've added the overhead of throwing exceptions you can't
get rid of it. Applications that don't use exceptions shouldn't have to
pay for them.

I put them in because I do use them. If an empty string is returned,
that is an error and must be acknowledged as such rather than ignored,
which is what happens IMO all too often if it can be ignored.
Embarrassing as it is, users must be informed when output is invalid,
even if in the worst case the application shuts down to indicate it's
mangled their data.

(The same is true in critical embedded systems AFAIK except that the
error must be dealt with in place or the alarm sounds and everybody
runs for it, nevertheless in all cases it must be made impossible to
ignore corrupted data)

So this sort of thing ought to be written in two layers,
one to do the formatting and report errors, and one that calls it and
translates errors into exceptions.

char *itoc(char *buf, size_t size_if_you_insist, int val)
{
format val into buf
if successful, return next position in buf, otherwise return 0

^^^^^^^^^^^^^^^^^^^^^^^^^

What are we doing here? ... iterating char by char ... seems a little
slow, hardly inlinable surely? ... hmm what about using a stringstream
?

;-)

regards
Andy Little
 

Roland Pibinger

I put them in because I do use them. If an empty string is returned
that is an error and must be acknowledged as such rather than ignored,
which is what happens IMO all too often if it can be ignored.

Actually, the conversion from int to string cannot fail because all
bit combinations in an int result in a valid value. The only
'exception' is an out-of-memory condition but that's a different
question (you cannot handle OOM with exceptions). The error handling
code is only included to acknowledge the return value of snprintf.
BTW, the conversion functions are of course not symmetric (string to
int can fail).
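
The asymmetry is easy to see in code: the reverse direction has several distinct failure modes. A minimal sketch, assuming strtol as the parser and a hypothetical helper name parse_int (neither appears in the thread):

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <string>

// Hypothetical counterpart to itostr. Returns false on failure and leaves
// out untouched. Like strtol, it accepts leading whitespace and a sign.
inline bool parse_int(const std::string& text, int& out)
{
    const char* begin = text.c_str();
    char* end = 0;
    errno = 0;
    long v = std::strtol(begin, &end, 10);
    assert(end != 0);  // strtol always sets end when given a non-null pointer
    if (end == begin)
        return false;  // empty input or no digits at all
    if (end != begin + text.size())
        return false;  // trailing characters that are not part of a number
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return false;  // value does not fit into an int
    out = static_cast<int>(v);
    return true;
}
```

Every early return here corresponds to an input that has no int value, which is exactly the case that cannot arise when going from int to string.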

Best wishes,
Roland Pibinger
 

Roland Pibinger

inline
std::string& itostr (int i, std::string& out)
{
std::string::value_type buf[std::numeric_limits<int>::digits10 +
3];
int len = std::sprintf(buf, "%d", i); //slightly quicker
if (len > 0 && size_t (len) < sizeof (buf)
&& (buf[std::numeric_limits<int>::digits10 + 2]=='\0')) {

What does std::numeric_limits<int>::digits10 mean, and why is it 9 on a
32-bit system?

Roland Pibinger
 

Pete Becker

^^^^^^^^^^^^^^^^^^^^^^^^^

What are we doing here? ... iterating char by char

No, formatting the text into a buffer and returning the _next_ position
in the buffer, i.e. the one after the formatted text. That makes it easy
to continue to append to the buffer.
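
To illustrate why returning the next position is convenient, here is a small sketch that chains two calls. The helper append_int and the function format_pair are hypothetical names made up for this example, and the use of snprintf is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper in the spirit of itoc: writes val at pos and returns
// the position just past the formatted text, or a null pointer if the text
// did not fit before end.
inline char* append_int(char* pos, char* end, int val)
{
    assert(pos != 0 && pos < end);
    std::size_t room = static_cast<std::size_t>(end - pos);
    int len = std::snprintf(pos, room, "%d", val);
    if (len < 0 || static_cast<std::size_t>(len) >= room)
        return 0;
    return pos + len;
}

// Formats "x,y" by feeding each returned position into the next call.
// Returns an empty string if anything did not fit.
inline std::string format_pair(int x, int y)
{
    char buf[64];
    char* const end = buf + sizeof buf;
    char* pos = append_int(buf, end, x);
    if (pos == 0 || end - pos < 2)
        return std::string();
    *pos++ = ',';
    pos = append_int(pos, end, y);
    return pos ? std::string(buf, pos) : std::string();
}
```

No strlen call and no intermediate string is needed between the two numbers, because the first call already says where the second should start.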
 

andy

Roland said:
inline
std::string& itostr (int i, std::string& out)
{
std::string::value_type buf[std::numeric_limits<int>::digits10 +
3];
int len = std::sprintf(buf, "%d", i); //slightly quicker
if (len > 0 && size_t (len) < sizeof (buf)
&& (buf[std::numeric_limits<int>::digits10 + 2]=='\0')) {

What does std::numeric_limits<int>::digits10 mean, and why is it 9 on a
32-bit system?

It means that you will get a maximum of digits10 + 1 digits for the
type. Don't ask me why it's +1. Try this program to make it clearer.
Just count the number of digits shown by each number and compare that
with the digits10 value + 1.


#include <limits>
#include <iostream>

template <typename T>
void func()
{
std::cout << std::numeric_limits<T>::digits10 << '\n';
std::cout << std::numeric_limits<T>::max() << '\n';
}

int main()
{

func<short>();
func<unsigned short>();
func<int>();
func<unsigned int>();
func<long>();
func<unsigned long>();
func<long long>();
func<unsigned long long>();
}
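
On the "+1": digits10 is the number of decimal digits that the type can hold for every possible digit combination. With a 32-bit int, max() is 2147483647, which has ten digits, but a ten-digit number such as 9999999999 does not fit, so only nine digits are guaranteed. The largest value therefore has digits10 + 1 digits. A small sketch that checks this; the helper names are hypothetical, and the check is limited to types whose max() fits in a long long:

```cpp
#include <cassert>
#include <limits>

// Counts the decimal digits of a non-negative value.
inline int decimal_digits(long long value)
{
    assert(value >= 0);
    int n = 1;
    while (value >= 10) {
        value /= 10;
        ++n;
    }
    return n;
}

// True if the longest value of T has exactly digits10 + 1 decimal digits.
// Only meant for integer types whose max() fits in a long long.
template <typename T>
bool longest_is_digits10_plus_one()
{
    return decimal_digits(std::numeric_limits<T>::max())
        == std::numeric_limits<T>::digits10 + 1;
}
```

This is also where the buffer size digits10 + 3 used earlier in the thread comes from: digits10 + 1 digits, one character for the sign, and one for the terminator.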

regards
Andy Little
 

andy

Roland said:
Actually, the conversion from int to string cannot fail because all
bit combinations in an int result in a valid value. The only
'exception' is an out-of-memory condition but that's a different
question (you cannot handle OOM with exceptions). The error handling
code is only included to acknowledge the return value of snprintf.

In that case things are much simpler. Now I've added a traits class for
the format string, potentially extending the usage to other integer
types, though it seems sprintf is a bit limited to signed and unsigned
int only... whatever...

#include <cstdio>
#include <limits>
#include <string>
#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_integral.hpp>

template <typename IntegerType>
struct format;

template <>
struct format<int>{
static const char* specifier(){return "%d";}
};
template <>
struct format<unsigned int>{
static const char* specifier(){return "%u";}
};

template <typename IntegerType>
inline
typename boost::enable_if<
boost::is_integral<IntegerType>,
std::string&
>::type
itostr (IntegerType i, std::string& out)
{
std::string::value_type buf[
std::numeric_limits<IntegerType>::digits10 + 3
];
std::string::size_type len
= std::sprintf(buf, format<IntegerType>::specifier(), i);
out.assign(buf,len);
return out;
}

#include <iostream>
int main()
{
std::string str;
itostr(-1,str);
std::cout << str <<'\n';
itostr(1U,str);
std::cout << str <<'\n';
}

regards
Andy Little
 

Kai-Uwe Bux

Roland said:
Actually, the conversion from int to string cannot fail because all
bit combinations in an int result in a valid value.
[snip]

Clause [3.9.1/1] makes that guarantee for unsigned character types. For the
type int, however, I think no such guarantee is made in the standard.


Best

Kai-Uwe Bux
 

Daniel T.

Here's a proposal for a simple, efficient, and safe conversion
function:

#include <stdio.h>
#include <string>

inline std::string& itostr (int i, std::string& out) {
std::string::value_type buf[128];
int len = snprintf(buf, sizeof(buf), "%d", i);

if (len > 0 && size_t (len) < sizeof (buf)) {
out.assign(buf, std::string::size_type (len));
} else {
out.clear();
}
return out;
}

Why put a 128 char buffer on the stack when string already has one
imbedded in it...

string& itostr( int i, string& result )
{
result.clear();
if ( i == INT_MIN ) {
result += "-2147483648";
}
else if ( i == 0 ) {
result = '0';
}
else {
string::size_type pos = 0;
if ( i < 0 ) {
result = '-';
i = -i;
pos = 1;
}
while ( i > 0 ) {
result.insert( pos, 1, char( '0' + i % 10 ) );
i /= 10;
}
}
return result;
}

string itostr( int i ) {
string result;
itostr( i, result );
return result;
}
 

andy

Daniel said:
Daniel T. wrote:
So, what is the "appropriate test" to ensure that a buffer is not
overrun?

Here's a proposal for a simple, efficient, and safe conversion
function:

#include <stdio.h>
#include <string>

inline std::string& itostr (int i, std::string& out) {
std::string::value_type buf[128];
int len = snprintf(buf, sizeof(buf), "%d", i);

if (len > 0 && size_t (len) < sizeof (buf)) {
out.assign(buf, std::string::size_type (len));
} else {
out.clear();
}
return out;
}

Why put a 128 char buffer on the stack when string already has one
imbedded in it...

string& itostr( int i, string& result )
{
result.clear();
if ( i == INT_MIN ) {
result += "-2147483648";
}
else if ( i == 0 ) {
result = '0';
}
else {
string::size_type pos = 0;
if ( i < 0 ) {
result = '-';
i = -i;
pos = 1;
}
while ( i > 0 ) {
result.insert( pos, 1, char( '0' + i % 10 ) );
i /= 10;
}
}
return result;
}

string itostr( int i ) {
string result;
itostr( i, result );
return result;
}

Hmm ... Maybe somebody ought to test all these functions and see which
has the best features on grounds of speed, stack/heap used and also
reliability (IOW likelihood of failure or data corruption). Maybe even
the original stringstream wouldn't do too badly then?

Then it shouldn't be too difficult to write a standardisation proposal
... I guess it has to be called itostr though!

regards
Andy Little
 

Roland Pibinger

In that case things are much simpler. Now I've added a traits class for
the format string, potentially extending the usage to other integer
types, though it seems sprintf is a bit limited to signed and unsigned
int only... whatever...

#include <cstdio>
#include <limits>
#include <string>
#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_integral.hpp>
....

Ahem, wasn't a simple solution desired?
 

andy

Roland said:
Andy Little wrote
...

Ahem, wasn't a simple solution desired?

I like to use enable_if. It improves (IMO) the error message if I
pass a double rather than an int, for example. So it makes my life a
bit simpler. However, it's not in the C++ standard, I guess. Try the
code below re that ----->

BTW, it would also be useful to make the string char_type a template
parameter, then wrap sprintf/swprintf in a functor and select based on
the char_type. Then the function would be even simpler, wouldn't it .... ;-)

regards
Andy Little

-----------------------
#include <cstdio>
#include <limits>
#include <string>

// comment/uncomment to check difference in error messages
#define USE_ENABLE_IF

#ifdef USE_ENABLE_IF
#include <boost/utility/enable_if.hpp>
#include <boost/type_traits/is_integral.hpp>
#endif

template <typename IntegerType>
struct format;

template <>
struct format<int>{
static const char* specifier(){return "%d";}
};
template <>
struct format<unsigned int>{
static const char* specifier(){return "%u";}
};

template <typename IntegerType>
inline
#ifdef USE_ENABLE_IF
typename boost::enable_if<
boost::is_integral<IntegerType>,
std::string&
>::type
#else
std::string&
#endif
itostr (IntegerType i, std::string& out)
{
std::string::value_type buf[
std::numeric_limits<IntegerType>::digits10 + 3
];
std::string::size_type len
= std::sprintf(buf, format<IntegerType>::specifier(), i);
out.assign(buf,len);
return out;
}

#include <iostream>
int main()
{
std::string str;
itostr(1.,str);
}

regards
Andy Little
 

Roland Pibinger

Why put a 128 char buffer on the stack when string already has one
imbedded in it...

Because it costs nothing. But I'll probably change it to
std::string::value_type buf[3 * sizeof(int) + 1];
string& itostr( int i, string& result )
{
result.clear();

this may delete the internal buffer for some string implementations
if ( i == INT_MIN ) {
result += "-2147483648";
}
else if ( i == 0 ) {
result = '0';
}
else {
string::size_type pos = 0;
if ( i < 0 ) {
result = '-';
i = -i;
pos = 1;
}
while ( i > 0 ) {
result.insert( pos, 1, char( '0' + i % 10 ) );

this may cause string reallocations for some string implementations
(IIRC even some professional implementations)
i /= 10;
}
}
return result;
}

With the *printf functions you can format the string output. An
extended version of the above function could offer output format
alternatives to users. Not as format string but in a safe way.
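
One way to read "not as format string but in a safe way" is an enumeration that selects among fixed, known-good format strings. The sketch below is only an illustration of that idea: the names int_format and utostr are hypothetical, and it takes an unsigned int because %x and %o expect an unsigned argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical set of supported formats. Callers pick one of these and
// never pass a format string of their own.
enum int_format { fmt_decimal, fmt_hex, fmt_octal };

inline std::string& utostr(unsigned int u, std::string& out,
                           int_format f = fmt_decimal)
{
    const char* spec = "%u";
    switch (f) {
    case fmt_decimal: spec = "%u"; break;
    case fmt_hex:     spec = "%x"; break;
    case fmt_octal:   spec = "%o"; break;
    }
    // Octal is the longest form: at most 3 digits per 8-bit byte.
    char buf[3 * sizeof(unsigned int) + 2];
    int len = std::snprintf(buf, sizeof buf, spec, u);
    // With a fixed format and a large enough buffer this should not fail.
    assert(len > 0 && static_cast<std::size_t>(len) < sizeof buf);
    out.assign(buf, static_cast<std::string::size_type>(len));
    return out;
}
```

Because the format string never comes from the caller, a mismatch between specifier and argument type cannot be introduced at the call site.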

Best wishes,
Roland Pibinger
 

Pete Becker

Daniel T. wrote:
[...]

result += "-2147483648";
}


BTW, that's not portable of course. You might be able to use the Boost
Preprocessor library to stringize INT_MIN:

You're looking for a string that converts to the minimum representable
value?

#define XSTR(x) #x
#define STR(x) XSTR(x)
STR(INT_MIN)

But note that on some implementations, INT_MIN is defined more like
(-2147483647-1).
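
A short sketch of what that caveat means in practice. The result of STR(INT_MIN) is whatever the implementation's macro expands to, so the example does not assume a particular spelling; the helper names int_min_spelling and is_plain_decimal are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <cstring>

#define XSTR(x) #x
#define STR(x) XSTR(x)

// The text of INT_MIN after macro expansion. This is implementation
// specific and may be an expression rather than a single literal.
inline const char* int_min_spelling()
{
    return STR(INT_MIN);
}

// True if s is an optional '-' followed only by decimal digits, i.e. the
// kind of text itostr would be expected to produce.
inline bool is_plain_decimal(const char* s)
{
    assert(s != 0);
    if (*s == '-')
        ++s;
    return *s != '\0' && std::strspn(s, "0123456789") == std::strlen(s);
}
```

Calling is_plain_decimal(int_min_spelling()) shows whether the stringized macro could be appended to the result directly on a given implementation; where INT_MIN is written as an expression it returns false.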
 

andy

Pete said:
Daniel T. wrote:
[...]

result += "-2147483648";
}


BTW, that's not portable of course. You might be able to use the Boost
Preprocessor library to stringize INT_MIN:

You're looking for a string that converts to the minimum representable
value?

#define XSTR(x) #x
#define STR(x) XSTR(x)
STR(INT_MIN)

But note that on some implementations, INT_MIN is defined more like
(-2147483647-1).

Hmm... IIRC it's you that started us on this rocky road, further up this
thread. Now look where it's ended up: using nested macros!

OK, how about this... just to initialise that value:

#include <climits> // for INT_MIN
#include <sstream>
#include <iostream>
#include <string>

std::string
int_min_init()
{
std::ostringstream s;
s << INT_MIN;
return s.str();
}
std::string const & int_min()
{
// built once, on first use
static std::string const str = int_min_init();
return str;
}

// now use
result += int_min();

See .. I can still get an ostringstream in somewhere... :)

regards
Andy Little
 

Daniel T.

Why put a 128 char buffer on the stack when string already has one
imbedded in it...

Because it costs nothing. But I'll probably change it to
std::string::value_type buf[3 * sizeof(int) + 1];
string& itostr( int i, string& result )
{
result.clear();

this may delete the internal buffer for some string implementations

Then change the line to:
result.reserve( numeric_limits<int>::digits10 + 2 );

or some such. But then you have a problem with "result = ..." calls
below because they *may* reduce the size of the internal buffer thus
defeating the reserve call.
this may cause string reallocations for some string implementations
(IIRC even some professional implementations)
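
One way to sidestep both concerns is to append the digits in reverse order and flip the string once at the end, so that nothing is ever inserted at the front and nothing is assigned after the reserve call. This is a sketch of that alternative, not code from the thread; it also converts the magnitude through unsigned int, which is an assumption made here to avoid the INT_MIN special case.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <string>

inline std::string& itostr(int i, std::string& result)
{
    result.clear();
    // Room for all digits plus a sign; only appends follow, so the
    // capacity is not given up again by an assignment.
    result.reserve(std::numeric_limits<int>::digits10 + 2);

    // Unsigned arithmetic is well defined even for INT_MIN.
    unsigned int u = static_cast<unsigned int>(i);
    if (i < 0)
        u = 0u - u;

    // Least significant digit first.
    do {
        result.push_back(static_cast<char>('0' + u % 10));
        u /= 10;
    } while (u != 0);
    if (i < 0)
        result.push_back('-');

    std::reverse(result.begin(), result.end());
    assert(!result.empty());
    return result;
}
```

Whether clear() keeps the existing buffer is still up to the string implementation, so the reserve call remains a hint rather than a guarantee.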


With the *printf functions you can format the string output. An
extended version of the above function could offer output format
alternatives to users. Not as format string but in a safe way.

But then you're right back to using something very much like stringstream,
which Mr. Becker (and apparently only Mr. Becker) finds "too expensive".
 

Daniel T.

Hmm ... Maybe somebody ought to test all these functions and see which
has the best features on grounds of speed, stack/heap used and also
reliability (IOW likelihood of failure or data corruption). Maybe even
the original stringstream wouldn't do too badly then?

Then it shouldn't be too difficult to write a standardisation proposal
... I guess it has to be called itostr though!

I don't think itostr is a good name. Personally, I like
lexical_cast<Type>. As in:

template < typename T, typename U >
T lexical_cast( const U& u ) {
std::stringstream ss;
T t;
if ( !( ss << u && ss >> t ) ) throw std::bad_cast();
return t;
}

template < >
std::string lexical_cast<std::string>( const int& u ) {
// do whatever you think is "least expensive" here Pete.
}
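
For completeness, here is a compilable sketch of the same idea with the specialization filled in. The body of the specialization is an assumption (it reuses the snprintf approach from earlier in the thread); the primary template is the one shown above with its headers added.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <sstream>
#include <string>
#include <typeinfo>

// Generic version: round-trips the value through a stringstream.
template <typename T, typename U>
T lexical_cast(const U& u)
{
    std::stringstream ss;
    T t;
    if (!(ss << u && ss >> t))
        throw std::bad_cast();
    return t;
}

// Specialization for int -> std::string that bypasses the stream.
// The snprintf-based body is only one possible choice.
template <>
inline std::string lexical_cast<std::string>(const int& u)
{
    char buf[std::numeric_limits<int>::digits10 + 3];
    int len = std::snprintf(buf, sizeof buf, "%d", u);
    assert(len > 0 && static_cast<std::size_t>(len) < sizeof buf);
    return std::string(buf, static_cast<std::string::size_type>(len));
}
```

A call such as lexical_cast<std::string>(42) picks the specialization, while lexical_cast<int>(std::string("17")) still goes through the generic stream version and throws std::bad_cast for input like "abc".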
 
