String formatting! Aarrrrgggh.

Your Name

Hi all,

I run a small engineering company, and our main project is a very large C++
application -- it's actually multiple applications which communicate with
each other over various interfaces (USB, serial, etc).

We're at the point now where we have several developers all working on the
various pieces of C++ code. Some of us are comfortable with Boost, some
use StringStream, and some still use sprintf() and sscanf(). But things
are very inconsistent, and there is too much "glue" code converting between
all the different string representations.

If it matters, our application is largely mathematical, so the bulk of the
string handling that we do is converting floats, ints, and doubles to and
from strings, and we need precise control over the formatting/precision.
Much of this is to/from text files, but some of it is from Win32 objects,
like MFC controls (edit boxes, etc).

I want to choose a standard and convert all of our existing code. The
choices I'm considering are:

1) std::string, with Boost::format and Boost::tokenizer
2) std::string, with StringStream
3) MFC's CString (I don't know the sscanf() equivalent here)
4) The new sprintf_s() and sscanf_s() "secure" functions available in
Visual Studio
5) Writing a custom string class, along with formatting functions
6) Writing custom formatting functions that work with std::string

I'm sure there's no one-size-fits-all answer to this problem, so I'm
looking for general feedback. On the other hand, this is such a
fundamental problem in application development that I'm a little surprised
there's no universally accepted solution. Or maybe there is and I just
didn't get the memo.

Thanks for suggestions and opinions.

Pat
 
Jeff Schwab

Your Name said:
Hi all,

I run a small engineering company, and our main project is a very large C++
application -- it's actually multiple applications which communicate with
each other over various interfaces (USB, serial, etc).

We're at the point now where we have several developers all working on the
various pieces of C++ code. Some of us are comfortable with Boost, some
use StringStream, and some still use sprintf() and sscanf(). But things
are very inconsistent, and there is too much "glue" code converting between
all the different string representations.

If it matters, our application is largely mathematical, so the bulk of the
string handling that we do is converting floats, ints, and doubles to and
from strings, and we need precise control over the formatting/precision.
Much of this is to/from text files, but some of it is from Win32 objects,
like MFC controls (edit boxes, etc).

I want to choose a standard and convert all of our existing code. The
choices I'm considering are:

1) std::string, with Boost::format and Boost::tokenizer
2) std::string, with StringStream

What is StringStream? Do you mean std::stringstream, or is StringStream
a separate library? (Darned if I know how to do case-sensitive Googling...)
3) MFC's CString (I don't know the sscanf() equivalent here)
4) The new sprintf_s() and sscanf_s() "secure" functions available in
Visual Studio
5) Writing a custom string class, along with formatting functions
6) Writing custom formatting functions that work with std::string

I'm sure there's no one-size-fits-all answer to this problem, so I'm
looking for general feedback. On the other hand, this is such a
fundamental problem in application development that I'm a little surprised
there's no universally accepted solution. Or maybe there is and I just
didn't get the memo.

I'm personally a fan of std::string, although you should be aware that
different implementations have chosen wildly different trade-offs (in
terms of space and performance). I certainly would not go with the
MS-specific "secure" functions for general-purpose string manipulation;
they are clearly meant to support C-style strings, which are really
arrays. Arrays have different semantics from other C++ types, and are
incompatible with many generic library components (including the STL
containers).

As far as rolling your own solutions goes, I wouldn't even consider that
unless you need extremely fine-grained control over your string's API
and object layout. On the other hand, you might consider a thin wrapper
class with a private member of some existing string type. I have done
this in the past to define string-like objects, particularly "tokens"
that include parse-related information but are otherwise used like
std::strings.
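
A minimal sketch of the thin-wrapper idea described above, for illustration only. The class name Token, the line-number field, and the member names are assumptions made for this example, not code from the post:

```cpp
#include <cstddef>
#include <string>

// Hypothetical "token" type: behaves like a read-only string for most
// purposes, but also remembers where in the input it was parsed from.
class Token {
public:
    Token(const std::string& text, std::size_t line)
        : text_(text), line_(line) {}

    // String-like interface, forwarded to the private std::string.
    const std::string& str() const { return text_; }
    std::size_t size() const { return text_.size(); }
    bool empty() const { return text_.empty(); }

    // Parse-related information that a plain std::string cannot carry.
    std::size_t line() const { return line_; }

private:
    std::string text_;   // the existing string type does the real work
    std::size_t line_;   // source line the token came from
};

// Tokens compare by text only; the line number is just bookkeeping.
inline bool operator==(const Token& lhs, const Token& rhs)
{
    return lhs.str() == rhs.str();
}
```

Because the std::string member is private, the wrapper decides which parts of the string API are exposed, and the underlying string type could be swapped later without touching callers.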
 
Jim Langston

Your Name said:
Hi all,

I run a small engineering company, and our main project is a very
large C++ application -- it's actually multiple applications which
communicate with each other over various interfaces (USB, serial,
etc).

We're at the point now where we have several developers all working
on the various pieces of C++ code. Some of us are comfortable with
Boost, some use StringStream, and some still use sprintf() and
sscanf(). But things are very inconsistent, and there is too much
"glue" code converting between all the different string
representations.

If it matters, our application is largely mathematical, so the bulk
of the string handling that we do is converting floats, ints, and
doubles to and from strings, and we need precise control over the
formatting/precision. Much of this is to/from text files, but some of
it is from Win32 objects, like MFC controls (edit boxes, etc).

I want to choose a standard and convert all of our existing code. The
choices I'm considering are:

1) std::string, with Boost::format and Boost::tokenizer
2) std::string, with StringStream
3) MFC's CString (I don't know the sscanf() equivalent here)
4) The new sprintf_s() and sscanf_s() "secure" functions available in
Visual Studio
5) Writing a custom string class, along with formatting functions
6) Writing custom formatting functions that work with std::string

I'm sure there's no one-size-fits-all answer to this problem, so I'm
looking for general feedback. On the other hand, this is such a
fundamental problem in application development that I'm a little
surprised there's no universally accepted solution. Or maybe there
is and I just didn't get the memo.

Thanks for suggestions and opinions.

Whichever one you choose, build it into a function or template and have
everyone use it. Personally I use this:

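#include <sstream>
#include <string>

// Converts between any two types that support operator<< and operator>>.
// No error checking: on failure a value-initialized T is returned.
template<typename T, typename F > T StrmConvert( const F from )
{
std::stringstream temp;
temp << from;   // write the source value as text
T to = T();     // value returned if the extraction below fails
temp >> to;     // read the text back as the target type
return to;
}

// Convenience overload: convert anything streamable to std::string.
template<typename F> std::string StrmConvert( const F from )
{
return StrmConvert<std::string>( from );
}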

which uses stringstream. One of the advantages of this method is that you
can convert from anything to anything. Notice that there is no error checking
in this: if something can't be converted, a default-initialized value is
returned (0 for int, 0.0 for float, "" for string, etc.).

One of the things about this code, however, is that anything that has <<
defined and >> defined will work. But once you build a function or template
and decide later to do it some different way for whatever reason, you only
have to change the code in one place, as long as everyone uses what you
built.

If you wish to be more strict in the format coming in (std::string instead
of char*, for instance) you can use another layer of encapsulation.

Untested code:

template<typename T> T ToNumber( const std::string& from )
{
return StrmConvert<T>( from );
}

Then make sure everyone uses ToNumber such as:
std::string Number = "12345.678";
float Value = ToNumber<float>( Number );

As for the question of whether one is better than the other... the only one
I would stay away from is the Microsoft-specific CString. As you can probably
tell, my preference is std::stringstream, mainly because I don't have to
worry about what I'm converting from and what I'm converting to; I let
stringstream handle it for me. I don't have to worry about whether I need
atoi or atof or atod, or whether atod even exists, etc.
 
Christopher

What is StringStream? Do you mean std::stringstream, or is StringStream
a separate library? (Darned if I know how to do case-sensitive Googling...)



I'm personally a fan of std::string, although you should be aware that
different implementations have chosen wildly different trade-offs (in
terms of space and performance). I certainly would not go with the
MS-specific "secure" functions for general-purpose string manipulation;
they are clearly meant to support C-style strings, which are really
arrays. Arrays have different semantics from other C++ types, and are
incompatible with many generic library components (including the STL
containers).

As far as rolling your own solutions goes, I wouldn't even consider that
unless you need extremely fine-grained control over your string's API
and object layout. On the other hand, you might consider a thin wrapper
class with a private member of some existing string type. I have done
this in the past to define string-like objects, particularly "tokens"
that include parse-related information but are otherwise used like
std::strings.

I've never seen any cons for using std::stringstream for conversion
and after reading some books dedicated to iostreams, it is apparent
that it is specifically designed for your requirements. i.e text
parsing, formatting, and transport.

The only reason, in my limited experience, that I've found other
developers using other methods is that they usually don't know
std::stringstream exists or how to use it.

Adding to Jim's post: to error check, just check the fail bit of the
stringstream after the attempted conversion. You can also enable
exceptions to be thrown from it if you want to go that route.

It has the added bonus of support for localization, custom formatting,
and UDTs with a little work.
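
A minimal sketch of the error check described above, assuming a hypothetical helper named TryConvert (the name and the "reject trailing characters" policy are choices made for this example, not something from Jim's code):

```cpp
#include <sstream>
#include <string>

// Hypothetical checked conversion: returns true and stores the parsed
// value only if the whole string converts cleanly to T.
template <typename T>
bool TryConvert(const std::string& text, T& result)
{
    std::istringstream in(text);
    T value = T();
    in >> value;
    if (in.fail())          // failbit is set when the extraction fails
        return false;
    in >> std::ws;          // skip any trailing whitespace
    if (!in.eof())          // anything left over counts as an error here
        return false;
    result = value;         // only touch the output on success
    return true;
}
```

For the exception route, calling in.exceptions(std::ios_base::failbit | std::ios_base::badbit) before the extraction makes the stream throw std::ios_base::failure instead of silently setting the bits.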
 
Jerry Coffin

[ ... ]
I've never seen any cons for using std::stringstream for conversion
and after reading some books dedicated to iostreams, it is apparent
that it is specifically designed for your requirements. i.e text
parsing, formatting, and transport.

The only reason, in my limited experience, that I've found other
developers using other methods is that they usually don't know
std::stringstream exists or how to use it.

There are a few cons to using iostreams compared to something like
sprintf.

First of all, detailed formatting with iostreams can get quite verbose.
Second, localization can be substantially more difficult with iostreams.

For an example of the first, figure out what it takes to duplicate even
a relatively simple format string like "%2.2x" using iostreams.
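
As a rough illustration of the comparison, here is one way the two versions might look for an unsigned value. The function names are made up for this sketch, and it relies on the fact that "%2.2x" with an unsigned argument amounts to "at least two hex digits, zero-padded":

```cpp
#include <cstdio>
#include <iomanip>
#include <sstream>
#include <string>

// printf-style: the whole format lives in one short string.
std::string HexWithPrintf(unsigned value)
{
    char buffer[32];
    std::snprintf(buffer, sizeof buffer, "%2.2x", value);
    return buffer;
}

// iostream-style: base, fill character and width are set separately.
// The stream ignores precision for integers, so zero-padding has to be
// expressed through setfill and setw instead.
std::string HexWithStream(unsigned value)
{
    std::ostringstream out;
    out << std::hex << std::setfill('0') << std::setw(2) << value;
    return out.str();
}
```

Note that std::hex and the fill character stay in effect on the stream after the insertion, while setw applies only to the next item, which is part of why inline iostream formatting gets wordy.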

For the second, you need to consider that the formatting using something
like sprintf is purely _data_ from the viewpoint of the compiler -- it
can be read in from an external file, so the literal parts can be
translated to another language without recompiling the program at all.
You do need a positional notation (like POSIX provides) to support
localization though -- the order of phrasing depends on the language,
even with things as simple as writing out numbers in text form.
iostreams make that much more difficult, because the order is hard-coded
into the C++ itself, so changing the order _requires_ changing the source
code and recompiling.

Don't get me wrong: I realize iostreams have advantages as well, but
they're not what I'd call a panacea by any means.
 
Ivan Vecerina

: I run a small engineering company, and our main project is a very large C++
: application -- it's actually multiple applications which communicate with
: each other over various interfaces (USB, serial, etc).
:
: We're at the point now where we have several developers all working on the
: various pieces of C++ code. Some of us are comfortable with Boost, some
: use StringStream, and some still use sprintf() and sscanf(). But things
: are very inconsistent, and there is too much "glue" code converting between
: all the different string representations.
I imagine you mean "in-memory" representations of the strings (e.g.
using C arrays vs std::string vs yet another class ) ?
[ if it is about how numbers are formatted as text, you first need
to agree on a common text representation ... ]

: If it matters, our application is largely mathematical, so the bulk of the
: string handling that we do is converting floats, ints, and doubles to and
: from strings, and we need precise control over the formatting/precision.
: Much of this is to/from text files, but some of it is from Win32 objects,
: like MFC controls (edit boxes, etc).
:
: I want to choose a standard and convert all of our existing code. The
: choices I'm considering are:
:
: 1) std::string, with Boost::format and Boost::tokenizer
Ok if you need the features, but might be overkill if most of your
strings store a single value (or space-delimited and easily extracted).
: 2) std::string, with StringStream
Ok. But formatting with std::stringstream can be cumbersome.
: 3) MFC's CString (I don't know the sscanf() equivalent here)
Yuck (portability, safety, ...).
: 4) The new sprintf_s() and sscanf_s() "secure" functions available in
: Visual Studio
Yes if you are fine with platform-specific stuff, and want to
train everyone on their use.

: 5) Writing a custom string class, along with formatting functions
: 6) Writing custom formatting functions that work with std::string
Free functions should be preferred to creating a new type, when
all you want is to extend the interface of a class
(see the classic http://www.ddj.com/cpp/184401197 )
If you think you can get everyone to standardize on a common class
for string storage, pick an existing one instead - and std::string
is a good default choice.

: I'm sure there's no one-size-fits-all answer to this problem, so I'm
: looking for general feedback. On the other hand, this is such a
: fundamental problem in application development that I'm a little surprised
: there's no universally accepted solution. Or maybe there is and I just
: didn't get the memo.

I personally like the flexibility of the printf formatting (ostringstream
can be quite cumbersome). But a key issue is the risk of buffer overflow,
and the fact that safe solutions tend to be platform-specific.

The way I approached the problem is to provide a common header
with utility functions for value/string conversions.
Examples could include:
// Equivalent of sprintf, returning a string. Internally implemented
// safely using vsprintf_s or other platform-specific solutions.
std::string stringformat(const char* format, ...);
If some representations (e.g. fixed point float) are very common,
it can be worth providing safer/simpler functions as well, such as:
std::string FixedStr( double value, int decimals );
//-> does something like stringformat("%.*f",decimals,value)
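
One possible shape for these two helpers, as a sketch only. It assumes a C99/C++11-conforming std::vsnprintf (which reports the required length when given a null buffer) rather than the vsprintf_s route mentioned above, so treat it as an illustration of the idea, not the implementation used in practice:

```cpp
#include <cstdarg>
#include <cstdio>
#include <string>
#include <vector>

// sprintf-like function that returns a std::string. The buffer is sized
// from a first "measuring" call, so it cannot overflow.
std::string stringformat(const char* format, ...)
{
    va_list args;
    va_start(args, format);

    // A va_list may only be walked once, so measure using a copy.
    va_list measure;
    va_copy(measure, args);
    const int needed = std::vsnprintf(nullptr, 0, format, measure);
    va_end(measure);

    if (needed < 0) {                       // encoding/format error
        va_end(args);
        return std::string();
    }

    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(buffer.data(), buffer.size(), format, args);
    va_end(args);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}

// Safer front end for the common fixed-point case: no format string
// for the caller to get wrong.
std::string FixedStr(double value, int decimals)
{
    return stringformat("%.*f", decimals, value);
}
```

The variadic version is still not type-safe (a wrong argument type for a given conversion is undefined behaviour), which is the argument for adding narrow wrappers like FixedStr for the common cases.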

If you are using MFC, you may also want to add a collection of
functions to set/retrieve values from text fields.
For instance I once used:
void setItemInt ( CWnd* dialog, int itemID, long value );
void setItemFloat( CWnd* dialog, int itemID, double value, int decimals );
std::string getItemText ( CWnd* dialog, int itemID );
long getItemInt ( CWnd* dialog, int itemID, long defaultValue );
double getItemFloat( CWnd* dialog, int itemID, double defaultValue );
// defaultVal is returned if the window's text could not be converted
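
The "return the default on failure" part of such getters can be sketched without any MFC at all. The helper name ParseFloatOr is hypothetical; a getItemFloat along the lines above would presumably fetch the control's text and pass it to something like this:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: converts text to double, or returns defaultValue
// if the text is empty, not numeric, or followed by stray characters.
double ParseFloatOr(const std::string& text, double defaultValue)
{
    const char* begin = text.c_str();
    char* end = 0;
    const double value = std::strtod(begin, &end);

    if (end == begin)                 // nothing could be converted
        return defaultValue;

    while (*end == ' ' || *end == '\t')
        ++end;                        // tolerate trailing blanks
    if (*end != '\0')                 // e.g. "2.5 volts"
        return defaultValue;

    return value;
}
```

This sketch does not check for overflow, and std::strtod follows the current C locale's decimal separator, so both would need a decision in real code.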


Anyway, this is just an example.
My general recommendation is to build consensus within your team,
and to translate it into a small shared toolbox for common operations.


I hope this helps --Ivan
 
Andy Champ

Jim Langston wrote:
Then make sure everyone uses ToNumber such as:
std::string Number = "12345.678";
float Value = ToNumber<float>( Number );
At this point I'll make a little plug for the proxy object I posted at
the front of that "dynamic_cast is ugly!" thread - which reduces your
syntax to

float Value = ToNumber( Number );

.... assuming the compiler can tell the type of the LHS.
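
The proxy object from that other thread is not reproduced here, so the following is only a guess at the general technique: ToNumber returns a small object with a templated conversion operator, and the target type is deduced from whatever the result is assigned to. The class name NumberProxy is an assumption for this sketch:

```cpp
#include <sstream>
#include <string>

// Hypothetical proxy: holds the text and defers the conversion until
// the compiler knows which type is wanted.
class NumberProxy {
public:
    explicit NumberProxy(const std::string& text) : text_(text) {}

    // Chosen by the compiler based on the type being initialized.
    template <typename T>
    operator T() const
    {
        std::istringstream in(text_);
        T value = T();
        in >> value;      // no error checking, as in the original helper
        return value;
    }

private:
    std::string text_;
};

inline NumberProxy ToNumber(const std::string& text)
{
    return NumberProxy(text);
}
```

This works when the destination is a plain arithmetic variable; in contexts where no single target type is implied (passing the result to an overloaded function, for example) the conversion is ambiguous and the explicit ToNumber<float> form is still needed.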

Andy
 
Your Name

Thanks for all the suggestions.

The general opinion seems to be that std::string and std::stringstream are a
good choice, although complex formatting is a little awkward.

So how about this as a compromise... Use std::string and std::stringstream
in most cases, especially where only simple formatting is needed, e.g.:

s << "The value of x is " << x;

In addition, provide a header with formatting functions, so code where we
need more specific formatting would look something like:

s << "The value of x is " << str_format(x, "%05.5g");

This is basically the approach taken by Boost::Format(), but a bit more
lightweight. Also, I'm not crazy about the way Boost::tokenizer works. I
think I'd prefer something more like the "ToNumber" approach suggested by
Jim.
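
For concreteness, a str_format along these lines might look like the sketch below. It assumes the value is a double and that the format string contains exactly one floating-point conversion; neither is checked, and both are assumptions of this example:

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Formats one double with a printf-style conversion and returns the
// text, so the result can be inserted into any ostream.
// The format must contain exactly one conversion for a double
// (%f, %e, %g, ...); anything else is undefined behaviour.
std::string str_format(double value, const char* format)
{
    // First call only measures how many characters are needed.
    const int needed = std::snprintf(nullptr, 0, format, value);
    if (needed < 0)
        return std::string();

    std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
    std::snprintf(buffer.data(), buffer.size(), format, value);
    return std::string(buffer.data(), static_cast<std::size_t>(needed));
}
```

With this in a shared header, a line such as s << "The value of x is " << str_format(x, "%05.5g"); compiles as written for a double x.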

Any obvious drawbacks to an approach like this?
 
James Kanze

[ ... ]
I've never seen any cons for using std::stringstream for
conversion and after reading some books dedicated to
iostreams, it is apparent that it is specifically designed
for your requirements. i.e text parsing, formatting, and
transport.
The only reason, in my limited experience, that I've found
other developers using other methods is that they usually
don't know std::stringstream exists or how to use it.
There are a few cons to using iostreams compared to something
like sprintf.
First of all, detailed formatting with iostreams can get quite verbose.

Just a single manipulator. More importantly, with ostream, you
can specify the formatting logically, rather than physically;
this makes the ostream version far more maintainable.
Second, localization can be substantially more difficult with
iostreams.

Substantially, I don't think so. The only real additional
problem they introduce is generating the complete strings in
context so that a translator can see them, instead of seeing
just little bits.
For an example of the first, figure out what it takes to
duplicate even a relatively simple format string like "%2.2x"
using iostreams.

Have you ever seen anyone actually use this feature of printf?
(I don't think that boost::format even supports it. My Format
class did, and I know that it required a lot of extra code to do
so, since ostream ignores the precision when outputting integral
types.)

Of course, in this special case, where the width and the
precision are equal:
dest << HexFmt( 2 ) << value ;
For the second, you need to consider that the formatting using
something like sprintf is purely _data_ from the viewpoint of
the compiler -- it can be read in from an external file, so
the literal parts can be translated to another language
without recompiling the program at all. You do need a
positional notation (like POSIX provides) to support
localization though -- the order of phrasing depends on the
language, even with things as simple as writing out numbers in
text form. iostreams make that much more difficult, because
the order is hard-coded into the C++ itself, so changing the
order _requires_ changing the source code and recompiling.

Have you ever actually used this for localization? It doesn't
work in practice. Grammars differ in a lot more than word
order, and you almost always need a separate dynamically linked
component if your application requires grammatically correct
sentences in different languages.

If all you're outputting is short error messages, of course, you
can usually get away with something like:
std::cout << "could not open: " << filename ;
Don't get me wrong: I realize iostreams have advantages as
well, but they're not what I'd call a panacea by any means.

In practice, you have to consider why you're outputting. I find
ostream very good for logging and such, and for output destined
to the printer (output as LaTeX source, of course, for good
formatting). For screen output, today, you generally will have
to output each field separately (to a "field formatter", which
will probably mainly consist of an ostringstream), and insert it
in the model of whatever GUI component is doing the display.
 
Marcel Müller

We're at the point now where we have several developers all working on the
various pieces of C++ code. Some of us are comfortable with Boost, some
use StringStream, and some still use sprintf() and sscanf(). But things
are very inconsistent, and there is too much "glue" code converting between
all the different string representations.

If it matters, our application is largely mathematical, so the bulk of the
string handling that we do is converting floats, ints, and doubles to and
from strings, and we need precise control over the formatting/precision.
Much of this is to/from text files, but some of it is from Win32 objects,
like MFC controls (edit boxes, etc).

- Provide global library functions for the task.
- Be as tolerant as possible in the string to number direction.
- Try to use only std::string and const char* in the interfaces between
modules.
I want to choose a standard and convert all of our existing code. The
choices I'm considering are:

1) std::string, with Boost::format and Boost::tokenizer

I never really got used to it. Maybe it is because the boost libraries
do not compile on all of my preferred compilers.
2) std::string, with StringStream

You don't want to do this. It makes the code rather unreadable.
The iostream libraries are nothing that C++ should be proud of.
Their design dates from the very beginning, maybe to demonstrate
operator overloading, which was uncommon at that time.
3) MFC's CString (I don't know the sscanf() equivalent here)

If you are a boy scout, avoid MFC and ATL. This is mostly a good deed,
especially if portability might play a role in the future.
However, in a GUI application you do not always have a reasonable choice.
4) The new sprintf_s() and sscanf_s() "secure" functions available in
Visual Studio

Don't know. Never noticed these functions.
5) Writing a custom string class, along with formatting functions

This depends on your needs. If you want the maximum performance it might
be a good choice. Otherwise, rather not.
6) Writing custom formatting functions that work with std::string

I would prefer that.

For usual tasks I do these things based on a custom (v)sprintf derivative
that prints into a std::string. This is a reasonable compromise:
no type safety, but no buffer overruns either.
The implementation is based on vsnprintf. Unfortunately, with std::string
this requires copying the result around in memory. Furthermore, it gives
undefined behaviour when the parameters are asynchronously modified by
another thread.

In some projects I use a custom string class that provides a raw_init
function that initializes the string to an uninitialized array of given
length. That is the only function that returns a non-const char pointer
and it is very useful for adapting C libraries.

The other direction is easier, since sscanf can be safely applied to
std::string by using c_str().

In your case a set of functions that do completely encapsulate the
conversion from and to floating point numbers, vectors, matrices or
whatever might be the best.
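
As an illustration of both points (sscanf on c_str(), and a function that fully encapsulates one conversion), here is a small sketch for a three-component vector. The Vector3 struct and the function name are invented for this example:

```cpp
#include <cstdio>
#include <string>

// Hypothetical value type for the example.
struct Vector3 {
    double x, y, z;
};

// Parses "x y z" from a std::string. Returns false, leaving 'out'
// untouched, unless all three components were converted.
bool ParseVector3(const std::string& text, Vector3& out)
{
    Vector3 parsed;
    // sscanf only reads from the buffer, so c_str() is safe to pass.
    const int converted = std::sscanf(text.c_str(), "%lf %lf %lf",
                                      &parsed.x, &parsed.y, &parsed.z);
    if (converted != 3)
        return false;
    out = parsed;
    return true;
}
```

Checking the return value is what makes this usable: sscanf reports how many conversions succeeded, and it silently ignores anything after the third number, so stricter input checking would need extra work.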

I'm sure there's no one-size-fits-all answer to this problem, so I'm
looking for general feedback. On the other hand, this is such a
fundamental problem in application development that I'm a little surprised
there's no universally accepted solution. Or maybe there is and I just
didn't get the memo.

Well, C++ has a long history. Only 10 years ago, some compilers still did
not support std::string out of the box.
And some current C++ programmers have not got far beyond the first letter
of the language's name. It is also a question of habit.

I still use C style strings in simple parsers too. But I only do this
within the scope of at most one function. Beyond that I only use string
classes. Either std::string or sometimes my own string class.


Marcel
 
peter koch

We're at the point now where we have several developers all working on the
various pieces of C++ code.  Some of us are comfortable with Boost, some
use StringStream, and some still use sprintf() and sscanf().  But things
are very inconsistent, and there is too much "glue" code converting between
all the different string representations. [snip]
2) std::string, with StringStream

You don't want to do this. It makes the code rather unreadable.
The iostream libraries are nothing that C++ should be proud of.
Their design dates from the very beginning, maybe to demonstrate
operator overloading, which was uncommon at that time.

I have to disagree with this point. I believe iostream to be easy to
read. If there is some output that needs many formatters (e.g. Jerry
Coffin's %2.2x in another post), it is quite easy to write your own
formatter to do the job.
The solution also adapts well to output of your own classes, where
you just write your own stream-operators.
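
A minimal sketch of what such a home-made formatter could look like for the %2.2x case. The AsHex name and its interface are assumptions for this example (it is not the HexFmt class mentioned elsewhere in the thread):

```cpp
#include <iomanip>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical formatter object: wraps a value together with the number
// of hex digits it should be printed with.
class AsHex {
public:
    AsHex(unsigned long value, int digits) : value_(value), digits_(digits) {}

    friend std::ostream& operator<<(std::ostream& os, const AsHex& h)
    {
        // Save the stream state so the hex/fill settings do not leak
        // into whatever is written next.
        const std::ios_base::fmtflags flags = os.flags();
        const char fill = os.fill();

        os << std::hex << std::setfill('0') << std::setw(h.digits_)
           << h.value_;

        os.flags(flags);
        os.fill(fill);
        return os;
    }

private:
    unsigned long value_;
    int digits_;
};
```

The call site then reads out << AsHex(value, 2), and any following output on the same stream is still formatted in decimal with the original fill character.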
[snip]
This depends on your needs. If you want the maximum performance it might
be a good choice. Otherwise, rather not.


I would prefer that.

For usual tasks I do these things based on a custom (v)sprintf derivative
that prints into a std::string. This is a reasonable compromise:
no type safety, but no buffer overruns either.
The implementation is based on vsnprintf. Unfortunately with std::string
this requires to copy the result around in memory. Furthermore it gives
undefined behaviour when the parameters are asynchronously modified by
another thread.

In some projects I use a custom string class that provides a raw_init
function that initializes the string to an uninitialized array of given
length. That is the only function that returns a non-const char pointer
and it is very useful for adapting C libraries.

The other direction is easier, since sscanf can be safely applied to
std::string by using c_str().

In your case a set of functions that do completely encapsulate the
conversion from and to floating point numbers, vectors, matrices or
whatever might be the best.
This is a bad solution IMHO. You revert to a type-unsafe solution that
does not adapt and does not support class-type parameters. I see no
advantages whatsoever for that approach.

/Peter
 
Jerry Coffin

[ ... ]
Have you ever seen anyone actually use this feature of printf?

Yes. The whole time I look in the mirror shaving in the morning, I spend
time looking at one person who's used it, and who has run into situations
where its not working in C++ caused a real problem, to the point of writing
code to use sprintf instead, because iostreams simply made a simple job
stupidly difficult.
(I don't think that boost::format even supports it. My Format
class did, and I know that it required a lot of extra code to do
so, since ostream ignores the precision when outputting integral
types.)

Yes, a lousy idea that.

[ ... positional arguments ]
Have you ever actually used this for localization? It doesn't
work in practice. Grammars differ in a lot more than word
order, and you almost always need a separate dynamically linked
component if your application requires grammatically correct
sentences in different languages.

Yes, I've used it. While it doesn't work particularly well, at least it
works poorly -- and working poorly puts it several steps above iostreams
in that respect.
 
James Kanze

[ ... ]
Have you ever seen anyone actually use this feature of printf?
Yes. The whole time I look in the mirror shaving in the
morning, I spend time looking at one person who's used it, and
who has run into situations where its not working in C++ caused
a real problem, to the point of writing code to use sprintf instead,
because iostreams simply made a simple job stupidly difficult.

You must be an exception. I was too, when I used printf. But
from experience, the number of people who are aware of what
precision does for integer output is very, very small.
Yes, a lousy idea that.

What, ignoring precision when outputting integral types in
ostream, or implementing support for it in my Format class? (Or
both? :-))
[ ... positional arguments ]
Have you ever actually used this for localization? It
doesn't work in practice. Grammars differ in a lot more
than word order, and you almost always need a separate
dynamically linked component if your application requires
grammatically correct sentences in different languages.
Yes, I've used it. While it doesn't work particularly well, at
least it works poorly -- and working poorly puts it several
steps above iostreams in that respect.

I suppose it depends on the environment, but every time I've had
to do internationalized output, it hasn't worked sufficiently
well for me to be able to use it. In every case, I've needed
some custom code for different languages, in a dynamically
loaded object. (Note too that it has always been very, very
difficult---if the original code didn't make allowances for
gender, for example, or for number on adjectives, it is very
hard to add them later, regardless. Using printf or ostream
doesn't really change much here.)
 
