I would expect a good optimising compiler to produce faster code in the
cout case, while a less optimising compiler might produce faster code in
the printf case (by not optimising away that function call overhead,
assuming that the printf function was implemented reasonably...)
Actually, that was my assumption way back before I started to implement my
own standard C++ library. As it turns out, it is pretty tough to arrive at
the same speed as the stdio-family with standard IOStreams: you need to go
through a whole bunch of tricks to avoid overheads. This concerns things
like caching the results of the ctype and numpunct facets, short-circuiting
sentry construction, folding flags and conditions, etc. Effectively, this
results in roughly the same performance as stdio (if anybody has results
about current implementations, I would be quite interested, especially if
the IOStreams library shows far better performance than stdio). I'm not
sure whether such enhancements are applied to other library implementations
(maybe the other library implementers can say something on this issue...).
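As a rough illustration of the facet-caching trick mentioned above, here is a minimal user-level sketch. It is not taken from any library implementation; the function names count_spaces_uncached and count_spaces_cached are hypothetical. The first version goes through std::isspace(c, loc), which looks up the ctype facet on every call; the second looks the facet up once and reuses the reference.

```cpp
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical helper: every std::isspace(c, loc) call performs its own
// std::use_facet<std::ctype<char>>(loc) lookup.
std::size_t count_spaces_uncached(const std::string& s, const std::locale& loc) {
    std::size_t n = 0;
    for (char c : s) {
        if (std::isspace(c, loc)) ++n;
    }
    return n;
}

// Hypothetical helper: the facet is looked up once and cached in a
// reference, which stays valid while 'loc' is alive.
std::size_t count_spaces_cached(const std::string& s, const std::locale& loc) {
    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    std::size_t n = 0;
    for (char c : s) {
        if (ct.is(std::ctype_base::space, c)) ++n;
    }
    return n;
}
```

Both functions return the same count; whether the cached version is measurably faster depends on the library and the optimiser, so it is worth timing rather than assuming.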
Note: I am not a C++ compiler author, nor the author of an optimising
compiler of any flavour; in fact, by the time I've written this, someone
who is has probably answered anyway and disagreed with me.
I don't think that the optimising compiler can do much here, although there
will be a *huge* performance difference between optimized and unoptimized
IOStreams. With IOStreams and locales being templates often located in
header files, optimized compilation will take quite a while, at least with
the compilers I tested it with (mostly gcc and Sun CC). However, many of the
needed performance gains are in the library implementation.
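For anybody who wants to measure this on a current implementation, the following is a small sketch of a comparison harness. All names (format_stdio, format_iostreams, elapsed_ms) are hypothetical, and formatting integers into a string is only one of many possible workloads; results will vary with compiler, optimisation level and library.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <locale>
#include <sstream>
#include <string>

// Formats 0..count-1, one number per line, using the stdio family.
std::string format_stdio(int count) {
    std::string out;
    char buf[32];
    for (int i = 0; i < count; ++i) {
        int len = std::snprintf(buf, sizeof buf, "%d\n", i);
        out.append(buf, static_cast<std::size_t>(len));
    }
    return out;
}

// Formats the same numbers using IOStreams; the classic locale is imbued
// so that no digit grouping is applied.
std::string format_iostreams(int count) {
    std::ostringstream os;
    os.imbue(std::locale::classic());
    for (int i = 0; i < count; ++i) os << i << '\n';
    return os.str();
}

// Runs f once and returns the elapsed wall-clock time in milliseconds.
template <class F>
double elapsed_ms(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

A call such as elapsed_ms([] { format_iostreams(1000000); }) can then be compared against the same call with format_stdio, once with and once without optimisation enabled.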
BTW, you can download my implementation from
<http://www.dietmar-kuehl.de/cxxrt/> if you want to look at it.