std::ios::binary?

Steven T. Hatton · Jul 30, 2005

Dietmar said:
Steven said:

This is what I wrote:

I know what you wrote.

copy(file_buf.eback(), file_buf.egptr(), back_inserter(data));

I'm not sure why you say that is fundamentally wrong.

Have you considered the possibility that it is fundamentally wrong?
First of all, you are dealing with stream buffer internals which you
are only supposed to access when implementing stream buffers, not
when using them. Second, as I said before, the sequence
[eback(), gptr()) consists of characters which have already been read.
Finally, the sequence [gptr(), egptr()) is actually an internal of the
stream buffer consisting essentially of a portion of the file. This is
a level of abstraction you don't want to operate on when using a stream.
It is the right level of abstraction, however, for reading from an
external source.

Recall that when I described the objective that I said the fact that I
needed to inherit from the stream buffer seemed wrong. I was only asking
about the range of data I specified. Nothing more.

Say what you mean, mean what you say! Even if the assumption may be true
at the beginning of the file, it is source obfuscation at best to
use the wrong pointer which may happen by accident to be identical to
the correct pointer at some point.

All I had intended by my example was to indicate the range of data I wanted
access to. One of the more difficult things about using the STL model is
that the association between iterators, and the objects they are designed
to work in conjunction with is not apparent from looking at the object.
This is one of the advantages to object oriented programming. The
functions that have access to member data are present in the class
interface. In the case of std::streambuf<> I cannot simply look at the
interface and see that std::istreambuf_iterator<> is available in the
library. The problem represents the downside of loose coupling. In Java,
iterators are typically provided by a factory method of the container.
This is somewhat similar to the STL containers that provide iterator
typedefs as well as begin() and end() as part of their definition.

In the US we have a traditional children's story called _Hansel and Gretel_,
about two children who went for a walk in the woods. They were afraid they
would get lost so they left a trail of breadcrumbs. We often use that as a
metaphor for something done intentionally so that a person can follow an
otherwise obscure path. That's what having the iterator in the interface
of the collection provides, a trail of breadcrumbs that tells you there is
an iterator available for that class.

Now consider a typical STL algorithm. You cannot look at an STL container
and see which algorithms are associated with it. Of course, once you
become familiar with the library that isn't that much of a problem. The
problem is that it takes longer to learn the library because it's harder to
find the information and how the different parts fit together. That is
what was missing from std::streambuf<>.

Dietmar said:
I guess you are talking of Nico's book in which case I would have written
or translated the IOStream portion. I don't have the book at hand and
thus I can't check what is said on page 676. However, I doubt that it
states that eback() is any good for a stream buffer user or that it is
the right thing to get the start of a buffer.

No, it merely shows where the beginning of the available data is. Figure
13.5 and associated text.

In addition, it is the wrong approach to the problem anyway: to get an iterator
for a stream buffer you use 'std::istreambuf_iterator<...>'.

Yes. I now know that. Thank you for posting that example yesterday.
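
For readers following along, here is a minimal sketch of the approach being recommended: reading everything that remains in a stream through the public interface only, with no access to eback(), gptr(), or egptr(). The names read_all and read_all_demo are made up for this illustration and do not come from the thread or from any library.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Copy every remaining character of a stream into a vector using only
// the public interface of the stream and its buffer.
std::vector<char> read_all(std::istream& in) {
    std::istreambuf_iterator<char> first(in);   // starts at the current read position
    std::istreambuf_iterator<char> last;        // default-constructed: end of stream
    return std::vector<char>(first, last);
}

// Small self-check on an in-memory stream (hypothetical helper).
inline void read_all_demo() {
    std::istringstream in("ab\ncd");
    std::vector<char> data = read_all(in);
    assert(data.size() == 5);
    assert(data[2] == '\n');   // whitespace is delivered, not skipped
}
```

The same function works for a std::ifstream opened with std::ios::binary, since it only depends on std::istream.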

Right, I forgot that you keep phrasing your statements vaguely, up to the
point of being meaningless. Have you considered becoming a politician?
These people's job is to talk nonsense the whole day, entertaining the
public with the search for any concrete statement the politicians cannot
worm their way out of.

Perhaps you are not finding meaning in what I am saying because you believe
my objectives are different than they are. I'm trying to explain to you
something that you cannot see. That is, what the current C++ looks like to
a person who is learning it. I believe that is a valuable perspective for
a person involved in design to understand.

I think uttering 'std::istreambuf_iterator<char>()' does not take that
much effort or intellectual capacity...

See above. Also consider that it is an easy pitfall to try and use

I'd say that

Expression                      Effect
istreambuf_iterator<char>()     Creates an end-of-stream iterator

in the mentioned paragraph is a pretty clear statement of how to get
the end iterator for a stream buffer... (I have an electronic version
of the book and thus can check for paragraphs but not for page numbers).

http://baldur.globalsymmetry.com/~hattons/Standard/doc/html/inherit__graph__35.png
I trust you were able to go to the parent directory and find the doxygen
output. Sorry about that, I decided to update the files and forgot about
the consequence it might have on the file names.
This is basically the same diagram:
http://gcc.gnu.org/onlinedocs/libstdc++/latest-doxygen/inherit__graph__98.png

This is a broken link. Anyway, you don't want to educate me about the
IOStreams or stream buffer class hierarchies, do you? I know pretty
well how the C++ I/O mechanisms work, you know. ... and I think they
are pretty simple to use once you have accepted the basic ideas of
using extractors and inserters and/or the use of sequences.

And knowing which iterators are available, as well as what flags to use to
modify the behavior of the stream. Also relevant is the distinction
between formatted and unformatted operations. It is well advised also to
understand that unformatted streams may still alter the underlying data,
and that formatted operations may alter the data in a so-called "binary"
stream. It is easy to put the stream into "stupid" format states where the
wrong combination of flags is set causing the numeric format to behave as
if the default were set. strm.clear(flag) unsets everything _but_ flag.
strm.setf(flags) doesn't mean `set flags' even though that's what it does.
Then there is 'basic_ios' versus 'ios_base'. std::string::clear() does
something completely different than std::stringstream::clear(). Reading a
stream completely and correctly puts it into a fail state.
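
Two of the points above can be checked directly. The following is a small sketch, not taken from the thread, using an in-memory stream: reading all the integers in a stream leaves it with both eofbit and failbit set, and clear(flag) replaces the whole state with exactly that flag. The function name stream_state_demo is invented for the example.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Returns the number of ints extracted; the asserts record the state the
// stream is in after each step.
int stream_state_demo() {
    std::istringstream in("1 2 3");
    int value = 0;
    int count = 0;
    while (in >> value) ++count;          // the fourth extraction fails at end of input

    assert(in.eof());                     // eofbit: end of input was reached
    assert(in.fail());                    // failbit: the last extraction produced nothing

    in.clear();                           // no argument: state becomes goodbit
    assert(in.good());

    in.clear(std::ios_base::eofbit);      // state becomes exactly eofbit
    assert(in.eof() && !in.fail());

    in.setstate(std::ios_base::failbit);  // adds failbit and keeps eofbit
    assert(in.eof() && in.fail());
    return count;
}
```

setstate() is the member that adds a flag to the current state; clear() always overwrites it.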

Then there are the necessary complexities of the design to consider.

For most
of the normal work you don't have to understand these hierarchies at
all! You just operate on an 'std::istream' or an 'std::ostream' - you
can even get a 'std::istreambuf_iterator' with an 'std::istream'.

So what happens when I have a C-style file pointer returned from a library?
Can I use that with C++ I/O?

You pasted some code there, I didn't see any idea not to mention anything
having advantages (well, you didn't state what it would have advantages
over...).

The advantage to that code is simple. It provides support for code
completion. I now realize it can be improved by subgrouping flags
according to their related fields. It's a very similar concept to scoped
enumerators. The biggest drawback to adding anything like that is that
there are already too many ways to do the same thing. It would add clutter
to an already cluttered interface. There are several things that could be
done with the basic concept that would provide a more usable interface than
the existing one. Unfortunately, the existing interface will remain, no
matter what.

Julián Albo · Jul 30, 2005

Steven said:
access to. One of the more difficult things about using the STL model is
that the association between iterators, and the objects they are designed
to work in conjunction with is not apparent from looking at the object.
This is one of the advantages to object oriented programming. The
functions that have access to member data are present in the class
interface. In the case of std::streambuf<> I cannot simply look at the
interface and see that std::istreambuf_iterator<> is available in the
library. The problem represents the downside of loose coupling. In Java,
iterators are typically provided by a factory method of the container.
This is somewhat similar to the STL containers that provide iterator
typedefs as well as begin() and end() as part of their definition.

And what is the container in that case? You seem to think that the iterator
is in some way part of the iterated thing, and that is not necessarily
true. Do you want that whenever anyone wants to write a new type of
iteration over some thing, that thing must be modified? For the sole
benefit that you can know all the iterations available by looking at its
header? Sorry, no. I will not use a language with design guidelines like
that.

I want a language written under the assumption that programmers can read
documentation. Perhaps this is what Stroustrup refers to when he says
"serious programmer"?

Steven T. Hatton · Jul 30, 2005

Dietmar said:
My understanding is that interfaces cannot be copyrighted. The actual file
can, the semantics cannot. That is, if someone specified a library having
the same interface and semantics as some other library, using his own
words for the description, this would not be a copyright infringement.

However, "taking" something is not how the C++ committee operates, anyway.
The details of how the C++ committee operates were discussed publicly in
the past. That is, you could have had all necessary information but you
decided to make insulting statements instead.

It was not at all intended to be insulting. I am baffled about how you could
find that insulting. I was simply thinking about this statement by the
inventor of the I/O streams:

"A major goal in my original design was that it be extensible in interesting
ways. In particular, in the stream library the streambuf class was an
implementation detail, but in the iostream library I intended it to be a
usable class in its own right. I was hoping for the promulgation of many
streambufs with varied functionality. I wrote a few myself, but almost no
one else did. I answered many more questions of the form "how do I make my
numbers look like this" than "how do I write a streambuf". And textbook
authors also tended to ignore streambufs. Apparently they did not share my
view that the architecture of the input/output library was an interesting
case study."

I realized that there are a lot of things available that people have open
sourced. I tried to think of a reason why there wasn't that kind of
contribution to the C++ Standard Library. What I was really suggesting is
that some of the people who had created these open source libraries would
be happy to have them used, but there might be legal issues that precluded
that.

You are looking for insults where there are none. When I insult you, you
will know it. Trust me.

Steven T. Hatton · Jul 30, 2005

Julián Albo said:
And what is the container in that case? You seem to think that the
iterator is in some way part of the iterated thing, and that is not
necessarily true. Do you want that whenever anyone wants to write a new type
of iteration over some thing, that thing must be modified? For the sole
benefit that you can know all the iterations available by looking at its
header? Sorry, no. I will not use a language with design guidelines like
that.

You seem to be confused between my making an observation with the objective
of identifying problem areas so that improvements might be made, and the
false idea that I expect absolute adherence to some design principle. I
was stating facts as I see them.

I want a language written under the assumption that programmers can read
documentation. Perhaps this is what Stroustrup refers to when he says
"serious programmer"?

"Let me outline the program development environment I'd like for C++. First
of all, I want incremental compilation. When I make a minor change, I want
the ``system'' to note that it was minor, and have the new version compiled
and ready to run in a second. Similarly, I want simple requests, such as
``Show me the declaration of this f?'' ``what fs are in scope here?''
``what is the resolution of this use of +?'' ``Which classes are derived
from class Shape?'' and ``what destructors are called at the end of this
block?'' answered in a second.

"A C++ program contains a wealth of information that in a typical
environment is available only to a compiler. I want that information at
the programmer's fingertips." - D&E § 9.4.4 /Beyond Files and Syntax/

Dietmar Kuehl · Jul 30, 2005

Steven said:
In the US we have a traditional children's story called _Hansel and
Gretel_,

I'd guess that it actually originates from Germany, at least with this
name (which actually is "Hänsel und Gretel").

Now consider a typical STL algorithm. You cannot look at an STL container
and see which algorithms are associated with it.

That is easy to figure out because the answer is simple and generic:
if there are no algorithms in form of member functions, there is no
algorithm associated with a container. That is because the algorithms
are generic and independent of the container. Actually, they don't
even require a container in the first place. Essentially, you can use
all algorithms you get appropriate iterators for, i.e. you would
figure out which iterators exist or which you want to create. This
provides the scope of algorithms. Of course, the set of iterators
related to a container is also open: beyond a simple iteration over
all elements you can have e.g. filtering iterators which skip certain
elements.
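
As an illustration of the point that algorithms need iterators rather than containers, here is a small sketch that runs std::count directly over a stream's characters. The helper names count_newlines and count_newlines_demo are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>

// Count newline characters straight from a stream: std::count only needs
// a pair of input iterators, and no container is involved.
std::size_t count_newlines(std::istream& in) {
    std::istreambuf_iterator<char> first(in), last;
    return static_cast<std::size_t>(std::count(first, last, '\n'));
}

// Self-check on an in-memory stream (hypothetical helper).
inline void count_newlines_demo() {
    std::istringstream in("one\ntwo\nthree");
    assert(count_newlines(in) == 2u);
}
```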

That is what was missing from std::streambuf<>.

Well, the stream buffer iterators are entirely independent from
'std::streambuf'! They don't use any private or protected members but
the normal public interface. Thus, it makes no sense to reference
them from their members.

No, it merely shows where the beginning of the available data is. Figure
13.5 and associated text.

Neither the picture nor the text implies, however, that 'eback()' is
in some form the beginning of the input. It is the beginning of the input
buffer but this is quite different from the input. For typical stream
buffers the input comes to rest at least temporarily in the input
buffer but that's all.

I'm trying to explain to you
something that you cannot see. That is, what the current C++ looks like
to a person who is learning it. I believe that is a valuable perspective
for a person involved in design to understand.

I think I have a pretty good view of how C++ looks to people learning
it, especially when it comes to the library part. ...and, indeed, it
is non-trivial and it has its own idioms which are different from
idioms used in most other languages. What is kind of news to me
is that somebody learning a language insists on twisting it to become
a different language...

And knowing which iterators are available, as well as what flags to use to
modify the behavior of the stream. Also relevant is the distinction
between formatted and unformatted operations. It is well advised also to
understand that unformatted streams may still alter the underlying data,
and that formatted operations may alter the data in a so-called "binary"
stream. It is easy to put the stream into "stupid" format states where
the wrong combination of flags is set causing the numeric format to behave
as if the default were set. strm.clear(flag) unsets everything _but_ flag.
strm.setf(flags) doesn't mean `set flags' even though that's what it does.
Then there is 'basic_ios' versus 'ios_base'. std::string::clear() does
something completely different than std::stringstream::clear(). Reading a
stream completely and correctly puts it into a fail state.

Then there are the necessary complexities of the design to consider.

Apparently we have rather different approaches to learning about a
language and/or a library. My approach is to get an overview of the
whole language or library and then obtain the details once I need
them. Your approach seems to pick whatever looks like it could be
useful for a task and then twist it trying to make it what you want
it to. For the binary formatting stuff it would have been obvious that
there is no direct support and what to do about it. No need to touch
even a single formatting flag and only a need to adjust the open mode
for files.

So what happens when I have a C-style file pointer returned from a
library? Can I use that with C++ I/O?

Sure you can. You would wrap it into a stream buffer. Actually, this
is how 'std::basic_filebuf' is implemented on some systems...
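
A minimal sketch of what such a wrapper might look like for input only, assuming the caller keeps ownership of the FILE*. The class name stdio_inbuf, the 256-byte buffer size, and the demo helper are all hypothetical; this is not how any particular std::basic_filebuf is written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <istream>
#include <streambuf>
#include <string>

// Hypothetical input-only stream buffer reading from a C FILE*.
// It does not own the FILE*; the caller remains responsible for fclose().
class stdio_inbuf : public std::streambuf {
public:
    explicit stdio_inbuf(std::FILE* file) : file_(file) {
        setg(buffer_, buffer_, buffer_);   // start with an empty get area
    }

protected:
    // Called when the get area is exhausted: refill it from the FILE*.
    int_type underflow() {
        if (gptr() < egptr())
            return traits_type::to_int_type(*gptr());
        std::size_t n = std::fread(buffer_, 1, sizeof buffer_, file_);
        if (n == 0)
            return traits_type::eof();
        setg(buffer_, buffer_, buffer_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::FILE* file_;
    char buffer_[256];
};

// Self-check: write a line through the C API, read it back through C++.
inline std::string stdio_inbuf_demo() {
    std::FILE* f = std::tmpfile();
    assert(f != 0);
    std::fputs("hello from C\n", f);
    std::rewind(f);

    stdio_inbuf buf(f);
    std::istream in(&buf);
    std::string line;
    std::getline(in, line);

    std::fclose(f);
    return line;
}
```

Only underflow() has to be overridden for reading; std::istream and std::istreambuf_iterator then work on top of the buffer unchanged.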

The advantage to that code is simple. It provides support for code
completion.

Code completion is good for the knowledgeable who need to be reminded.
It is evil for everybody else. In summary, I think it is better removed
from development tools because the knowledgable are quick enough in
locating the stuff anyway.

There are several things that could be
done with the basic concept that would provide a more usable interface
than the existing one.

Indeed, there are. Somehow I seriously doubt that the improvements
you can imagine have even a tangential intersection with the improvements
I can imagine...

Julián Albo · Jul 30, 2005

Steven said:
You seem to be confused between my making an observation with the
objective of identifying problem areas so that improvements might be made,
and the false idea that I expect absolute adherence to some design
principle. I was stating facts as I see them.

Then I think you are wasting your time. And ours.

Dietmar Kuehl · Jul 30, 2005

Steven said:
It was not at all intended to be insulting. I am baffled about how you
could find that insulting.

I guess you didn't realize with whom you were arguing: P.J. Plauger
was a long-term chair of the library working group and major parts of
the original standard C++ library specification were written by him.
You explicitly stated that the current specification is inferior to
other options and that it is just due to licensing issues. I consider
this a pretty severe insult.

I was simply thinking about this statement by the
inventor of the I/O streams:

"A major goal in my original design was that it be extensible in
interesting ways. In particular, in the stream library the streambuf class
was an implementation detail, but in the iostream library I intended it to
be a usable class in its own right. I was hoping for the promulgation of
many streambufs with varied functionality. I wrote a few myself, but
almost no one else did. I answered many more questions of the form "how do
I make my numbers look like this" than "how do I write a streambuf". And
textbook authors also tended to ignore streambufs. Apparently they did not
share my view that the architecture of the input/output library was an
interesting case study."

I think I can understand how Jerry was feeling about it and, without
knowing it (I think I haven't seen this quote before), I helped correct
this issue, e.g. by explaining how to implement stream buffers on Usenet
and in Nico's book.

You are looking for insults where there are none. When I insult you, you
will know it. Trust me.

I consider it an insult when saying to someone involved in the creation
of a specification that he did a far inferior job.

Steven T. Hatton · Jul 30, 2005

Dietmar said:
I'd guess that it actually originates from Germany, at least with this
name (which actually is "Hänsel und Gretel").

I assume you are familiar with the metaphor.

Well, the stream buffer iterators are entirely independent from
'std::streambuf'! They don't use any private or protected members but
the normal public interface. Thus, it makes no sense to reference
them from their members.

I would say that it does not follow a dominant concept in OO design. I do
believe it makes sense to do /something/ to show there is an association
between the stream buffer and the iterator. What that something is, I am
not sure.

Though it probably doesn't apply here, this situation is related to the
reason I suggested there be an additional access specifier called
'associated'. An associated member of a class would not have privileged
access to any of the class members. It would simply add the function to
the view we have of a class. I further suggested that classes which shared
a particular function, say an overload of the '*' operator, might be
derived from a common base which provided the operator. A vector and a
matrix might fit this category.

I submit that the fact that the iterator is external to the stream buffer
*_may_* indicate it lives at the wrong level of abstraction.

Neither the picture nor the text implies, however, that 'eback()' is
in some form the beginning of the input. It is the beginning of the input
buffer but this is quite different from the input. For typical stream
buffers the input comes to rest at least temporarily in the input
buffer but that's all.

What about a file?

I think I have a pretty good view of how C++ looks to people learning
it, especially when it comes to the library part. ...and, indeed, it
is non-trivial and it has its own idioms which are different from
idioms used in most other languages. What is kind of news to me
is that somebody learning a language insists on twisting it to become
a different language...

Me?! I reject the accusation! This may not be perfect, but you tell me what
language it's written in:

#include <iostream>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>

namespace hexlite {
  using namespace std;
  typedef string::const_iterator c_itr;

  // Print the bytes in [start, stop) as two-digit values separated by spaces.
  ostream& printline(c_itr start, c_itr stop, ostream& out) {
    while(start<stop) out
      <<setw(2)
      <<(static_cast<unsigned int>(static_cast<unsigned char>(*start++)))<<" ";

    return out;
  }

  // Dump dataString as hex, 16 bytes per line.
  ostream& dump(const string& dataString, ostream& out) {

    // A second stream on the same buffer keeps the hex formatting
    // state from leaking into 'out'.
    ostream hexout(out.rdbuf());
    hexout.setf(ios::hex, ios::basefield);
    hexout.fill('0');

    c_itr from (dataString.begin());
    c_itr dataEnd (from + dataString.size());
    c_itr end (dataEnd - (dataString.size()%16));

    for(c_itr start = from; start < end; start += 16)
      printline(start, start + 16, hexout)<<endl;

    printline(end, dataEnd, hexout)<<endl;
    return out;
  }
}

int main(int argc, char* argv[]) {
  // argc counts the program name, so a file name requires argc >= 2.
  if (argc < 2) { std::cerr<<"enter a file name"<<std::endl; return -1; }

  std::ifstream inf(argv[1],std::ios::binary);
  if(inf) {
    std::ostringstream oss;
    oss << inf.rdbuf();   // copy the whole file, unformatted
    hexlite::dump(oss.str(), std::cout);
    return 0;
  }
  std::cerr <<"\nCan't open file:"<<argv[1]<<std::endl;
  return -1;
}

That started out as 284 lines of "C++" code. There's a small amount of
functionality I would need to add to get it back to the original target
functionality. The original code had a problem which is the problem I was
trying to solve when I started this thread.

Apparently we have rather different approaches to learning about a
language and/or a library. My approach is to get an overview of the
whole language or library and then obtain the details once I need
them. Your approach seems to pick whatever looks like it could be
useful for a task and then twist it trying to make it what you want
it to.

I would not have gone nearly as deep into the I/O at this point had it not
been for the number of replies that were of the nature of:

'You idiot! Why are you using a char not an unsigned char?'
'You idiot! Why are you using an unsigned char and not a char?'
'You idiot! Why are you using a stringstream and not a vector?'
etc.

For the binary formatting stuff it would have been obvious that
there is no direct support and what to do about it. No need to touch
even a single formatting flag and only a need to adjust the open mode
for files.

If I had found this earlier, it would have saved me a lot of time:

http://gcc.gnu.org/onlinedocs/libstdc++/27_io/howto.html

How to get an unadulterated bit stream is not very clear from having a
simple overview of the library. I also chose two bad places to look for
the information.
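
For what it is worth, the recipe discussed in this thread can be sketched as follows: open the file with std::ios::binary and move the bytes through the stream buffer, so that no formatted operation touches them. The function name binary_round_trip and the scratch-file approach are made up for this illustration.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <ios>
#include <iterator>
#include <string>

// Write 'bytes' to 'path' and read them back, both in binary mode and
// both through the stream buffer, so no formatting layer is involved.
std::string binary_round_trip(const char* path, const std::string& bytes) {
    {
        std::ofstream out(path, std::ios::out | std::ios::binary | std::ios::trunc);
        out.rdbuf()->sputn(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // closing the file flushes it

    std::string result;
    {
        std::ifstream in(path, std::ios::in | std::ios::binary);
        std::istreambuf_iterator<char> first(in), last;
        result.assign(first, last);
    }
    std::remove(path);   // clean up the scratch file
    return result;
}
```

Bytes such as '\n', '\r', and 0x1A are the ones a text-mode stream may translate on some platforms, so they make a reasonable test payload.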

Indeed, there are. Somehow I seriously doubt that the improvements
you can imagine have even a tangential intersection with the improvements
I can imagine...

How many years of C and C++ experience do you have?

Steven T. Hatton · Jul 30, 2005

Dietmar said:
I guess you didn't realize with whom you were arguing: P.J. Plauger
was a long-term chair of the library working group and major parts of
the original standard C++ library specification were written by him.
You explicitly stated that the current specification is inferior to
other options and that it is just due to licensing issues. I consider
this a pretty severe insult.

No. I merely suggested there were some features of the other implementation
that had some attractiveness. I specifically made it clear that directly
emulating the Java I/O would break C++ I/O. If I challenged him to rethink
the current implementation, and he found that insulting, so be it.

I think I can understand how Jerry was feeling about it and, without
knowing it (I think I haven't seen this quote before), I helped correct
this issue, e.g. by explaining how to implement stream buffers on Usenet
and in Nico's book.

I consider it an insult when saying to someone involved in the creation
of a specification that he did a far inferior job.

But I did not say that. What I was specifically thinking about was the way
Java I/O interfaces fit together as "filters". That's the part I like. I
fully agree, the character level stuff like reading in mixed character based
and numeric based data plain sucks in Java when compared to C++. Now, in
comparison to C... Consider something else, I bet my entire future on
learning C++, when I had a reasonable grasp of Java in a very lucrative
area to start with.

Dietmar Kuehl · Jul 31, 2005

Steven said:
Though it probably doesn't apply here, this situation is related to the
reason I suggested there be an additional access specifier called
'associated'. An associated member of a class would not have privileged
access to any of the class members.

The problem is not that much how to name any access specifier or whatever.
The problem is that you would need to open the class every time you added
something which could possibly be used with that class. ... and this is just
the tip of the iceberg: in C++, you can even use many things never
directly intended to be used with a specific class. That is, if a class
is implemented to conform to a certain concept, it can be used everywhere
this concept is used.

I submit that the fact that the iterator is external to the stream buffer
*_may_* indicate it lives at the wrong level of abstraction.

Not at all! Actually, making it an internal aspect of a stream buffer
would probably be an error! In fact, it is quite common to create iterators
for entities which were never intended to be used with iterators, often due
to the fact that they were created when [at least STL] iterators were not
invented (e.g. something like UNIX readdir(3) has a fair chance to predate
the overall concept of an iterator but there are STL iterators using it).
C++ is not built around object orientation (anymore, at least) because it
turned out that object orientation does not solve all problems.

What about a file?

Specifically files will come to rest only blockwise in the file buffer's
character buffer. Well, the exact size depends on the implementation but
processing files in multiples of the file system's block size is a natural
thing to do. Of course, a file buffer implemented on top of 'FILE*' which
is also a viable approach may never buffer more than just one character in
the file buffer and delegate the actual buffering to the 'FILE*'
functions.

Me?! I reject the accusation!

Apparently, there are two persons then who post under the name "Steve
T. Hatton". For a reference you might want to have a look at the
following articles where this other "Steve T. Hatton" at least wants to
discuss changes to C++:

<http://groups-beta.google.com/group/comp.std.c++/msg/e9fc2c6b9d11453e?hl=en&>
<http://groups-beta.google.com/group/comp.std.c++/msg/2e7702a4ff7b098d?hl=en&>
<http://groups-beta.google.com/group/comp.lang.c++/msg/f5bb61a26494adf9?hl=en&>
<http://groups.google.com/group/comp.lang.c++/msg/1ec45f94a87ac66d?hl=en&>

Interestingly enough, the last message is actually from this very thread.
I would have thought that you had realized that a different person is
posting under the name you do...

I would not have gone nearly as deep into the I/O at this point had it not
been for the number of replies that were of the nature of:

'You idiot! Why are you using a char not an unsigned char?'
'You idiot! Why are you using an unsigned char and not a char?'
'You idiot! Why are you using a stringstream and not a vector?'
etc.

Well, assuming your perception is that I implicitly called you an idiot,
I think you are wrong. Generally, I just point people into the correct
direction and possibly state why the thing they did didn't work. Of
course, this is made easier if they stated what they wanted rather than
showing their attempt at a solution: if the specification of the problem
is not stated it is essentially impossible to figure the exact goals out
from a wrong attempt.

How many years of C and C++ experience do you have?

I don't know how this matters but it is fairly easy to find these numbers
out (with a little bit of calculation) since I have posted them in the
past, anyway. I have been programming in C++ for about 15 years and I
started with C a little bit earlier.

Dietmar Kuehl · Jul 31, 2005

Steven said:
But I did not say that.

You might want to reread the articles in this thread hierarchy. In a rush
I can't find a single statement which makes this statement but if I just
sum up what I reread right now, I come to the conclusion that you did.

What I was specifically thinking about

What you think is something different from what people read in UseNet
from your writing.

was the way Java I/O interfaces fit together as "filters". That's the
part I like.

Great. C++ has had this for ages. Of course, the C++ library distracts a
little from this fact by bundling up commonly used configurations into
a simple class.
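
To make the "filter" idea concrete, here is a minimal sketch of a stream buffer that transforms characters and forwards them to another stream buffer. The class name upper_filterbuf and the upper-casing behaviour are invented for the example; any per-character transformation could be chained the same way.

```cpp
#include <cassert>
#include <cctype>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical unbuffered output filter: upper-cases each character and
// forwards it to another stream buffer.
class upper_filterbuf : public std::streambuf {
public:
    explicit upper_filterbuf(std::streambuf* sink) : sink_(sink) {}

protected:
    // With no put area set, every character written arrives here.
    int_type overflow(int_type c) {
        if (traits_type::eq_int_type(c, traits_type::eof()))
            return traits_type::not_eof(c);
        unsigned char uc = static_cast<unsigned char>(traits_type::to_char_type(c));
        char up = static_cast<char>(std::toupper(uc));
        return sink_->sputc(up);
    }

private:
    std::streambuf* sink_;
};

// Self-check: formatted output passes through the filter into a string.
inline std::string upper_filter_demo() {
    std::ostringstream sink;
    upper_filterbuf filter(sink.rdbuf());
    std::ostream out(&filter);
    out << "binary " << 42;
    return sink.str();
}
```

Because the sink is just a std::streambuf*, filters of this kind can be stacked on a file buffer, a string buffer, or another filter.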

Julián Albo · Jul 31, 2005

Dietmar said:
Apparently, there are two persons then who post under the name "Steve
T. Hatton". For a reference you might want to have a look at the
following articles where this other "Steve T. Hatton" at least wants to
discuss changes to C++:

Maybe the accusation being rejected is that of learning C++ X-)

Steven T. Hatton · Aug 1, 2005

Dietmar said:
You might want to reread the articles in this thread hierarchy. In a rush
I can't find a single statement which makes this statement but if I just
sum up what I reread right now, I come to the conclusion that you did.

If I was giving the impression that I believe there is something very
problematic with the C++ I/O implementation, then I communicated
effectively. The very fact that the statements quoted below[*] were made
indicates to me there are some serious shortcomings with C++ I/O.

When I read Stroustrup's treatment of I/O, he said "It is possible to define
streams for which the physical I/O is not done in terms of characters.
However, such streams are beyond the scope of the C++ standard and beyond
the scope of this book." - §21.2.1. Now, it's not clear to me what,
exactly, that means. Is he talking about streams where the data type is
other than type char, or streams where the data is not processed in terms
of character sets? I suspect the answer would be "yes".

When I read Josuttis's book, I don't recall anywhere that clearly explains
the distinction between processing character based data and processing
"raw" data. All the details may be there, but I would be hard-pressed to
find them. One person told me that I should use the std::istream::read()
and std::ostream::write() functions. But another person wrote this:

"... and, BTW, using 'std::istream::read()' and 'std::ostream::write()'
is almost certainly the *wrong* approach! These functions are an
historical mistake which should have been corrected in the standard:..."

Plauger's documentation, the place I looked first for an answer, clued me
in to the idea that the openmode of the stream would be important, but it
did not discuss the ramifications of using formatting vs. non-formatting
functions. The impression it gave was that if I open a stream in binary
mode, it will faithfully reproduce the data exactly as it exists prior to
passing through the stream. The documentation then went on to discuss I/O
using functions from the C libraries.

I don't recall any discussion in Accelerated C++ about working with
unformatted binary data. A while back I tried reading the section in the
Standard, but I found it difficult to follow because of the wording
regarding the structure of the clause, so I never finished reading
the clause. I now believe that I would have benefited from actually taking
the time to figure it all out.

I have been told that I should have been working with unsigned char, I've
also been told I was wrong to be working with unsigned char. I've been
told I can set the stream to binary and use the extractor functions if I
unset the formatting bits. I've also been told that this will not work
(and I believe there could be performance issues with this as well). See
above for a similar example with read() and write().

What you think is something different from what people read in UseNet
from your writing.

Well, there is the problem that what people read into my posts is colored by
what other people write. When I read the comments of Jerry Schwartz
regarding the scarcity of new stream classes, in conjunction with a comment
suggesting that I was free to contribute my own, I suggested that part of
the reason there had not been as much contribution to the C++ Standard
Library was because of the nature of the licensing of libraries that I know
to exist. I was accused of arrogance for that. The fact of the matter is,
I was merely extrapolating from this:

"How is the Boost license different from the GNU General Public License
(GPL)? The Boost license permits the creation of derivative works for
commercial or non-commercial use with no legal requirement to release your
source code."

The next thing I know Plauger, and others are also accusing me of arrogance
for having made that statement.

Great. C++ has had this for ages. Of course, the C++ library distracts a
little from this fact by bundling up commonly used configurations into
a simple class.

To some extent the C++ I/O library is a kludge. That is simply a result of
history. That's the way things happen when new designs are built directly
on top of older ones. There are good reasons people did things this way.
Nonetheless, it causes problems.

To a considerable extent, the C++ iostream classes _can_ be understood as
filters. The stream buffers represent an unfiltered stream, and the stream
classes represent filters. One thing that seems out of order about this
view is that the openmode is set at a higher level than the level at which
unformatted I/O is performed.

[*]
http://gcc.gnu.org/onlinedocs/libstdc++/27_io/howto.html#3
<quote>
Binary I/O
The first and most important thing to remember about binary I/O is that
opening a file with ios::binary is not, repeat not, the only thing you have
to do. It is not a silver bullet, and will not allow you to use the <</>>
operators of the normal fstreams to do binary I/O.
Sorry. Them's the breaks.
This isn't going to try and be a complete tutorial on reading and writing
binary files (because "binary" covers a lot of ground), but we will try and
clear up a couple of misconceptions and common errors.
First, ios::binary has exactly one defined effect, no more and no less.
Normal text mode has to be concerned with the newline characters, and the
runtime system will translate between (for example) '\n' and the
appropriate end-of-line sequence (LF on Unix, CRLF on DOS, CR on Macintosh,
etc). (There are other things that normal mode does, but that's the most
obvious.) Opening a file in binary mode disables this conversion, so
reading a CRLF sequence under Windows won't accidentally get mapped to a
'\n' character, etc. Binary mode is not supposed to suddenly give you a
bitstream, and if it is doing so in your program then you've discovered a
bug in your vendor's compiler (or some other part of the C++
implementation, possibly the runtime system).
Second, using << to write and >> to read isn't going to work with the
standard file stream classes, even if you use skipws during reading. Why
not? Because ifstream and ofstream exist for the purpose of formatting, not
reading and writing. Their job is to interpret the data into text
characters, and that's exactly what you don't want to happen during binary
I/O.
Third, using the get() and put()/write() member functions still aren't
guaranteed to help you. These are "unformatted" I/O functions, but still
character-based. (This may or may not be what you want, see below.)
</quote>
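
To make the quoted point about the character-based "unformatted" functions concrete, here is a minimal sketch of a byte round trip through write() and read() on streams opened with ios::binary. The helper names write_bytes and read_bytes are made up for illustration and are not part of any library; the sketch also assumes that char is an acceptable stand-in for a byte on the platform in question.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: write the given chars to `path` in binary mode.
// Returns false if the file cannot be opened or the write fails.
bool write_bytes(const std::string& path, const std::vector<char>& bytes)
{
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    out.flush();
    return out.good();
}

// Hypothetical helper: read a whole file back with the unformatted read().
// gcount() reports how many chars the last read() actually delivered,
// which matters for the final, partially filled chunk.
std::vector<char> read_bytes(const std::string& path)
{
    std::vector<char> result;
    std::ifstream in(path.c_str(), std::ios::binary);
    char chunk[4096];
    while (in) {
        in.read(chunk, sizeof chunk);
        result.insert(result.end(), chunk, chunk + in.gcount());
    }
    return result;
}
```

Writing a sequence such as 'a', CR, LF, NUL, 'b' with write_bytes() and reading it back with read_bytes() should give the same sequence, because binary mode switches off the newline translation described in the quote.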

Steven T. Hatton · Aug 1, 2005

Dietmar said:
The problem is not that much how to name any access specifier or whatever.
The problem is that you would need to open the class every time you added
something which could possibly be used with that class.

No. There would be no requirement that all operations on the class be done
through associated methods. This would simply be a means of collecting
things which are logically related into self-contained units.

... and this is just
the tip of the iceberg: in C++, you can even use many things never
directly intended to be used with a specific class. That is, if a class
is implemented to conform to a certain concept, it can be used everywhere
this concept is used.

But if there were some hypothetical tag to indicate explicitly that a
certain concept were expressed by a given class, and another tag to
indicate that a different class requires that concept, then there might be
a way of automatically generating a mapping between the two kinds of
class. Such a technique might be useful as a means of meta-type-checking.

Say, for example, if your iterators have traits indicating their category,
your container, or other class could have similar traits indicating what
type of iteration they support. But then...

This is why I laugh when I think of template parameters as meta-types.

I submit that the fact that the iterator is external to the stream buffer
*_may_* indicate it lives at the wrong level of abstraction.

Not at all! Actually, making it an internal aspect of a stream buffer
would probably be an error! In fact, it is quite common to create
iterators for entities which were never intended to be used with iterator,
often due to the fact that they were created when [at least STL] iterators
were not invented (e.g. something like UNIX readdir(3) has a fair chance
to predate the overall concept of an iterator but there are STL iterators
using it). C++ is not built around object orientation (anymore, at least)
because it turned out that object orientation does not solve all problems.

eback(), gptr(), and egptr() /are/ iterators. They are, however, not
iterators that enforce the correct invariant for a user of the stream. I
understand the basic idea of what streambuf_iterators are doing. They are
enforcing an invariant for the stream while providing sequential access to
its elements. They exploit the public interface to do so.
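
For comparison with the eback()/egptr() range, this is roughly what the stream buffer iterators look like in use. It is only a sketch; the function name slurp is invented for the example.

```cpp
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Copy everything that is still unread in the stream into a vector.
// Only the public stream buffer interface is used: the iterator calls
// sgetc()/sbumpc() internally and never touches eback(), gptr() or egptr().
std::vector<char> slurp(std::istream& in)
{
    std::istreambuf_iterator<char> first(in.rdbuf());
    std::istreambuf_iterator<char> last;  // default-constructed: end of stream
    return std::vector<char>(first, last);
}
```

Because the iterator works below the formatting layer, whitespace and embedded NUL characters come through unchanged, and it does not matter how much of the file the buffer happens to hold at any moment.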

Specifically files will come to rest only blockwise in the file buffer's
character buffer. Well, the exact size depends on the implementation but
processing files in multiples of the file system's block size is a natural
thing to do. Of course, a file buffer implemented on top of 'FILE*' which
is also a viable approach may never buffer more than just one character in
the file buffer and delegate the actual buffering to the 'FILE*'
functions.

That makes sense. Now, if I open a file, read it into, say a string
stream, then clear() it and set seekg back to the beginning, under normal
circumstances (file is not huge) I believe the file is typically read in
one operation, and worked on in memory. But I really don't know how that
relates to the structure of the iostreams implementation on my system. I
would need to read the code, which I'm not going to do right now.

Apparently, there are two persons then who post under the name "Steve
T. Hatton". For a reference you might want to have a look at the
following articles where this other "Steve T. Hatton" at least wants to
discuss changes to C++:

<http://groups.google.com/group/comp.lang.c++/msg/1ec45f94a87ac66d?hl=en&>

Interestingly enough, the last message is actually from this very thread.
I would have thought that you had realized that a different person is
posting under the name you do...

None of that supports your assertion. In most of it, I made no suggestion
regarding any change to the language. Of particular interest is the post
discussing C++ classes that act like Java or C# classes, that is exactly
what I _did_not_ suggest.

Well, assuming your perception is that I called you implicitly an idiot,

No. There have been some rather 'interesting' comments from others,
however.

I think you are wrong. Generally, I just point people into the correct
direction and possibly state why the thing they did didn't work. Of
course, this is made easier if they stated what they wanted rather than
showing their attempt at a solution: if the specification of the problem
is not stated it is essentially impossible to figure the exact goals out
from a wrong attempt.

I don't know how this matters but it is fairly easy to find these numbers
out (with a little bit of calculation) since I have posted them in the
past, anyway. I'm programming about 15 years with C++ and I started C a
little bit earlier.

I got my first formal introduction to digital I/O at the bit-level in 1979.
I may not know C++'s I/O that well, and I have not focused on I/O at that
level for the past two and a half decades, but I do have a solid background
in the subject.

Dietmar Kuehl · Aug 1, 2005

Steven said:
But if there were some hypothetical tag to indicate explicitly that a
certain concept were expressed by a given class, and another tag to
indicate that a different class requires that concept, then there might be
a way of automatically generating a mapping between the two kinds of
class. Such a technique might be useful as a means of meta-type-checking.

We can agree on this. It is, however, not yet there but it is under
discussion: there have been two independent proposals to add concepts
to C++, mostly to do some form of concept checking and to remove the
need for certain kinds of meta programming. This does still not help
to locate e.g. the stream [buffer] iterators for streams because they
are implemented entirely independent from the stream. This can only be
located with some form of use dependency.

eback(), gptr(), and egptr() /are/ iterators. They are, however, not
iterators that enforce the correct invariant for a user of the stream.

They are an implementation detail of stream buffers used to communicate
between the base class and its derived classes. They are iterators but
they are only relevant to implementers of stream buffers. They should
never be used by something external to the stream buffer to access the
buffer (well, there is something for which they can be used to great
effect, namely for optimizations for segmented sequences; however, even
this optimization could be made to use a safe public interface).

That makes sense. Now, if I open a file, read it into, say a string
stream, then clear() it and set seekg back to the beginning, under normal
circumstances (file is not huge) I believe the file is typically read in
one operation, and worked on in memory.

The file stream buffer ('std::basic_filebuf<..>') is likely to hold a
buffer which is a multiple of the corresponding file system block size,
possibly plus a few additional characters to provide a put back facility.
The user of the file stream buffer has no real control over the size of
the buffer, in particular 'pubsetbuf()' is only guaranteed to work when
turning off the buffer (i.e. the call 'pubsetbuf(0, 0)'). Thus, it is
unsafe to assume that the internal buffer of file stream buffer holds
any particular content, e.g. the whole file. As mentioned before, the
buffer may actually consist of just one character (the wording in the
standard seems to imply that the buffer has to hold at least one
character if there is any although I consider this restriction
unnecessary and actually an error in the specification), e.g. because the
actual buffering is done by a 'FILE*'.
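
A sketch of what "not assuming anything about the internal buffer" can look like in code: the loop below pulls characters out through the public sgetn() member, which refills the buffer as often as the implementation needs to. The name drain and the deliberately tiny chunk size are choices made for this illustration only.

```cpp
#include <sstream>
#include <streambuf>
#include <string>

// Read everything a stream buffer can deliver, a few characters at a time.
// The result is the same whether the buffer internally holds one character,
// one file system block, or the whole source.
std::string drain(std::streambuf& sb)
{
    std::string result;
    char chunk[8];  // intentionally small: correctness must not depend on it
    for (;;) {
        std::streamsize n = sb.sgetn(chunk, sizeof chunk);
        if (n <= 0) break;  // nothing more available
        result.append(chunk, static_cast<std::string::size_type>(n));
    }
    return result;
}
```

The same function could be handed the result of rdbuf() on a file stream; a std::stringbuf is used in testing only because it needs no file.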

None of that supports your assertion. In most of it, I made no suggestion
regarding any change to the language.

My reading of those articles is that you indeed suggest changes to the
language (which in my reading includes the standard library).

Dietmar Kuehl · Aug 1, 2005

Steven said:
If I was giving the impression that I believe there is something very
problematic with the C++ I/O implementation, then I communicated
effectively. The very fact that the statements quoted below[*] were made
indicates to me there are some serious shortcomings with C++ I/O.

In my opinion there is only one major shortcoming with C++ I/O and this
is a lack of understanding. There seem to be only few people who really
understand the model or even want to understand it. This includes both
the technical details and the scope this library targets, although
neither is actually that hard to understand.

The first thing to note about C++ I/O (which also applies to C I/O) is
that it is very vague about things like standard input, standard output,
and files. There is a simple reason for this, though: the exact details
differ widely between different platforms and thus a definition of what
a file is and what kind of properties it has are essentially side
stepped. All that is assumed is that files have names (though no detail
is given about what valid names look like) and consist of sequences of bytes
which can be read or written. When writing a file and reading it back
in later, the bytes read should be identical to those which have been
written if the files were opened in binary mode. If one or both of
these files were not opened in binary mode, the bytes read may differ
from those written.

Note, that this does not say anything about the actual bytes stored in
the file! Although it is reasonable to assume that these are identical
to those written, this is not necessarily the case and may be dependent
on the platform.

The next thing to note about C++ I/O is that it addresses essentially
text formatted I/O. It is neither intended nor particularly well suited
for binary I/O although the latter can be done to some extent. This is
shown e.g by the fact that all I/O layers accessible to the user
operate in terms of characters, not in terms of bytes. Although the
normal character type ('char') is also often used to represent bytes,
it is important to keep these things mentally apart because the bytes
undergo a transformation to become characters, namely a locale specific
translation between bytes and characters. This is done at some lower
level in 'FILE*' or 'std::basic_filebuf'. In both cases it is guaranteed
that this conversion is the identity conversion when using the "C"
locale. That is, to get an unchanged sequence of bytes useful for some
form of binary I/O you would need to open the file in binary mode and
suppress any locale specific character transformation by using the "C"
locale.
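
The two steps named above (binary mode plus the "C" locale) might be combined like this. It is a sketch only, and open_for_bytes is a made-up name; the locale is imbued before the file is opened so that no conversion state from another locale is involved.

```cpp
#include <fstream>
#include <locale>
#include <string>

// Prepare a file stream for reading an unchanged sequence of bytes:
// 1. imbue the classic "C" locale, so the byte/character conversion done
//    by the file buffer is the identity conversion;
// 2. open in binary mode, so no end-of-line translation takes place.
bool open_for_bytes(std::ifstream& in, const std::string& path)
{
    in.imbue(std::locale::classic());
    in.open(path.c_str(), std::ios::in | std::ios::binary);
    return in.is_open();
}
```

If the program never changes the global locale, the imbue() call is redundant, since streams pick up the global locale on construction and that is "C" by default; spelling it out just makes the requirement visible.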

In particular the stream level hierarchy only makes sense for text
formatted I/O. Although it provides "unformatted" operations,
interaction between these and the formatted operations is a little bit
tricky (in particular, the formatted operations skip e.g. white space
characters while the unformatted do not; this is often a stumbling
block when switching between formatted and unformatted operations).
When you want to operate on binary files, it is much more reasonable to
use stream buffers directly and provide a similar hierarchy to the
streams for reading a particular binary format.
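
As an illustration of the last sentence, a reader for a binary format can sit directly on a stream buffer instead of on std::istream. The class below is a hypothetical sketch: the name binary_reader, the single read_u32_le() operation and the little-endian layout are all assumptions chosen for the example, not part of any existing library.

```cpp
#include <cstdint>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical reader for a binary format, layered on a stream buffer
// in the same way std::istream is layered on one for text.
class binary_reader {
public:
    explicit binary_reader(std::streambuf& sb) : sb_(sb), ok_(true) {}

    // Read a 32-bit unsigned value stored least significant byte first.
    // On a short read the reader is marked as failed and 0 is returned.
    std::uint32_t read_u32_le()
    {
        char raw[4];
        if (sb_.sgetn(raw, 4) != 4) { ok_ = false; return 0; }
        std::uint32_t value = 0;
        for (int i = 3; i >= 0; --i)
            value = (value << 8) | static_cast<unsigned char>(raw[i]);
        return value;
    }

    bool ok() const { return ok_; }

private:
    std::streambuf& sb_;
    bool ok_;
};
```

Assembling the value byte by byte keeps the result independent of the byte order of the machine running the code, which a plain read() into an integer object would not.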

Alf P. Steinbach · Aug 1, 2005

* Dietmar Kuehl:

In my opinion there is only one major shortcoming with C++ I/O and this
is a lack of understanding.
....

The next thing to note about C++ I/O is that it addresses essentially
text formatted I/O. It is neither intended nor particularly well suited
for binary I/O although the latter can be done to some extent. This is
shown e.g by the fact that all I/O layers accessible to the user
operate in terms of characters, not in terms of bytes.

Well then, considering that most of C++ i/o text input has formally
undefined behavior, I guess a lack of understanding that point is the major
shortcoming you're talking about?

Sorry, I couldn't resist. ;-)

P.J. Plauger · Aug 1, 2005

I have been told that I should have been working with unsigned char, I've
also been told I was wrong to be working with unsigned char.

No, you were told that it is *unnecessary* to work with unsigned char.
There are also reasons to avoid this type, but we didn't get that
far.

I've been
told I can set the stream to binary and use the extractor functions if I
unset the formatting bits. I've also been told that this will not work
(and I believe there could be performance issues with this as well).

No, you were told to use *certain* extractors. Most are geared
toward formatted I/O and are hence irrelevant. What you believe
about performance is uninteresting until you make measurements
to support it.

See
above for a similar example with read() and write().

Well, there is the problem that what people read into my posts is colored
by what other people write. When I read the comments of Jerry Schwartz
regarding the scarcity of new stream classes, in conjunction with a
comment suggesting that I was free to contribute my own, I suggested that
part of the reason there had not been as much contribution to the C++
Standard Library was because of the nature of the licensing of libraries
that I know to exist. I was accused of arrogance for that.

No, I accused you of arrogance for making the statement:

(Despite Dietmar's comments, I did not take personal offense at this
statement. My contributions to the design of the Standard C++ library
have been very small. I'm more of an implementer and explainer. But
the statement certainly shows a lack of appreciation for the many
real, and sometimes highly inventive, contributions made by others
to the library. That, to me, reflects arrogance, stupidity, or both.)

I've read enough of your posts now to see that you are not a very
clear thinker. You are even worse at understanding what you read,
particularly if it challenges any of your preconceived beliefs.
Thus, debating you is akin to shooting at liferafts.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Steven T. Hatton · Aug 1, 2005

Dietmar said:
Steven said:

If I was giving the impression that I believe there is something very
problematic with the C++ I/O implementation, then I communicated
effectively. The very fact that the statements quoted below[*] were made
indicates to me there are some serious shortcomings with C++ I/O.

In my opinion there is only one major shortcoming with C++ I/O and this
is a lack of understanding. There seem to be only few people who really
understand the model or even want to understand it. This includes both
the technical details and the scope this library targets, although
neither is actually that hard to understand.

Nothing is that hard to understand, once you understand it.

I believe
you are probably correct about it not being too terribly hard to
understand. I do believe it has some unfortunate features which make it
more difficult than it needs to be. This includes cryptic names,
counterintuitive control mechanisms, and, at least one uncommon use of a
commonly used term. That term being "binary".

The first thing to note about C++ I/O (which also applies to C I/O) is
that it is very vague about things like standard input, standard output,
and files. There is a simple reason for this, though: the exact details
differ widely between different platforms and thus a definition of what
a file is and what kind of properties it has are essentially side
stepped.

I can see where trying to specify a generic file descriptor could be
problematic. However, I can't imagine it poses anything close to the
variability that locales pose. Perhaps it really is something best left
outside of the C++ Standard. My first inclination was to suggest the
Standard could provide some kind of extensible framework, and use POSIX as
a default implementation. But, then that imposes a requirement on the
implementation which may not be easy to meet.

All that is assumed is that files have names

I wonder if there is a place in the C++ Standard Library for something
similar to JNDI.

(though no detail
is given about what valid names look like) and consist of sequences of bytes
which can be read or written. When writing a file and reading it back
in later, the bytes read should be identical to those which have been
written if the files were opened in binary mode. If one or both of
these files were not opened in binary mode, the bytes read may differ
from those written.

Hmmm. It seems that statement should come with some further qualification
about the role of formatting operators.
....
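
The kind of qualification meant here can be shown with a small sketch that needs no file at all: the same int goes into a stream once through the formatted operator<< and once through the unformatted write(). The function names as_text and as_raw are invented for the illustration.

```cpp
#include <sstream>
#include <string>

// Formatted insertion: the value is converted to its decimal text form.
std::string as_text(int value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Unformatted write(): the object representation is copied as it is,
// sizeof(int) chars in whatever byte order the platform uses.
std::string as_raw(int value)
{
    std::ostringstream out(std::ios::out | std::ios::binary);
    out.write(reinterpret_cast<const char*>(&value), sizeof value);
    return out.str();
}
```

For the value 258, as_text() yields the three characters "258", while as_raw() yields sizeof(int) chars whose order depends on the machine. Opening a file in binary mode changes neither result; it only guarantees that whatever chars reach the file buffer are not translated on the way to the file.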

The next thing to note about C++ I/O is that it addresses essentially
text formatted I/O. It is neither intended nor particularly well suited
for binary I/O although the latter can be done to some extent. This is
shown e.g by the fact that all I/O layers accessible to the user
operate in terms of characters, not in terms of bytes. Although the
normal character type ('char') is also often used to represent bytes,
it is important to keep these things mentally apart because the bytes
undergo a transformation to become characters, namely a locale specific
translation between bytes and characters. This is done at some lower
level in 'FILE*' or 'std::basic_filebuf'. In both cases it is guaranteed
that this conversion is the identity conversion when using the "C"
locale. That is, to get an unchanged sequence of bytes useful for some
form of binary I/O you would need to open the file in binary mode and
suppress any locale specific character transformation by using the "C"
locale.

I believe, by default, all my code will use the "C" locale unless I
specifically change it. I recall reading that using wchar_t instead of
char can improve performance for unformatted binary I/O, and I assume the
same qualifications hold for locale in that situation as well. No matter
whether I use char or wchar_t I have no guarantee as to the exact size nor
signedness of the objects of that type. I'm not sure what that means in
terms of reading or writing a "binary" file such as an ELF file which has
some embedded text.
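
To make the ELF case concrete, here is a sketch of checking the four-byte ELF magic number (0x7F followed by 'E', 'L', 'F') read through a plain char stream. The function name looks_like_elf is made up. The signedness question is side-stepped by converting each char to unsigned char before comparing, so bytes of 0x80 and above do not turn into negative values.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Return true if the next four chars of the stream are the ELF magic
// number. The stream is expected to have been opened in binary mode.
bool looks_like_elf(std::istream& in)
{
    char raw[4];
    if (!in.read(raw, 4)) return false;  // fewer than four chars available
    const unsigned char magic[4] = {0x7F, 0x45, 0x4C, 0x46};  // 0x7F 'E' 'L' 'F'
    for (int i = 0; i < 4; ++i)
        if (static_cast<unsigned char>(raw[i]) != magic[i]) return false;
    return true;
}
```

The embedded text in such a file can be handled the same way: read the chars unformatted, then interpret the ones known to be text as text.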

In particular the stream level hierarchy only makes sense for text
formatted I/O. Although it provides "unformatted" operations,
interaction between these and the formatted operations is a little bit
tricky (in particular, the formatted operations skip e.g. white space
characters while the unformatted do not; this is often a stumbling
block when switching between formatted and unformatted operations).
When you want to operate on binary files, it is much more reasonable to
use stream buffers directly and provide a similar hierarchy to the
streams for reading a particular binary format.

This is the other part of what you said was not that hard to understand
about the library. It isn't that difficult to understand. I guess the
idea of providing a general stream template for arbitrary integral types
has too many complications, such as what happens if data of type char is
put into a stream of type unsigned long one unit at a time. Should it be
packed so that it uses optimal space? etc.

Trying to work with C-style I/O in conjunction with C++ I/O seems rather
difficult. For instance, if I have a library that uses printf(), I don't
believe I can get it to write to a C++ stream. I did come across this:
<quote>
http://www.channel1.com/users/bobwb/cppnotes/lec08.htm
8.8 Mixing in C I/O

C I/O functions can be intermixed with C++ I/O portably on a line by line
basis. To use a C++ file in a C context, an ostream or istream must be
constructed with a stdiobuf (a type of streambuf - header is stdiostream.h)
which is in turn constructed from a FILE *.

Example:
FILE *fp = fopen("test.dat", "r+");
stdiobuf buf(fp);
ostream str(buf);
</quote>

And this:
http://www.boost.org/libs/format/index.html

<quote>
Boost Format library

The format library provides a class for formatting arguments according to a
format-string, as does printf, but with two major differences :

* format sends the arguments to an internal stream, and so is entirely
type-safe and naturally supports all user-defined types.
* The ellipsis (...) can not be used correctly in the strongly typed
context of format, and thus the function call with arbitrary arguments is
replaced by successive calls to an argument feeding operator%

</quote>
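
Short of either approach quoted above, printf-style formatting can at least be routed into a C++ stream using only the C and C++ standard libraries. The sketch below assumes a C++11 compiler (for vsnprintf and va_copy); stream_printf is a hypothetical helper name. Note that it does not redirect the output of an existing library that calls printf() itself; it only helps for code that can be changed to call the helper.

```cpp
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical helper: format the arguments printf-style and write the
// result to a C++ output stream. Unlike Boost.Format this is not type-safe;
// a format string that does not match its arguments is undefined behavior.
void stream_printf(std::ostream& out, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);

    // First pass on a copy of the argument list: find the required length.
    std::va_list copy;
    va_copy(copy, args);
    int needed = std::vsnprintf(nullptr, 0, format, copy);
    va_end(copy);

    if (needed >= 0) {
        // Second pass: format into a buffer with room for the terminator.
        std::vector<char> buffer(static_cast<std::size_t>(needed) + 1);
        std::vsnprintf(buffer.data(), buffer.size(), format, args);
        out.write(buffer.data(), needed);
    }
    va_end(args);
}
```

A call such as stream_printf(std::cout, "%d-%s", 42, "x") then writes through whatever stream buffer the ostream is attached to.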

Steven T. Hatton · Aug 1, 2005

P.J. Plauger said:
[...]

I've been
told I can set the stream to binary and use the extractor functions if I
unset the formatting bits. I've also been told that this will not work
(and I believe there could be performance issues with this as well).

No, you were told to use *certain* extractors. Most are geared
toward formatted I/O and are hence irrelevant.

When I use the term "extractor" I am specifically talking about the
overloaded shift operator. That is the only place I've seen the term used
in conjunction with C++ I/O. I /have/ been told 'don't use the overloaded
shift operators because they are for formatted data', with no qualification
added. I *believe* I can use basic_istream<char> &
operator>>(basic_streambuf<char> *sb) to process unformatted binary data.
But I don't know all the consequences, or requirements of using it. This
program works to read in a file and output it. I ran it on the executable
produced by compiling it, and the result was a file of the same size which
also executed successfully. Furthermore diff indicated the input and
output are identical.

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

using namespace std;

int main(int argc, char* argv[])
{
    if (argc != 2) { cerr << "test filename" << endl; return -1; }

    string filename(argv[1]);

    // Open the input in binary mode so no newline translation is applied.
    ifstream inf(filename.c_str(), ios::binary);
    if (!inf) { cerr << "can't open " << filename << endl; return -1; }
    inf.unsetf(ios::skipws);

    // Copy the whole file into the string stream via its stream buffer.
    stringstream ss(ios::binary | ios::in | ios::out);
    ss.unsetf(ios::skipws);
    ss << inf.rdbuf();
    cout << "read " << ss.str().size() << " bytes of " << filename << endl;

    filename += ".tmp";

    // Write the buffered characters back out, again in binary mode.
    ofstream off(filename.c_str(), ios::binary);
    if (!off) { cerr << "can't open " << filename << endl; return -1; }
    off.unsetf(ios::skipws);
    off << ss.rdbuf();

    return 0;
}

Note that the program worked as stated above, even without the ios::binary
and unsetf(ios::skipws). Are these needed?

Can I safely assume the form used in the program will process the data
unaltered? That is, no transformation of CR/LF, etc.
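
The skipws half of the question can be probed without any file. In the sketch below (copy_via_rdbuf is a made-up name) the same data is pushed through operator<<(streambuf*) once with skipws set and once with it cleared. Since skipws is only consulted by formatted input, that is by operator>>, the flag should make no difference to this form of copy.

```cpp
#include <sstream>
#include <string>

// Copy `input` through operator<<(std::streambuf*) with skipws either set
// or cleared on both streams, and return what arrived on the other side.
std::string copy_via_rdbuf(const std::string& input, bool skipws_set)
{
    std::istringstream in(input);
    std::ostringstream out;
    if (skipws_set) {
        in.setf(std::ios::skipws);
        out.setf(std::ios::skipws);
    } else {
        in.unsetf(std::ios::skipws);
        out.unsetf(std::ios::skipws);
    }
    out << in.rdbuf();  // characters are transferred as they are, unparsed
    return out.str();
}
```

The ios::binary half cannot be settled this way, because the flag matters only when a file is opened, and only on platforms whose text mode differs from binary mode. On POSIX systems the two modes behave the same, which would explain the program working without the flag there, but that observation says nothing about other platforms.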

What you believe
about performance is uninteresting until you make measurements
to support it.

My suspicion may be unfounded since I don't fully understand the mechanisms
involved.

(Despite Dietmar's comments, I did not take personal offense at this
statement. My contributions to the design of the Standard C++ library
have been very small. I'm more of an implementer and explainer. But
the statement certainly shows a lack of appreciation for the many
real, and sometimes highly inventive, contributions made by others
to the library. That, to me, reflects arrogance, stupidity, or both.)

That is an absurd statement. My comments in no way provide any indication
of my assessment of the amount of work that has gone into the library that
currently exists. There are many features that could be in the library
that aren't. There are many possible explanations for these features not
to be there. I was merely speculating as to one possible reason why they
are not included.

There is also another possible reason why many people would withhold valuable
contributions to the C++ Standard Library.
[...]
I am not the subject of this newsgroup. Please see the FAQ for information
regarding the intended content of the newsgroup and proper etiquette when
posting. You can find a copy of the FAQ here:

http://www.parashift.com/c++-faq-lite/
