Alf said:
* Steven T. Hatton:
Basically it means that the i/o functions should not do any translation
to/from external representation of the data: they should behave as they
really should have behaved by default, but unfortunately do not.
For example, a Windows text file has each line terminated by carriage
return + linefeed, as opposed to just linefeed in Unix, and in text mode
'\n' is translated accordingly for output, and these sequences are
translated to '\n' on input -- in practice '\r' is carriage return and
'\n' is linefeed.
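To make the text/binary distinction concrete, here is a minimal sketch that writes a few bytes with std::ios::binary and reads them back the same way. The helper name round_trip_binary and the scratch file name are made up for illustration; the only claim is that in binary mode the bytes, including "\r\n" and Ctrl Z (0x1A), should come back unchanged.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: write `bytes` to `path` untranslated, then read them
// back untranslated. `path` is a scratch file name chosen by the caller.
std::string round_trip_binary(const std::string& path, const std::string& bytes) {
    {
        std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
        out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    }   // leaving the scope closes the stream and flushes it to disk

    std::ifstream in(path.c_str(), std::ios::binary);
    // istreambuf_iterator reads raw characters from the buffer, no formatting.
    std::string result((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
    in.close();
    std::remove(path.c_str());  // clean up the scratch file
    return result;
}
```

Dropping std::ios::binary from both streams is where a platform with text translation would be free to alter the "\r\n" and 0x1A bytes; on Unix the two modes behave the same.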
Yes. I am aware of DOS spiders. ^M is what it looks like in Emacs, and it
is never a nice thing to have in a tarball. dos2unix is a great tool.
Also, for example, in a Windows text file Ctrl Z denotes end-of-file.
That's useful for including a short descriptive text snippet at the start
of a binary file, but I suspect it was originally a misunderstanding of
the Unix shell command to send the current line immediately (which for an
empty line means zero bytes, which in Unix indicates end-of-file). So in text
mode in Windows, a Ctrl Z might be translated to end-of-file on input.
Interestingly the C++ iostream utilities are so extremely badly designed
that you can't make a simple
copy-standard-input-to-standard-output-exactly program using only the
standard C++ library, on systems where this is meaningful but text
translation occurs in text mode.
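For reference, the usual attempt at such a copy program looks something like the sketch below. The function name copy_stream is made up here. It copies through the stream buffers with no formatted extraction, so the library itself skips nothing; whether the bytes survive exactly still depends on the mode the underlying streams were opened in, which is the complaint above.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Hypothetical helper: copy everything from `in` to `out` via the stream
// buffer. No operator>> on characters is involved, so no whitespace is lost.
// Note: if `in` has nothing to read, inserting its buffer sets failbit on `out`.
void copy_stream(std::istream& in, std::ostream& out) {
    out << in.rdbuf();
}
```

Called as copy_stream(std::cin, std::cout), this is exact only where the standard streams do no text translation; the standard library offers no portable way to reopen them in binary mode.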
I'm now wondering if I really understood. If I read "characters" from a
std::istream, it goes into an error state when it hits an EOF. That's why
stuff like this works (when it works):
std::vector<float_pair> positions
(istream_iterator<float_pair> (file),
(istream_iterator<float_pair> ()));
I don't believe binary files are terminated by a special character, but I
could be wrong (again).
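A small sketch of the istream_iterator idiom just mentioned, run on a string stream instead of a file. The float_pair struct and its operator>> are stand-ins made up here, since the real type is not shown. The loop ends when an extraction fails, which at end of input leaves eofbit (and failbit) set on the stream.

```cpp
#include <cassert>
#include <istream>
#include <iterator>
#include <sstream>
#include <vector>

// Hypothetical stand-in for the float_pair type mentioned above.
struct float_pair {
    float first;
    float second;
};

// Formatted extraction: two whitespace-separated floats.
std::istream& operator>>(std::istream& in, float_pair& p) {
    return in >> p.first >> p.second;
}

// Read pairs until an extraction fails (normally at end of input).
std::vector<float_pair> read_pairs(std::istream& in) {
    // The extra parentheses keep this from parsing as a function declaration.
    std::vector<float_pair> positions((std::istream_iterator<float_pair>(in)),
                                      std::istream_iterator<float_pair>());
    return positions;
}
```

Nothing here relies on a terminating character in the data: the end-of-sequence iterator compares equal once the stream stops being good().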
Take this example:
####################################
Thu Jul 28 06:03:39:> cat main.cpp
#include <algorithm>
#include <fstream>
#include <vector>
#include <iterator>
#include <iostream>
#include <sstream>
using namespace std;
int main(int argc, char* argv[]){
  if(argc<2) { cerr << "give me a file name" << endl; return -1; }
  ifstream file (argv[1],ios::binary);
  if(!file) { cerr << "couldn't open the file:"<< argv[1] << endl; return -1; }
  // First pass: read through istream_iterator (formatted extraction).
  std::vector<unsigned char> data;
  copy(istream_iterator<unsigned char>(file)
     , istream_iterator<unsigned char>()
     , back_inserter(data));
  cout<<"read "<<data.size()<<"bytes of data"<< endl;
  // Second pass: clear the error state, rewind, and copy the raw stream buffer.
  file.clear();
  file.seekg(0,ios::beg);
  ostringstream oss;
  oss << file.rdbuf();
  cout<<"read "<<oss.str().size()<<"bytes of data"<< endl;
}
Thu Jul 28 06:10:21:> g++ -obinio main.cpp
Thu Jul 28 06:13:04:> ./binio binio
read 22075bytes of data
read 22470bytes of data
##########################
Notice the second output is larger than the first.
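One likely explanation for the gap, sketched below: istream_iterator<unsigned char> reads with operator>>, and formatted extraction skips whitespace by default even when the file is opened with ios::binary. Copying rdbuf() (or using istreambuf_iterator) does no skipping. The two function names are made up for illustration; if this is what happened, the 395 missing bytes would be whitespace characters in the executable.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>

// Count characters via formatted extraction (operator>>), which skips
// whitespace unless the skipws flag has been cleared on the stream.
std::size_t count_formatted(std::istream& in) {
    return static_cast<std::size_t>(
        std::distance(std::istream_iterator<unsigned char>(in),
                      std::istream_iterator<unsigned char>()));
}

// Count characters straight from the stream buffer; nothing is skipped.
std::size_t count_unformatted(std::istream& in) {
    return static_cast<std::size_t>(
        std::distance(std::istreambuf_iterator<char>(in),
                      std::istreambuf_iterator<char>()));
}
```

On the string "a b\n\tc\r\n" the first function sees only the three letters while the second sees all eight bytes. Calling file.unsetf(std::ios::skipws) before the first pass should make the two counts agree.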
Of course, the religious C++'ers maintain that that shouldn't be possible
anyway because you can't do it on, say, a mobile phone, where C++ could be
used for something, but then they forget that i/o is there for a reason.
I suspect there are "political" reasons things turned out the way they did.
I really don't know how much of a performance hit it would be if certain
platforms had to do some extra endian shuffling. I do believe the lack of
real binary I/O in the Standard Library is an unexpected inconvenience.
Stroustrup bluntly states that binary I/O is beyond the scope of the C++
Standard, and beyond the scope of TC++PL(SE). §21.2.1
Here's an interesting observation:
compiled with gcc 3.3.5
-rwxr-xr-x 1 hattons users 40830 2005-07-28 06:22 binio-3.3.5
compiled with gcc 4.0.1
-rwxr-xr-x 1 hattons users 22470 2005-07-28 06:23 binio-4.0.1
And 4.0.1 produces (much) faster code as well.