std::ios::binary?

S

Steven T. Hatton

§27.4.2.1.4 Type ios_base::eek:penmode
Says this about the std::ios::binary openmode flag:
*binary*: perform input and output in binary mode (as opposed to text mode)

And that is basically _all_ it says about it. What the heck does the binary
flag mean?
 
A

Alf P. Steinbach

* Steven T. Hatton:
§27.4.2.1.4 Type ios_base::eek:penmode
Says this about the std::ios::binary openmode flag:
*binary*: perform input and output in binary mode (as opposed to text mode)

And that is basically _all_ it says about it. What the heck does the binary
flag mean?

Basically it means that the i/o functions should not do any translation
to/from external representation of the data: they should behave as they
really should have behaved by default, but unfortunately do not.

For example, a Windows text file has each line terminated by carriage return
+ linefeed, as opposed to just linefeed in Unix, and in text mode '\n' is
translated accordingly for output, and these sequences are translated to
'\n' on input -- in practice '\r' is carriage return and '\n' is linefeed.

Also, for example, in a Windows text file Ctrl Z denotes end-of-file.
That's useful for including a short descriptive text snippet at the start of
a binary file, but I suspect it was originally a misunderstanding of the
Unix shell command to send the current line immediately (which for an empty
line means zero bytes, which in Unix indicates end-of-file). So in text
mode in Windows, a Ctrl Z might be translated to end-of-file on input.

Interestingly the C++ iostream utilities are so extremely badly designed
that you can't make a simple copy-standard-input-to-standard-output-exactly
program using only the standard C++ library, on systems where this is
meaningful but text translation occurs in text mode.

Of course, the religious C++'ers maintain that that shouldn't be possible
anyway because you can't do it on, say, a mobile phone, where C++ could be
used for something, but then they forget that i/o is there for a reason.
 
G

Gianni Mariani

Steven said:
§27.4.2.1.4 Type ios_base::eek:penmode
Says this about the std::ios::binary openmode flag:
*binary*: perform input and output in binary mode (as opposed to text mode)

And that is basically _all_ it says about it. What the heck does the binary
flag mean?

On some platforms "\n" translates to "\r\n" when written (and the
reverse when read) to a file while on others it does not. binary mode
will make no such translation.
 
S

Steven T. Hatton

Gianni said:
On some platforms "\n" translates to "\r\n" when written (and the
reverse when read) to a file while on others it does not. binary mode
will make no such translation.

Is that per the Standard, or "implementation imposed"? I know that what you
describe is what I typically think of when I think of binary I/O. For
example, in ancient times we used to have to explicitly tell ftp to use
binary mode transfer.
 
S

Steven T. Hatton

Alf said:
* Steven T. Hatton:

Basically it means that the i/o functions should not do any translation
to/from external representation of the data: they should behave as they
really should have behaved by default, but unfortunately do not.

For example, a Windows text file has each line terminated by carriage
return + linefeed, as opposed to just linefeed in Unix, and in text mode
'\n' is translated accordingly for output, and these sequences are
translated to
'\n' on input -- in practice '\r' is carriage return and '\n' is
linefeed.

Yes. I am aware of DOS spiders. ^M is what it looks like in Emacs, and it
is never a nice thing to have in a tarball. dos2unix is a great tool.
Also, for example, in a Windows text file Ctrl Z denotes end-of-file.
That's useful for including a short descriptive text snippet at the start
of a binary file, but I suspect it was originally a misunderstanding of
the Unix shell command to send the current line immediately (which for an
empty
line means zero bytes, which in Unix indicates end-of-file). So in text
mode in Windows, a Ctrl Z might be translated to end-of-file on input.

Interestingly the C++ iostream utilities are so extremely badly designed
that you can't make a simple
copy-standard-input-to-standard-output-exactly program using only the
standard C++ library, on systems where this is meaningful but text
translation occurs in text mode.

I'm now wondering if I really understood. If I read "characters" from a
std::istream, it goes into an error state when it hits an EOF. That's why
stuff like this works (when it works)

std::vector<float_pair> positions
(istream_iterator<float_pair> (file),
(istream_iterator<float_pair> ()));

I don't believe binary files are terminated by a special character, but I
could be wrong (again).

Take this example:
####################################
Thu Jul 28 06:03:39:> cat main.cpp
#include <fstream>
#include <vector>
#include <iterator>
#include <iostream>
#include <sstream>

using namespace std;


main(int argc, char* argv[]){
if(argc<2) { cerr << "give me a file name" << endl; return -1; }

ifstream file (argv[1],ios::binary);

if(!file) { cerr << "couldn't open the file:"<< argv[1] << endl; return
-1; }

std::vector<unsigned char> data;

copy(istream_iterator<unsigned char>(file)
, istream_iterator<unsigned char>()
, back_inserter(data));

cout<<"read "<<data.size()<<"bytes of data"<< endl;

file.clear();
file.seekg(0,ios::beg);
ostringstream oss;
oss << file.rdbuf();
cout<<"read "<<oss.str().size()<<"bytes of data"<< endl;

}

Thu Jul 28 06:10:21:> g++ -obinio main.cpp

Thu Jul 28 06:13:04:> ./binio binio
read 22075bytes of data
read 22470bytes of data

##########################

Notice the second output is larger than the first.
Of course, the religious C++'ers maintain that that shouldn't be possible
anyway because you can't do it on, say, a mobile phone, where C++ could be
used for something, but then they forget that i/o is there for a reason.

I suspect there are "political" reasons things turned out the way they did.
I really don't know how much of a performance hit it would be if certain
platforms had to do some extra endian shuffling. I do believe the lack of
real binary I/O in the Standard Library is an unexpected inconvenience.
Stroustrup bluntly states that binary I/O is beyond the scope of C++
Standard, and beyond the scope of TC++PL(SE). §21.2.1

Here's an interesting observation:
compiled with gcc 3.3.5
-rwxr-xr-x 1 hattons users 40830 2005-07-28 06:22 binio-3.3.5
compiled with gcc 4.0.1
-rwxr-xr-x 1 hattons users 22470 2005-07-28 06:23 binio-4.0.1

And 4.0.1 produces (much) faster code as well.
 
A

Alf P. Steinbach

* Steven T. Hatton:
* Alf P. Steinbach:
[snip]
For example, a Windows text file has each line terminated by carriage
return + linefeed, as opposed to just linefeed in Unix, and in text mode
'\n' is translated accordingly for output, and these sequences are
translated to
'\n' on input -- in practice '\r' is carriage return and '\n' is
linefeed.

Yes. I am aware of DOS spiders. ^M is what it looks like in Emacs,

That's because a carriage return is ASCII 13, Ctrl M.

It would be different using EBCDIC, I imagine.

;-)


[snip]
I'm now wondering if I really understood. If I read "characters" from a
std::istream, it goes into an error state when it hits an EOF. That's why
stuff like this works (when it works)

std::vector<float_pair> positions
(istream_iterator<float_pair> (file),
(istream_iterator<float_pair> ()));

I don't believe binary files are terminated by a special character, but I
could be wrong (again).

Not in Unix, and not in Windows. However a binary file can contain any byte
values, and those that look like end-of-line markers will be translated in
text mode, and the first that looks like an end-of-file marker may be
translated. All depending on the implementation.


[snip]
ifstream file (argv[1],ios::binary);

if(!file) { cerr << "couldn't open the file:"<< argv[1] << endl; return
-1; }

std::vector<unsigned char> data;

copy(istream_iterator<unsigned char>(file)
, istream_iterator<unsigned char>()
, back_inserter(data));

cout<<"read "<<data.size()<<"bytes of data"<< endl;

file.clear();
file.seekg(0,ios::beg);
ostringstream oss; [snip]

##########################

Notice the second output is larger than the first.

That's probably because the ostringstream does some text mode shenanigans;
although I haven't checked.

I suspect there are "political" reasons things turned out the way they did.
I really don't know how much of a performance hit it would be if certain
platforms had to do some extra endian shuffling.

? The _default_ is translation. That is, the default is the overhead &
performance hit (+ other much more evil effects) you're mentioning.

I do believe the lack of
real binary I/O in the Standard Library is an unexpected inconvenience.
Stroustrup bluntly states that binary I/O is beyond the scope of C++
Standard, and beyond the scope of TC++PL(SE). §21.2.1

There is no §21.2.1.
 
S

Steven T. Hatton

Alf said:
* Steven T. Hatton:

Not in Unix, and not in Windows. However a binary file can contain any
byte values, and those that look like end-of-line markers will be
translated in text mode, and the first that looks like an end-of-file
marker may be
translated. All depending on the implementation.

And they call it a "standard"? :/
[snip]
ifstream file (argv[1],ios::binary);

if(!file) { cerr << "couldn't open the file:"<< argv[1] << endl; return
-1; }

std::vector<unsigned char> data;

copy(istream_iterator<unsigned char>(file)
, istream_iterator<unsigned char>()
, back_inserter(data));

cout<<"read "<<data.size()<<"bytes of data"<< endl;

file.clear();
file.seekg(0,ios::beg);
ostringstream oss; [snip]

##########################

Notice the second output is larger than the first.

That's probably because the ostringstream does some text mode shenanigans;
although I haven't checked.

Sorry, I forgot one important point.
$ls -l binio-3.3.5
-rwxr-xr-x 1 hattons users 40830 2005-07-28 06:22 binio-3.3.5

###### note the file size above, and the second output value below:

$./binio-3.3.5 binio-3.3.5
read 40034bytes of data
read 40830bytes of data

As I understand things, when I do `file.rdbuf() >> oss' I am getting a raw
stream. That, too, may be implementation defined for all I know.

? The _default_ is translation. That is, the default is the overhead &
performance hit (+ other much more evil effects) you're mentioning.

So is it the case that ios::binary may still not produce real binary
streams? That is, the stream could still act in some ways like a text
stream, e.g., eof?
There is no §21.2.1.
TC++PL(SE).
 
P

P.J. Plauger

§27.4.2.1.4 Type ios_base::eek:penmode
Says this about the std::ios::binary openmode flag:
*binary*: perform input and output in binary mode (as opposed to text
mode)

And that is basically _all_ it says about it. What the heck does the
binary
flag mean?

"Binary" and "text" are both terms of art from the C Standard, which
is included by reference in the C++ Standard. Binary I/O is byte
transparent, with the possible exception of padding NUL bytes.
Text I/O endeavors to translate between internal newline-delimited
text lines and however the system commonly represents text outside
the program.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

* Steven T. Hatton:

Basically it means that the i/o functions should not do any translation
to/from external representation of the data: they should behave as they
really should have behaved by default, but unfortunately do not.

For example, a Windows text file has each line terminated by carriage
return
+ linefeed, as opposed to just linefeed in Unix, and in text mode '\n' is
translated accordingly for output, and these sequences are translated to
'\n' on input -- in practice '\r' is carriage return and '\n' is
linefeed.

Also, for example, in a Windows text file Ctrl Z denotes end-of-file.
That's useful for including a short descriptive text snippet at the start
of
a binary file, but I suspect it was originally a misunderstanding of the
Unix shell command to send the current line immediately (which for an
empty
line means zero bytes, which in Unix indicates end-of-file).

No, the usage originated in early systems that couldn't describe the
length of a file to the nearest byte. A CTL-Z delimited the logical
end of text, so your program didn't read trailing garbage.
So in text
mode in Windows, a Ctrl Z might be translated to end-of-file on input.
Interestingly the C++ iostream utilities are so extremely badly designed
that you can't make a simple
copy-standard-input-to-standard-output-exactly
program using only the standard C++ library, on systems where this is
meaningful but text translation occurs in text mode.

Well, yes you can. Open files in binary mode and use read/write.
Of course, the religious C++'ers maintain that that shouldn't be possible
anyway because you can't do it on, say, a mobile phone, where C++ could be
used for something, but then they forget that i/o is there for a reason.

Nonsense.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

Is that per the Standard, or "implementation imposed"?

The C and C++ Standards both require that text mode I/O do whatever
is necessary to convert between the universal internal form of text
streams and whatever the execution environment requires instead.
I know that what you
describe is what I typically think of when I think of binary I/O. For
example, in ancient times we used to have to explicitly tell ftp to use
binary mode transfer.

And you still do, sometimes, if your FTP utility can't make an
intelligent guess.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

I'm now wondering if I really understood. If I read "characters" from a
std::istream, it goes into an error state when it hits an EOF. That's why
stuff like this works (when it works)

std::vector<float_pair> positions
(istream_iterator<float_pair> (file),
(istream_iterator<float_pair> ()));

I don't believe binary files are terminated by a special character, but I
could be wrong (again).

Correct. But the I/O subsystem on every OS has *some* way to tell
you when you run out of input characters.
...

I suspect there are "political" reasons things turned out the way they
did.

Nonsense. In the early 1970s, Unix pioneered the notion of a universal
format for text streams, by pushing any needed mappings out to the
device drivers. That text stream model became an integral part of C,
which first evolved under Unix. In the late 1970s and early 1980s,
Whitesmiths, Ltd. ported C to several dozen operating systems. We
elaborated the text/binary I/O model as a way of preserving both the
universal text stream format and the transparent text stream, as
needed. All that technology was captured in the C Standard in the
mid 1980s. It was then incorporated by reference in the C++ Standard
in the mid 1990s. It's there because it works.
I really don't know how much of a performance hit it would be if certain
platforms had to do some extra endian shuffling. I do believe the lack of
real binary I/O in the Standard Library is an unexpected inconvenience.

Might be, if there were such a lack.
Stroustrup bluntly states that binary I/O is beyond the scope of C++
Standard, and beyond the scope of TC++PL(SE). §21.2.1

Stroustrup is not nearly as familiar with the Standard C++ library
as he is with the language he invented.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

And they call it a "standard"? :/

Yes they do. The C Standard describes how to impose order over a
diverse range of operating systems. You can describe practically
every car made today as having a steering wheel, an accelerator,
and a brake -- if you want to emphasize what's standard about
them. Or you can discuss at great length the different kinds of
linkages and braking systems -- if you want to emphasize how
they differ. Depends on your "political" goal, I suppose.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
P

P.J. Plauger

* P.J. Plauger:

That should be only a few lines; could you please present the code?

#include <fstream>

int main(int argc, char **argv)
{ // copy a file
if (2 < argc)
{ // copy argv[1] to argv[2] transparently
std::ifstream ifs(argv[1],
std::ios_base::in | std::ios_base::binary);
std::eek:fstream ofs(argv[2],
std::ios_base::eek:ut | std::ios_base::binary);

ofs << ifs.rdbuf();
}
return (0);
}

(I was wrong about needing read and write.)

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
A

Alf P. Steinbach

* P.J. Plauger:
* Alf P. Steinbach
* P.J. Plauger:

That should be only a few lines; could you please present the code?

#include <fstream>

int main(int argc, char **argv)
{ // copy a file
if (2 < argc)
{ // copy argv[1] to argv[2] transparently
std::ifstream ifs(argv[1],
std::ios_base::in | std::ios_base::binary);
std::eek:fstream ofs(argv[2],
std::ios_base::eek:ut | std::ios_base::binary);

ofs << ifs.rdbuf();
}
return (0);
}

(I was wrong about needing read and write.)

Thanks for the code.

Since I (naturally) don't use iostreams much -- and in fact it's been some
time since I did C++ development -- I learned a new idiom.

And you weren't wrong about read and write: it can be done that way.

What you were wrong about:

The above doesn't copy standard input to standard output. ;-)


Cheers,

- Alf
 
S

Steven T. Hatton

P.J. Plauger said:
* P.J. Plauger:

That should be only a few lines; could you please present the code?

#include <fstream>

int main(int argc, char **argv)
{ // copy a file
if (2 < argc)
{ // copy argv[1] to argv[2] transparently
std::ifstream ifs(argv[1],
std::ios_base::in | std::ios_base::binary);
std::eek:fstream ofs(argv[2],
std::ios_base::eek:ut | std::ios_base::binary);

ofs << ifs.rdbuf();
}
return (0);
}

(I was wrong about needing read and write.)

Josuttis provides a similar example.

What I would like to know is how to get in iterator over a buffer such as
std::basic_filebuf, so that I can do something like:


copy(istream_iterator<unsigned char>(file_buf.eback())
, istream_iterator<unsigned char>(file_buf.egptr())
, back_inserter(data));

Which I can't do without inheriting from the buffer. That makes binary I/O
seem inconsistent with the rest of the library.
 
S

Steven T. Hatton

Alf said:
Thanks for the code.

Since I (naturally) don't use iostreams much -- and in fact it's been
some
time since I did C++ development -- I learned a new idiom.

And you weren't wrong about read and write: it can be done that way.

What you were wrong about:

The above doesn't copy standard input to standard output. ;-)


Cheers,

- Alf

Dose this potentially modify the stream data? If so, where?

/* The following code example is taken from the book
* "The C++ Standard Library - A Tutorial and Reference"
* by Nicolai M. Josuttis, Addison-Wesley, 1999
*
* (C) Copyright Nicolai M. Josuttis 1999.
* Permission to copy, use, modify, sell and distribute this software
* is granted provided this copyright notice appears in all copies.
* This software is provided "as is" without express or implied
* warranty, and with no claim as to its suitability for any purpose.
*/
#include <iostream>

int main ()
{
// copy all standard input to standard output
std::cout << std::cin.rdbuf();
}
 
S

Steven T. Hatton

Steven said:
P.J. Plauger said:
* P.J. Plauger:

Interestingly the C++ iostream utilities are so extremely badly
designed
that you can't make a simple
copy-standard-input-to-standard-output-exactly
program using only the standard C++ library, on systems where this is
meaningful but text translation occurs in text mode.

Well, yes you can. Open files in binary mode and use read/write.

That should be only a few lines; could you please present the code?

#include <fstream>

int main(int argc, char **argv)
{ // copy a file
if (2 < argc)
{ // copy argv[1] to argv[2] transparently
std::ifstream ifs(argv[1],
std::ios_base::in | std::ios_base::binary);
std::eek:fstream ofs(argv[2],
std::ios_base::eek:ut | std::ios_base::binary);

ofs << ifs.rdbuf();
}
return (0);
}

(I was wrong about needing read and write.)

Josuttis provides a similar example.

What I would like to know is how to get in iterator over a buffer such as
std::basic_filebuf, so that I can do something like:


copy(istream_iterator<unsigned char>(file_buf.eback())
, istream_iterator<unsigned char>(file_buf.egptr())
, back_inserter(data));

Which I can't do without inheriting from the buffer. That makes binary
I/O seem inconsistent with the rest of the library.

Actually I believe that should be more like:

copy(file_buf.eback(), file_buf.egptr(), back_inserter(data));
 
A

Alf P. Steinbach

* Steven T. Hatton:
Dose this potentially modify the stream data?
Yes.


If so, where?

Well, I don't really care exactly where -- when you know the car has
square wheels it doesn't really matter exactly what prevents the motor from
starting and the door from opening, so I've never been interested in that.

/* The following code example is taken from the book
* "The C++ Standard Library - A Tutorial and Reference"
* by Nicolai M. Josuttis, Addison-Wesley, 1999
*
* (C) Copyright Nicolai M. Josuttis 1999.
* Permission to copy, use, modify, sell and distribute this software
* is granted provided this copyright notice appears in all copies.
* This software is provided "as is" without express or implied
* warranty, and with no claim as to its suitability for any purpose.
*/
#include <iostream>

int main ()
{
// copy all standard input to standard output
std::cout << std::cin.rdbuf();
}

With MSVC 7.1 under Windows XP Professional:

P:\> dir | find "exe"
28.07.2005 15:05 233 472 vc_project.exe

P:\> vc_project <vc_project.exe >x

P:\> dir | find "x"
28.07.2005 15:05 233 472 vc_project.exe
28.07.2005 15:09 4 484 x

P:\> _
 
S

Steven T. Hatton

Alf said:
* Steven T. Hatton:

Well, I don't really care exactly where -- when you know the car has
square wheels it doesn't really matter exactly what prevents the motor
from starting and the door from opening, so I've never been interested in
that.



With MSVC 7.1 under Windows XP Professional:

P:\> dir | find "exe"
28.07.2005 15:05 233 472 vc_project.exe

P:\> vc_project <vc_project.exe >x

P:\> dir | find "x"
28.07.2005 15:05 233 472 vc_project.exe
28.07.2005 15:09 4 484 x

P:\> _

Well, I _am_ interested because I am (was) under the impression that
std::cout.rdbuf() would give me raw data. but now, it seems as if it might
kill spiders, or something like that. The standard streams may actually be
bad examples since they are very OS specific.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,733
Messages
2,569,439
Members
44,829
Latest member
PIXThurman

Latest Threads

Top