streambuf in binary mode

smith4894

Hello all,

I'm working on writing my own streambuf classes (to use in my custom
ostream/istream classes that will handle reading/writing data to a
mmap'd file).

When writing to the mmap'd file, I essentially have a char buffer in my
streambuf class, that I'm registering with setp(). On an overflow()
call, I simply copy the contents of the buffer into the mmap'd file via
memcpy().

If I want to use this to write binary data via the streambuf classes,
are there any special considerations I need to be aware of in my
streambuf classes? Do I need to set any special flags to indicate that
the data is in binary mode perhaps? Any special precautions I need to
take, in overflow(int_type) for example?

My setup seems to work with binary data *MOST* of the time, however
there are rare occasions when there is inconsistency with the data I'm
writing and reading...

Any advice or comments would be much appreciated.


[ See http://www.gotw.ca/resources/clcm.htm for info about ]
[ comp.lang.c++.moderated. First time posters: Do this! ]
 
kanze

smith4894 said:
I'm working on writing my own streambuf classes (to use in my
custom ostream/istream classes that will handle reading/writing
data to a mmap'd file).
When writing to the mmap'd file, I essentially have a char
buffer in my streambuf class, that I'm registering with
setp(). On an overflow() call, I simply copy the contents of
the buffer into the mmap'd file via memcpy().

That's not what I understand by mmap'd. I'd set the pointers
directly into the mmap'd file, and not use any additional
buffer. (Note that, at least under Unix, an mmap'd file cannot
grow. I have implemented a mmap'd streambuf in which overflow
unmapped the file, increased its size with truncate, and then
remapped it. Close could have truncated it to the last byte
actually written, but that wasn't necessary in my context.)

smith4894 said:
If I want to use this to write binary data via the streambuf
classes, are there any special considerations I need to be
aware of in my streambuf classes? Do I need to set any special
flags to indicate that the data is in binary mode perhaps? Any
special precautions I need to take, in overflow(int_type) for
example?

The binary option may be defined in ios_base, but it has no
meaning outside of std::basic_filebuf... or a user defined
streambuf, if the user so wants. In practice, a mmap'd
streambuf can only be used for binary files; it makes no sense
otherwise. Under Unix, of course, you can ignore the
distinction, because binary files and text files are identical.
So it's really up to you what you want to do: if you're only
targeting Unix machines, I'd just ignore it; if you also plan
to port to Windows or some other OS, I'd verify it, and reject
any open in which it isn't set.
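
Verifying the flag and rejecting the open could be as small as the following. This is a sketch only; `require_binary` is a hypothetical helper name, and throwing std::invalid_argument is just one possible way to signal the rejection (a real open() would more likely return a null pointer or set failbit).

```cpp
#include <ios>
#include <stdexcept>

// Hypothetical check for a user-defined mmap'd streambuf's open():
// refuse any mode that does not include std::ios_base::binary.
inline std::ios_base::openmode require_binary(std::ios_base::openmode mode) {
    if (!(mode & std::ios_base::binary))
        throw std::invalid_argument(
            "mmap'd stream must be opened with std::ios_base::binary");
    return mode;
}
```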

smith4894 said:
My setup seems to work with binary data *MOST* of the time,
however there are rare occasions when there is inconsistency
with the data I'm writing and reading...

Not knowing your setup, nor even what OS you are using, it's
hard to say. As I said, mmap'd IO is inherently binary. If you
write mmap'd, and read through a filebuf opened in text mode, or
vice versa, you will have inconsistencies under most OS's.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Maxim Yegorushkin

smith4894 said:
I'm working on writing my own streambuf classes (to use in my custom
ostream/istream classes that will handle reading/writing data to a
mmap'd file).

You are doing it with streambuf because you use iostream formatted
input/output (<<,>>), don't you?

If you don't, you could use a much simpler interface, so that you don't
have to deal with all the complexity of implementing std::streambuf.
Something like this:

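#include <sys/types.h>  // ssize_t, size_t (POSIX)

struct stream
{
    virtual ~stream() {}  // allow deletion through a base pointer
    virtual ssize_t read(void*, size_t) = 0;
    virtual ssize_t write(void const*, size_t) = 0;
};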


Ron Natalie

smith4894 said:
If I want to use this to write binary data via the streambuf classes,
are there any special considerations I need to be aware of in my
streambuf classes? Do I need to set any special flags to indicate that
the data is in binary mode perhaps? Any special precautions I need to
take, in overflow(int_type) for example?

You do know the difference between "binary" mode in an iostream
and "formatted"?

All the binary flag does on the stream is turn off whatever line
end processing might be taking place (on Windows, \r\n -> \n
conversion or vice versa). Formatted refers to using the
functions like << and >> that convert the textual representation
to and from the operand types. Putting the stream in binary
mode doesn't change that. To write "binary representations" of
these things, you use the unformatted I/O functions read
and write, which just transfer a specified number of characters
to/from the stream.

All that being said, it is handled in the iostream base classes
and is totally transparent to the stream buffers. The stream
buffers just see a certain number of charT characters that
have already been formatted/new-line mapped.

kanze

Ron said:
You do know the difference between "binary" mode in an
iostream and "formatted"?
All the binary flag does on the stream is turn off whatever
line end processing might be taking place (on Windows, \r\n ->
\n conversion or vice versa). Formatted refers to using the
functions like << and >> that convert the textual
representation to and from the operand types. Putting the
stream in binary mode doesn't change that. To write "binary
representations" of these things, you use the unformatted I/O
functions read and write, which just transfer a specified number
of characters to/from the stream.
All that being said, it is handled in the iostream base
classes and is totally transparent to the stream buffers. The
stream buffers just see a certain number of charT characters
that have already been formatted/new-line mapped.

No. The iostream base classes are totally unaware of the
ios::binary flag, except that they declare it. In fact,
ios::binary is purely a streambuf issue; in the standard
library, the only class that uses it (other than to forward it)
is std::basic_filebuf.

Whether a user defined streambuf should use it or not depends on
what it does, but I suspect that cases where it should are very
rare. All it does is control the mapping between the file
representation and the memory representation of end of file and
end of line. There is already a streambuf class concerned with
reading and writing system files: basic_filebuf, and I can't
really think of a case where you would want another one.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


Martin Bonner

kanze said:
Whether a user defined streambuf should use it [ios::binary]
or not depends on what it does, but I suspect that cases
where it should are very rare.

Agreed.

kanze said:
All it does is control the mapping between the file
representation and the memory representation of end of file and
end of line. There is already a streambuf class concerned with
reading and writing system files: basic_filebuf, and I can't
really think of a case where you would want another one.

Surely whenever the streambuf needs to map between an internal format
and an external binary format?

Examples I can think of are:
- writing multiline text into a Windows text box (where you need \n
  -> \r\n conversion)
- writing text to an HTTP GET request (which again I think needs \n ->
  CR, LF conversion)
- writing text to some network protocol which needs lines terminated
  by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
  allows text mode to be a no-op for filebuf).

But, I agree it is rare.


Seungbeom Kim

Martin said:
Surely whenever the streambuf needs to map between an internal format,
and an external binary format?

Examples I can think of are:
- writing multiline text into a Windows text box (where you need \n
  -> \r\n conversion)
- writing text to an HTTP GET request (which again I think needs \n ->
  CR, LF conversion)
- writing text to some network protocol which needs lines terminated
  by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
  allows text mode to be a no-op for filebuf).

But, I agree it is rare.

If all or most of the rare cases where a streambuf class other than
basic_filebuf is needed are concerned with which character sequence to
use for end-of-line, maybe we could make it a parameter for a common
class so that we didn't have to reinvent the wheel for each case.
Has there been any discussion or proposal in the past?
 
kanze

Martin said:
kanze said:
Whether a user defined streambuf should use it [ios::binary]
or not depends on what it does, but I suspect that cases
where it should are very rare.
All it does is control the mapping between the file
representation and the memory representation of end of file and
end of line. There is already a streambuf class concerned with
reading and writing system files: basic_filebuf, and I can't
really think of a case where you would want another one.
Surely whenever the streambuf needs to map between an internal
format, and an external binary format?

You mean between the internal text format (stream of characters,
with end of line indicated by the character '\n') and an
external text format. The role of the binary flag is to turn
off a "default" mapping. (Sort of---it doesn't turn off the
locale specific mapping in filebuf, which makes its actual
semantics rather vague.)

Note that at present, it is *only* used in filebuf and the
[io]fstream; it is not used in the basic iostream idioms. This
means that anyone using it is aware of the derived type (filebuf
or the [io]fstream decorators). If you design a new streambuf
type, which needs different modes, it's up to you whether you
reuse std::ios::binary, or define your own mode options. In
general, I think I'd use std::ios::binary if the default mode
corresponded to some sort of text mapping (say converting lines
into separate records), and the other mode were something more
or less transparent.

Martin said:
Examples I can think of are:
- writing multiline text into a Windows text box (where you need \n
  -> \r\n conversion)

Text formatting, in sum. But do you ever want to provide the
transparent mode?

Martin said:
- writing text to an HTTP GET request (which again I think needs \n ->
  CR, LF conversion)

At a lower level. HTTP (application layer) is based on Internet
ASCII (presentation layer), at least in the header. In this
case, you do need the two modes, *but* you need to change them
dynamically---one mode for the header and other text data, and
the other for binary data.

Arguably, you might want a different mode for every filetype
handled. And of course, you'd want to ensure standard ASCII for
the header, but an encoding specified in the header for the
remaining text. Except that if the remaining text is HTML---a
relatively frequent case---it's also possible that the encoding
be specified in the <head>...</head> section of the document.

There are different ways of handling this, but an on/off switch
when opening the file isn't sufficient. (Of course, if all you
want to handle is the GET command, then there is only a header,
and you map '\n' to CRLF, without an option to not do so.)

Martin said:
- writing text to some network protocol which needs lines terminated
  by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
  allows text mode to be a no-op for filebuf).

Again, either you don't want to support transparency, or you'll
likely have to support changing modes dynamically.

--
James Kanze GABI Software
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

