streambuf in binary mode

Discussion in 'C++' started by smith4894@excite.com, Aug 23, 2006.

  1. Guest

    Hello all,

    I'm working on writing my own streambuf classes (to use in my custom
    ostream/istream classes that will handle reading/writing data to a
    mmap'd file).

    When writing to the mmap'd file, I essentially have a char buffer in my
    streambuf class that I register with setp(). On an overflow() call, I
    simply copy the contents of the buffer into the mmap'd file via
    memcpy().

    If I want to use this to write binary data via the streambuf classes,
    are there any special considerations I need to be aware of in my
    streambuf classes? Do I need to set any special flags to indicate that
    the data is in binary mode perhaps? Any special precautions I need to
    take, in overflow(int_type) for example?

    My setup seems to work with binary data *MOST* of the time; however,
    there are rare occasions when the data I read back is inconsistent
    with the data I wrote...

    Any advice or comments would be much appreciated.


    [ See http://www.gotw.ca/resources/clcm.htm for info about ]
    [ comp.lang.c++.moderated. First time posters: Do this! ]
     
    , Aug 23, 2006
    #1

  2. kanze Guest

    wrote:

    > I'm working on writing my own streambuf classes (to use in my
    > custom ostream/istream classes that will handle reading/writing
    > data to a mmap'd file).


    > When writing to the mmap'd file, I essentially have a char
    > buffer in my streambuf class that I register with
    > setp(). On an overflow() call, I simply copy the contents of
    > the buffer into the mmap'd file via memcpy().


    That's not what I understand by mmap'd. I'd set the pointers
    directly into the mmap'd file, and not use any additional
    buffer. (Note that, at least under Unix, an mmap'd file cannot
    grow. I have implemented a mmap'd streambuf in which overflow
    unmapped the file, increased its size with truncate, and then
    remapped it. Close could have truncated it to the last byte
    actually written, but that wasn't necessary in my context.)
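
    A minimal sketch of that approach, assuming a POSIX system. The class
    name mmap_outbuf, the doubling growth policy, the use of ftruncate()
    on the open descriptor, and the round_trip() helper are all
    assumptions made for illustration; this is not the implementation
    described above. The put area is the mapping itself, so no memcpy()
    from a separate buffer is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <ostream>
#include <streambuf>
#include <string>
#include <fcntl.h>     // open
#include <sys/mman.h>  // mmap, munmap
#include <unistd.h>    // ftruncate, close

// Hypothetical class; POSIX only, minimal error handling, files under 2 GB.
class mmap_outbuf : public std::streambuf
{
public:
    explicit mmap_outbuf(char const* path, std::size_t initial = 4096)
        : fd_(::open(path, O_RDWR | O_CREAT | O_TRUNC, 0644)), base_(nullptr), size_(0)
    {
        if (fd_ != -1)
            remap(initial, 0);
    }
    ~mmap_outbuf() override
    {
        if (base_ != nullptr) {
            std::size_t used = static_cast<std::size_t>(pptr() - pbase());
            ::munmap(base_, size_);
            // Trim the file to the last byte actually written.
            if (::ftruncate(fd_, static_cast<off_t>(used)) != 0) { /* ignored in this sketch */ }
        }
        if (fd_ != -1)
            ::close(fd_);
    }
    bool is_open() const { return base_ != nullptr; }

protected:
    // The put area is the mapping itself, so overflow means "the file is full".
    int_type overflow(int_type ch) override
    {
        if (base_ == nullptr)
            return traits_type::eof();
        std::size_t used = static_cast<std::size_t>(pptr() - pbase());
        ::munmap(base_, size_);
        if (!remap(size_ * 2, used))
            return traits_type::eof();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

private:
    // Grow the file to new_size, map it, and restore the write position.
    bool remap(std::size_t new_size, std::size_t used)
    {
        base_ = nullptr;
        setp(nullptr, nullptr);
        if (::ftruncate(fd_, static_cast<off_t>(new_size)) != 0)
            return false;
        void* p = ::mmap(nullptr, new_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd_, 0);
        if (p == MAP_FAILED)
            return false;
        base_ = static_cast<char*>(p);
        size_ = new_size;
        setp(base_, base_ + size_);
        pbump(static_cast<int>(used));
        return true;
    }

    int fd_;
    char* base_;
    std::size_t size_;
};

// Usage sketch: write through an ostream, then read the file back.
bool round_trip(char const* path, std::string const& data)
{
    {
        mmap_outbuf buf(path, 16);   // tiny initial size to force several remaps
        assert(buf.is_open());
        std::ostream out(&buf);
        out.write(data.data(), static_cast<std::streamsize>(data.size()));
        if (!out)
            return false;
    }                                // destructor trims the file and closes it
    std::ifstream in(path, std::ios::binary);
    std::string got((std::istreambuf_iterator<char>(in)), std::istreambuf_iterator<char>());
    return got == data;
}
```

    Unlike the version described above, this sketch does trim the file
    in the destructor, so the file ends at the last byte written.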

    > If I want to use this to write binary data via the streambuf
    > classes, are there any special considerations I need to be
    > aware of in my streambuf classes? Do I need to set any special
    > flags to indicate that the data is in binary mode perhaps? Any
    > special precautions I need to take, in overflow(int_type) for
    > example?


    The binary option may be defined in ios_base, but it has no
    meaning outside of std::basic_filebuf... or a user defined
    streambuf, if the user so wants. In practice, a mmap'd
    streambuf can only be used for binary files; it makes no sense
    otherwise. Under Unix, of course, you can ignore the
    distinction, because binary files and text files are identical.
    So it's really up to you what you want to do: if you're only
    targeting Unix machines, I'd just ignore the flag; if you also plan
    to port to Windows or some other OS, I'd check for it, and reject
    any open in which it isn't set.
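
    The check itself could look something like the following sketch. The
    class name binary_only_buf and its open() signature are hypothetical
    (modeled loosely on the std::filebuf::open convention of returning a
    null pointer on failure), and the actual mapping code is left out.

```cpp
#include <cassert>
#include <ios>
#include <streambuf>

// Hypothetical skeleton: only the mode check is shown, the mapping code is omitted.
class binary_only_buf : public std::streambuf
{
public:
    binary_only_buf() : open_(false) {}

    // Same convention as std::filebuf::open: this on success, null on failure.
    binary_only_buf* open(char const* /*path*/, std::ios_base::openmode mode)
    {
        if ((mode & std::ios_base::binary) != std::ios_base::binary)
            return nullptr;              // text mode requested: reject
        // A real implementation would open and map the file here.
        open_ = true;
        return this;
    }
    bool is_open() const { return open_; }

private:
    bool open_;
};

void demo()
{
    binary_only_buf buf;
    assert(buf.open("data.bin", std::ios_base::out) == nullptr);
    assert(buf.open("data.bin", std::ios_base::out | std::ios_base::binary) == &buf);
}
```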

    > My setup seems to work with binary data *MOST* of the time;
    > however, there are rare occasions when the data I read back is
    > inconsistent with the data I wrote...


    Not knowing your setup, nor even what OS you are using, it's
    hard to say. As I said, mmap'd IO is inherently binary. If you
    write mmap'd, and read through a filebuf opened in text mode, or
    vice versa, you will have inconsistencies under most OS's.

    --
    James Kanze GABI Software
    Conseils en informatique orientée objet/
    Beratung in objektorientierter Datenverarbeitung
    9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34


     
    kanze, Aug 23, 2006
    #2

  3. wrote:

    > I'm working on writing my own streambuf classes (to use in my custom
    > ostream/istream classes that will handle reading/writing data to a
    > mmap'd file).


    You are doing it with streambuf because you use iostream formatted
    input/output (<<,>>), don't you?

    If you don't, you could use a much simpler interface and avoid the
    complexity of implementing std::streambuf. Something like this:

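    #include <cstddef>      // std::size_t
    #include <sys/types.h>  // ssize_t (POSIX)

    // Minimal byte-oriented interface; signatures mirror POSIX read()/write().
    struct stream
    {
        virtual ~stream() {}
        virtual ssize_t read(void*, std::size_t) = 0;
        virtual ssize_t write(void const*, std::size_t) = 0;
    };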


     
    Maxim Yegorushkin, Aug 23, 2006
    #3
  4. Ron Natalie Guest

    wrote:

    > If I want to use this to write binary data via the streambuf classes,
    > are there any special considerations I need to be aware of in my
    > streambuf classes? Do I need to set any special flags to indicate that
    > the data is in binary mode perhaps? Any special precautions I need to
    > take, in overflow(int_type) for example?
    >

    You do know the difference between "binary" mode in an iostream
    and "formatted" I/O?

    All the binary flag does on the stream is turn off whatever line
    end processing might be taking place (on Windows, \r\n -> \n
    conversion or vice versa). Formatted refers to using the
    functions like << and >> that convert the textual representation
    to and from the operand types. Putting the stream in binary
    mode doesn't change that. To write "binary representations" of
    these things, you use the unformatted I/O functions read
    and write, which just transfer a specified number of characters
    to/from the stream.
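
    To make the formatted/unformatted distinction concrete, here is a
    small sketch using std::ostringstream. The function names formatted,
    unformatted and demo are made up for the example. Note that the
    bytes produced by write() depend on the platform's byte order, so
    only their count is checked.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Formatted output: operator<< converts the value to its textual form.
std::string formatted(std::uint32_t v)
{
    std::ostringstream out;
    out << v;
    return out.str();
}

// Unformatted output: write() copies the object's bytes as they sit in memory.
std::string unformatted(std::uint32_t v)
{
    std::ostringstream out;
    out.write(reinterpret_cast<char const*>(&v), sizeof v);
    return out.str();
}

void demo()
{
    assert(formatted(1000000) == "1000000");                    // seven digit characters
    assert(unformatted(1000000).size() == sizeof(std::uint32_t)); // raw bytes, no digits
}
```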

    All that being said, it is handled in the iostream base classes
    and is totally transparent to the stream buffers. The stream
    buffers just see a certain number of charT characters that
    have already been formatted/new-line mapped.

     
    Ron Natalie, Aug 23, 2006
    #4
  5. kanze Guest

    Ron Natalie wrote:
    > wrote:


    > > If I want to use this to write binary data via the streambuf
    > > classes, are there any special considerations I need to
    > > be aware of in my streambuf classes? Do I need to set any
    > > special flags to indicate that the data is in binary mode
    > > perhaps? Any special precautions I need to take, in
    > > overflow(int_type) for example?


    > You do know the difference between "binary" mode in an
    > iostream and "formatted" I/O?


    > All the binary flag does on the stream is turn off whatever
    > line end processing might be taking place (on Windows, \r\n ->
    > \n conversion or vice versa). Formatted refers to using the
    > functions like << and >> that convert the textual
    > representation to and from the operand types. Putting the
    > stream in binary mode doesn't change that. To write "binary
    > representations" of these things, you use the unformatted I/O
    > functions read and write, which just transfer a specified
    > number of characters to/from the stream.


    > All that being said, it is handled in the iostream base
    > classes and is totally transparent to the stream buffers. The
    > stream buffers just see a certain number of charT characters
    > that have already been formatted/new-line mapped.


    No. The iostream base classes are totally unaware of the
    ios::binary flag, except that they declare it. In fact,
    ios::binary is purely a streambuf issue; in the standard
    library, the only class that uses it (other than to forward it)
    is std::basic_filebuf.
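
    One way to see this is to hang a user-defined streambuf under a
    plain std::ostream and look at what arrives. In the sketch below
    (capture_buf and demo are hypothetical names), the buffer receives
    '\n' exactly as the ostream was given it; any end-of-line
    translation would have to be done by the streambuf itself.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical streambuf that records every character handed to it.
class capture_buf : public std::streambuf
{
public:
    std::string const& data() const { return data_; }

protected:
    // No put area is set, so every character arrives through overflow().
    int_type overflow(int_type ch) override
    {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            data_.push_back(traits_type::to_char_type(ch));
        return traits_type::not_eof(ch);
    }

private:
    std::string data_;
};

void demo()
{
    capture_buf buf;
    std::ostream out(&buf);
    out << "a\n" << 42 << '\n';        // formatting happens in the ostream...
    assert(buf.data() == "a\n42\n");   // ...but '\n' reaches the buffer unmapped
}
```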

    Whether a user defined streambuf should use it or not depends on
    what it does, but I suspect that cases where it should are very
    rare. All it does is control the mapping between the file
    representation and the memory representation of end of file and
    end of line. There is already a streambuf class concerned with
    reading and writing system files: basic_filebuf, and I can't
    really think of a case where you would want another one.



     
    kanze, Aug 23, 2006
    #5
  6. kanze wrote:
    > Whether a user defined streambuf should use it [ios::binary]
    > or not depends on what it does, but I suspect that cases
    > where it should are very rare.

    Agreed.

    > All it does is control the mapping between the file
    > representation and the memory representation of end of file and
    > end of line. There is already a streambuf class concerned with
    > reading and writing system files: basic_filebuf, and I can't
    > really think of a case where you would want another one.

    Surely whenever the streambuf needs to map between an internal format,
    and an external binary format?

    Examples I can think of are:
    - writing multiline text into a windows text box (where you need \n
    -> \r\n conversion)
    - writing text to an HTTP GET request (which again I think needs \n ->
    CR, LF conversion)
    - writing text to some network protocol which needs lines terminated
    by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
    allows ios::text to be a no-op for filebuf).

    But, I agree it is rare.


     
    Martin Bonner, Aug 24, 2006
    #6
  7. Martin Bonner wrote:
    > kanze wrote:
    >> All it does is control the mapping between the file
    >> representation and the memory representation of end of file and
    >> end of line. There is already a streambuf class concerned with
    >> reading and writing system files: basic_filebuf, and I can't
    >> really think of a case where you would want another one.

    > Surely whenever the streambuf needs to map between an internal format,
    > and an external binary format?
    >
    > Examples I can think of are:
    > - writing multiline text into a windows text box (where you need \n
    > -> \r\n conversion)
    > - writing text to an HTTP GET request (which again I think needs \n ->
    > CR, LF conversion)
    > - writing text to some network protocol which needs lines terminated
    > by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
    > allows ios::text to be a no-op for filebuf).
    >
    > But, I agree it is rare.


    If all or most of the rare cases where a streambuf class other than
    basic_filebuf is needed are concerned with which character sequence to
    use for end-of-line, maybe we could make it a parameter for a common
    class so that we didn't have to reinvent the wheel for each case.
    Has there been any discussion or proposal in the past?

    --
    Seungbeom Kim

     
    Seungbeom Kim, Aug 24, 2006
    #7
  8. kanze Guest

    Martin Bonner wrote:
    > kanze wrote:
    > > Whether a user defined streambuf should use it [ios::binary]
    > > or not depends on what it does, but I suspect that cases
    > > where it should are very rare.


    > Agreed.


    > > All it does is control the mapping between the file
    > > representation and the memory representation of end of file and
    > > end of line. There is already a streambuf class concerned with
    > > reading and writing system files: basic_filebuf, and I can't
    > > really think of a case where you would want another one.


    > Surely whenever the streambuf needs to map between an internal
    > format, and an external binary format?


    You mean between the internal text format (stream of characters,
    with end of line indicated by the character '\n') and an
    external text format. The role of the binary flag is to turn
    off a "default" mapping. (Sort of---it doesn't turn off the
    locale specific mapping in filebuf, which makes its actual
    semantics rather vague.)

    Note that at present, it is *only* used in filebuf and the
    [io]fstream; it is not used in the basic iostream idioms. This
    means that anyone using it is aware of the derived type (filebuf
    or the [io]fstream decorators). If you design a new streambuf
    type, which needs different modes, it's up to you whether you
    reuse std::ios::binary, or define your own mode options. In
    general, I think I'd use std::ios::binary if the default mode
    corresponded to some sort of text mapping (say converting lines
    into separate records), and the other mode were something more
    or less transparent.

    > Examples I can think of are:


    > - writing multiline text into a windows text box (where you need \n
    > -> \r\n conversion)


    Text formatting, in sum. But do you ever want to provide the
    transparent mode?

    > - writing text to an HTTP GET request (which again I think needs \n ->
    > CR, LF conversion)


    At a lower level. HTTP (application layer) is based on Internet
    ASCII (presentation layer), at least in the header. In this
    case, you do need the two modes, *but* you need to change them
    dynamically---one mode for the header and other text data, and
    the other for binary data.

    Arguably, you might want a different mode for every filetype
    handled. And of course, you'd want to ensure standard ASCII for
    the header, but an encoding specified in the header for the
    remaining text. Except that if the remaining text is HTML---a
    relatively frequent case---it's also possible that the encoding
    be specified in the <head>...</head> section of the document.

    There are different ways of handling this, but an on/off switch
    when opening the file isn't sufficient. (Of course, if all you
    want to handle is the GET command, then there is only a header,
    and you map '\n' to CRLF, without an option to not do so.)
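
    One possible shape for the dynamic variant is a filtering streambuf
    that forwards to another streambuf and whose translation can be
    toggled at any time. This is only a sketch: the names crlf_outbuf,
    translate() and demo() are invented for the example, and the header
    text is a placeholder. The filter keeps no put area of its own, so a
    mode change takes effect on the very next character.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical filtering streambuf: maps '\n' to CR LF while translation is on.
class crlf_outbuf : public std::streambuf
{
public:
    explicit crlf_outbuf(std::streambuf* dest) : dest_(dest), translate_(true) {}
    void translate(bool on) { translate_ = on; }   // switchable at any time

protected:
    int_type overflow(int_type ch) override
    {
        if (traits_type::eq_int_type(ch, traits_type::eof()))
            return traits_type::not_eof(ch);
        char c = traits_type::to_char_type(ch);
        if (translate_ && c == '\n'
            && traits_type::eq_int_type(dest_->sputc('\r'), traits_type::eof()))
            return traits_type::eof();
        return dest_->sputc(c);
    }
    int sync() override { return dest_->pubsync(); }

private:
    std::streambuf* dest_;
    bool translate_;
};

void demo()
{
    std::stringbuf sink;
    crlf_outbuf buf(&sink);
    std::ostream out(&buf);
    out << "Header: value\n\n";     // text part: '\n' becomes CR LF
    buf.translate(false);
    out.write("\x01\n\x02", 3);     // binary part: bytes pass through untouched
    assert(sink.str() == "Header: value\r\n\r\n" "\x01\n\x02");
}
```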

    > - writing text to some network protocol which needs lines terminated
    > by ASCII LF (some Mac compilers use(d) '\n'==ASCII CR because that
    > allows ios::text to be a no-op for filebuf).


    Again, either you don't want to support transparency, or you'll
    likely have to support changing modes dynamically.



     
    kanze, Aug 25, 2006
    #8
