Howto derive class from std::ifstream which counts newline characters

Discussion in 'C++' started by MSP, Sep 8, 2009.

  1. MSP

    MSP Guest

    Hello everybody,

    I am reading a text file (in a format created by myself), which
    contains inline-data maintained by a third party library. Say, there
    is a class, with some read-method as follows:

    #include <istream>

    class CForeignClass
    {
    public:
        bool Read(std::istream& is);
    };


    To read the file, I open a std::ifstream and whenever I encounter the
    foreign data, I simply hand the istream reference over to the foreign
    class.

    std::ifstream f(path);

    // read data owned by me
    ...

    if (bForeignDataEncountered)
    {
    // read foreign data
    CForeignClass foreignClass;
    foreignClass.Read(f);
    }

    // read more data owned by me


    While doing this, I would like to keep track of the number of lines
    read by the foreign class. Is there a portable or recommended way to
    do this by deriving my own class from std::ifstream and overriding
    some virtual method (or, similarly, by replacing the standard input
    buffer with a modified one)?

    Thanks in advance,
    Matthias
     
    MSP, Sep 8, 2009
    #1

  2. MSP

    Francesco Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    On Sep 8, 10:34 am, MSP <> wrote:
    > Hello everybody,
    >
    > I am reading a text file (in a format created by myself), which
    > contains inline-data maintained by a third party library. Say, there
    > is a class, with some read-method as follows:
    >
    > class CForeignClass
    > {
    > public:
    > bool Read(std::istream& is);
    > }
    >
    > To read the file, I open a std::ifstream and whenever I encounter the
    > foreign data, I simply hand the istream reference over to the foreign
    > class.
    >
    > std::ifstream f(path);
    >
    > // read data owned by me
    > ...
    >
    > if (bForeignDataEncountered)
    > {
    > // read foreign data
    > CForeignClass foreignClass;
    > foreignClass.Read(f);
    > }
    >
    > // read more data owned by me
    >
    > While doing this, I would like to keep track of the number of lines
    > read by the foreign class. Is there a portable or recommended way to
    > do this by deriving my own class from std::ifstream and overwriting
    > some virtual method (or similar by replacing the standard input buffer
    > by a modified one) ?


    Hi Matthias,
    I didn't try any of the following, I'm just reasoning about the issue
    you presented.

    I suppose you cannot modify the foreignClass::Read() method -
    otherwise you could make it return the number of lines read instead of
    that bool.

    But can you inspect its implementation to see how exactly it reads the
    data from istream?

    One solution could be to save the result of tellg() on that istream
    before calling foreignClass::Read(), then call tellg() again and take
    the difference of the two positions to learn how many characters were
    read. You would then scan that range yourself to count the newline
    characters, for example by seeking back, by reopening the file, or by
    keeping a copy of the data in memory.

    Doing the above - assuming it works - you'd have no need to derive
    from istream.
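
    A minimal sketch of the tellg() idea, not tested against OCC.
    ReadForeign() is a hypothetical stand-in for the foreign Read()
    method, and CountNewlines() is a helper name made up for this
    example. It assumes the stream is seekable and that the foreign
    reader did not leave it in a failed state, since tellg() returns -1
    in that case.

```cpp
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the foreign Read(): consumes two lines.
bool ReadForeign(std::istream& is)
{
    std::string line;
    return std::getline(is, line) && std::getline(is, line);
}

// Counts the '\n' characters in the stream range [first, last) and
// leaves the read position at last.
std::size_t CountNewlines(std::istream& is,
                          std::streampos first,
                          std::streampos last)
{
    is.clear();       // drop any eof/fail flags before seeking
    is.seekg(first);  // go back to where the foreign data started

    std::size_t lines = 0;
    std::streamoff remaining = last - first;
    char c;
    while (remaining-- > 0 && is.get(c))
    {
        if (c == '\n')
            ++lines;
    }
    return lines;
}
```

    Call tellg() right before and right after the foreign read and pass
    both positions to CountNewlines(). Note that the foreign data is
    still scanned a second time.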

    In the other case, if you were to ensure that any read on that istream
    keeps track of the lines read you should, I think, overload all read
    methods of istream.

    I'm going to do some testing for my very own curiosity.

    Hope the above 2 cents help anyway.

    Best regards,
    Francesco
     
    Francesco, Sep 8, 2009
    #2

  3. MSP

    MSP Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    Hi Francesco,

    first of all, thanks for the quick reply.

    > I suppose you cannot modify the foreignClass::Read() method -
    > otherwise you could make it return the number of lines read instead of
    > that bool.


    To be precise: the foreign library is OpenCASCADE, an open source
    modeller library (abbreviated OCC in what follows). So I have full
    source code. Theoretically, it would be possible for me to modify the
    OCC-source code, but practically, that's not an option.

    > But can you inspect its implementation to see how exactly it reads the
    > data from istream?


    Reading is done in the typical C++ style, using a lot of >> operators
    (both OCC-defined and standard library ones).

    > One solution could be to save the result of tellg() on that istream
    > before calling foreignClass::Read(), then call tellg() again, compute
    > the difference of those two pointers to understand how many characters
    > have been read and then parse that amount of data by yourself to count
    > newline characters - you'd have to reopen the file or create a copy of
    > it while in memory or something else like this.
    >
    > Doing the above - assuming it works - you'd have no need to overload
    > istream.


    I already considered this option. But the inline data is a 3D model,
    so it tends to be rather large. I would prefer not to scan this
    large chunk of data twice.

    >
    > In the other case, if you were to ensure that any read on that istream
    > keeps track of the lines read you should, I think, overload all read
    > methods of istream.


    Overloading all read methods is rather tedious and maybe error prone.
    Instead, I was whether there is some central low-level method deep
    down in the implementation of the stream (or the stream buffer), which
    I could overwrite. For example, a callback which is triggered whenever
    the stream buffer underflows and which filles the buffer with the new
    data from the file.

    I hope this clarifies what I intend to do.

    Regards,
    Matthias
     
    MSP, Sep 8, 2009
    #3
  4. MSP

    MSP Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    Hi again,

    I fixed two typos in the last paragraph of my previous post:

    Overloading all read methods is rather tedious and maybe error prone.
    Instead, I was wondering whether there is some central low-level
    method deep down in the implementation of the stream (or the stream
    buffer), which I could overwrite. For example, a callback which is
    triggered whenever the stream buffer underflows and which fills the
    buffer with the new data from the file.
     
    MSP, Sep 8, 2009
    #4
  5. MSP

    Francesco Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    On Sep 8, 2:43 pm, MSP <> wrote:
    > Hi Francesco,
    >
    > first of all, thanks for the quick reply.
    >
    > > I suppose you cannot modify the foreignClass::Read() method -
    > > otherwise you could make it return the number of lines read instead of
    > > that bool.

    >
    > To be precise: the foreign library is OpenCASCADE, an open source
    > modeller library (abbreviated OCC in what follows). So I have full
    > source code. Theoretically, it would be possible for me to modify the
    > OCC-source code, but practically, that's not an option.
    >
    > > But can you inspect its implementation to see how exactly it reads the
    > > data from istream?

    >
    > Reading is done in the typical C++-style, using a lot of <<-operators
    > (both OCC-defined and standard STL operators).
    >
    > > One solution could be to save the result of tellg() on that istream
    > > before calling foreignClass::Read(), then call tellg() again, compute
    > > the difference of those two pointers to understand how many characters
    > > have been read and then parse that amount of data by yourself to count
    > > newline characters - you'd have to reopen the file or create a copy of
    > > it while in memory or something else like this.

    >
    > > Doing the above - assuming it works - you'd have no need to overload
    > > istream.

    >
    > I already considered this option. But the inline-date is some 3D-model
    > so it tends to be rather large. So I would prefer not to scan this
    > large chunk of data twice.
    >
    >
    >
    > > In the other case, if you were to ensure that any read on that istream
    > > keeps track of the lines read you should, I think, overload all read
    > > methods of istream.

    >
    > Overloading all read methods is rather tedious and maybe error prone.
    > Instead, I was whether there is some central low-level method deep
    > down in the implementation of the stream (or the stream buffer), which
    > I could overwrite. For example, a callback which is triggered whenever
    > the stream buffer underflows and which filles the buffer with the new
    > data from the file.
    >
    > I hope this clarifies what I intend to do.


    Yes, now it's clearer.

    I think you'll have to dig into the calls of your std::istream
    implementation and find the final function that actually does the
    "dirty" work ;-)

    The advantage of overriding all read methods of std::istream is that
    your derived class will then be portable, while directly touching the
    implementation of std::istream isn't portable at all - also, this
    should be a "don't do it", decisively.

    Both the tellg() solution and overriding std::istream operators will
    lead to parsing of data twice - as you said for the first case, and as
    I'm realizing now for the second case.

    If you're really going to modify some code which doesn't directly
    belong to your application I think it's better to modify the OCC code.

    As a side question: why do you want to know how many lines have been
    parsed by OCC? Maybe you're using this number for something that
    could be achieved in some other way, one which doesn't oblige you to
    "interfere" with the OCC parse.

    Meanwhile, let's hope for someone dropping in and giving a working,
    efficient and portable solution. There are a lot of wizards hanging
    out there ;-)

    Have good coding,
    Francesco
     
    Francesco, Sep 8, 2009
    #5
  6. MSP

    Jerry Coffin Guest

    In article <f8b9fb45-3bd2-47da-ab7f-
    >,
    says...

    [ ... using a library that reads some data from an istream ]

    > While doing this, I would like to keep track of the number of lines
    > read by the foreign class. Is there a portable or recommended way to
    > do this by deriving my own class from std::ifstream and overwriting
    > some virtual method (or similar by replacing the standard input buffer
    > by a modified one) ?


    A filtering streambuf can do what you want. James Kanze has (or at
    least used to have, and there's undoubtedly a copy still around) a
    web page describing the basics of creating a filtering streambuf.
    Jonathan Turkanis also wrote an iostreams library that's included in
    Boost that makes it fairly simple to write filtering streambufs (as
    long as you don't object to Boost, of course).
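
    For illustration, a filtering streambuf for this job can be quite
    small. The sketch below is neither Kanze's nor Boost's code, and
    LineCountingBuf is a hypothetical name. It keeps no buffer of its
    own, so every character costs a virtual call, and it does not
    support putback() or unget().

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Sketch of a filtering streambuf: forwards every read to another
// streambuf and counts the '\n' characters that are actually consumed.
class LineCountingBuf : public std::streambuf
{
public:
    explicit LineCountingBuf(std::streambuf* source)
        : source_(source), lines_(0) {}

    std::size_t lines() const { return lines_; }

protected:
    // Peek: report the next character without consuming it.
    int_type underflow() override
    {
        return source_->sgetc();
    }

    // Consume one character from the source; count it if it is '\n'.
    int_type uflow() override
    {
        const int_type c = source_->sbumpc();
        if (traits_type::eq_int_type(c, traits_type::to_int_type('\n')))
            ++lines_;
        return c;
    }

private:
    std::streambuf* source_;  // not owned
    std::size_t lines_;
};
```

    Usage would look roughly like this: construct
    LineCountingBuf counter(f.rdbuf()), wrap it in
    std::istream counted(&counter), hand counted to Read(), then query
    counter.lines(). Because the wrapper holds no characters itself, f
    can be read directly again afterwards.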

    --
    Later,
    Jerry.
     
    Jerry Coffin, Sep 8, 2009
    #6
  7. MSP

    MSP Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    Hi Francesco,


    > Both the tellg() solution and overriding std::istream operators will
    > lead to parsing of data twice - as you said for the first case, and as
    > I'm realizing now for the second case.


    I noticed this fact, too. But there is a slight difference between the
    two cases, which theoretically could make a difference in performance:
    In the second case the two scans of the same data byte occur in quick
    succession, whereas in the first case (using tellg()), a lot of data
    is scanned in between. As a result, it is possible that the same data
    has to be mapped from disk into memory twice. (Note that the 3D data
    tends to be very large.) However, I am not an expert in these matters
    and also I have no profiling tools to measure performance exactly.
    Maybe nowadays this is not such an issue anyway, since computers have
    a lot of memory and cache.


    > As a side question: why you want to know how many lines have been
    > parsed by OCC? I think that maybe you're using this datum for
    > something that could be achieved in some other way which doesn't
    > oblige you to "interfere" with the OCC parse.


    Very simple: the native (=non-foreign) part of the data file is very
    human-readable, it is parsed by a yacc-parser and when it encounters
    an error it issues a meaningful error message with a line number. The
    OCC data is not at the end of the file, so errors following the OCC
    data are currently reported with incorrect line numbers. (Of course,
    there is a simple solution for this: place the OCC data at the end!
    Probably, I will do this. But since I am a curious person, the problem
    of intercepting the newline characters interests me nevertheless.)

    >
    > Meanwhile, let's hope for someone dropping in and giving a working,
    > efficient and portable solution. There are a lot of wizards hanging
    > out there ;-)
    >
    > Have good coding,
    > Francesco
    >



    For the moment, thanks a lot for your interest in my problem and the
    good discussion. If I come up with an 'ingenious' solution, I will
    let you know.

    Regards,
    Matthias
     
    MSP, Sep 9, 2009
    #7
  8. MSP

    MSP Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    On 8 Sep., 18:13, Jerry Coffin <> wrote:
    > In article <f8b9fb45-3bd2-47da-ab7f-
    > >,
    > says...
    >
    > [ ... using a library that reads some data from an istream ]
    >
    > > While doing this, I would like to keep track of the number of lines
    > > read by the foreign class. Is there a portable or recommended way to
    > > do this by deriving my own class from std::ifstream and overwriting
    > > some virtual method (or similar by replacing the standard input buffer
    > > by a modified one) ?

    >
    > A filtering streambuf can do what you want. James Kanze has (or at
    > least used to have, and there's undoubtedly a copy still around) a
    > web page describing the basics of creating a filtering streambuf.
    > Jonathan Turkanis also wrote an iostreams library that's included in
    > Boost that makes it fairly simple to write filtering streambufs (as
    > long as you don't object to Boost, of course).
    >
    > --
    >     Later,
    >     Jerry.



    Hi Jerry,
    thank you for pointing out Boost (www.boost.org) to me. I didn't know
    about it yet. It sounds like an interesting place to look. Is there
    any reason why someone might object to it?

    Regards,
    Matthias
     
    MSP, Sep 9, 2009
    #8
  9. Re: Howto derive class from std::ifstream which counts newline characters

    * MSP:
    >
    > thank you for pointing out Boost (www.boost.org) to me. I didn't know
    > it yet. Sounds like an interesting location to look at. Is there any
    > reason why someone could object it?


    Yes.

    First, Boost is large.

    Second, there is a versioning problem with use of any external library.

    Third, using Boost runs squarely against the NIH principle (Not Invented Here,
    don't use it). Some companies have as policy to not use third-party libraries at
    all. At least not free ones.

    Fourth, Boost is based on exceptions for failure reporting. In a setting where
    exceptions are not used that might be a problem. There's probably a technical
    solution for that, but I don't know.

    Fifth, although Boost's license lets you do just about anything you can do with
    the C++ standard library, it might be difficult to convince someone that it's
    really safe from a business perspective. Before you know it Dave Abrahams might
    be knocking on the door with his gun-slinging Boost lawyer! :) Or at least, the
    wrong kind of manager might think that that is a distinct possibility.


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Sep 9, 2009
    #9
  10. MSP

    Jerry Coffin Guest

    Re: Howto derive class from std::ifstream which counts newline characters

    In article <80eafe42-c3ac-4fa6-8e75-964ea79f3679
    @k39g2000yqe.googlegroups.com>, says...

    [ ... ]

    > thank you for pointing out Boost (www.boost.org) to me. I didn't
    > know it yet. Sounds like an interesting location to look at. Is
    > there any reason why someone could object it?


    Apparently a few, since some people object to it.

    The first obvious objection is that it's big -- downright huge, truth be
    known. Although it's not (exactly) commercial in itself, it has a lot
    of the same kinds of things as a commercial offering, such as extra
    code to work around bugs in a large number of compilers. This not
    only increases bulk, but often hurts readability.

    Second, Boost often takes on problems many people wouldn't even
    consider -- and to do that, includes some code that's extremely
    difficult to comprehend. In many cases, code that _uses_ the library
    is pretty simple and straightforward, but reading the library code
    itself can be a mind-bending experience (e.g. Spirit and Xpressive).

    Third, since much of it uses templates heavily (and in ways compiler
    authors probably didn't plan for), some boost code compiles quite
    slowly. Any more than a tiny parser written with Spirit can tax many
    (especially older) compilers right to, and sometimes beyond, their
    limits.

    Finally, rather than being simply a collection of pre-written, pre-
    tested (etc.) code about like what you'd probably write yourself if
    you had time, much of Boost attempts to provide the highest level of
    generality and abstraction possible. This often requires people to
    sit back and re-think how they approach a problem in general.
    Especially if you already have a large body of existing code, it can
    be difficult to incorporate a library that requires a large,
    fundamental change in how you approach a problem. Even without
    existing code, existing notions of how to approach a problem can be
    equally difficult to overcome. Along with requiring thought, this
    places rather higher requirements on maintenance coders -- you often
    need direct knowledge of a specific library and its conventions
    before you can understand code at all.

    --
    Later,
    Jerry.
     
    Jerry Coffin, Sep 9, 2009
    #10