Deriving from ifstream

M

mc

Hi all,

I've been trying to get it right for the last few days and wasn't 100%
successful. I trying to add functionality to the ifstream class creating an
le_ifstream derived class (see at the end of this message). I also want to
have some cast of casting from ifstream it le_ifstream. Here's what I'm
trying to do:

void foo(ifstream & ifs)
{
le_ifstream le_ifs(ifs); // build/cast the istream to an
le_ifstream

}

So far, I could only make it work if deriving from iostream (see below).
However, I see weird behavior (e.g. seek pointer get out of wack). I read
from a file 200 Kb in size and after 3 calls to readshort(), the
assert(good()) fails (gcount() returns 0), while this works with C calls or
if I read the data directly using the ifstream class.

Thank you in advance.

MC

------------------------------------------------------------------------------
class le_ifstream : public std::iostream
{
public :
inline le_ifstream(ifstream & _Istr) : iostream(_Istr.rdbuf())
{
}

le_ifstream & read(void * _Ptr, size_t _Count)
{
iostream::read((char *)_Ptr, _Count);
assert(gcount() == _Count);
return (*this);
}

inline le_ifstream & readfloat(void * _Ptr)
{
return (readint(_Ptr));
}

le_ifstream & readint(void * _Ptr)
{
char _Buf[4]; static int _One = 1;
iostream::read(_Buf, 4);
assert(good());
*((unsigned int *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned int
*)&_Buf)) : ((_Buf[3] << 24) + (_Buf[2] << 16) + (_Buf[1] << 8) + _Buf[0]);
return (*this);
}

le_ifstream & readshort(void * _Ptr)
{
char _Buf[2]; static int _One = 1;
iostream::read(_Buf, 2);
assert(good());
*((unsigned short *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned short
*)&_Buf)) : ((_Buf[1] << 8) + _Buf[0]);
return (*this);
}

};
 
J

James Kanze

I've been trying to get it right for the last few days and
wasn't 100% successful. I trying to add functionality to the
ifstream class creating an le_ifstream derived class (see at
the end of this message).

That's usually not a very good idea, although it depends on the
type of functionality you're adding.
I also want to have some cast of casting from ifstream it
le_ifstream. Here's what I'm trying to do:
void foo(ifstream & ifs)
{
le_ifstream le_ifs(ifs); // build/cast the istream to an
le_ifstream
}
So far, I could only make it work if deriving from iostream
(see below). However, I see weird behavior (e.g. seek pointer
get out of wack). I read from a file 200 Kb in size and after
3 calls to readshort(), the assert(good()) fails (gcount()
returns 0), while this works with C calls or if I read the
data directly using the ifstream class.

If you're creating an istream, why derive from iostream?
{
public :
inline le_ifstream(ifstream & _Istr) : iostream(_Istr.rdbuf())

Just a note: names starting with an underscore folllowed by a
capital letter are reserved for the implementation. Using them
in your code is undefined behavior.
le_ifstream & read(void * _Ptr, size_t _Count)
{
iostream::read((char *)_Ptr, _Count);
assert(gcount() == _Count);

That's rather brutal error handling:). (But maybe you just did
it for test purposes.)

In this case, it would be more idiomatic to just test the flux
itself:

if ( ! *this ) {
// error...
}

Also, I'm not sure I see the reason for this function. Anytime
the pointer passed to istream::read does not point to an array
of char, there's probably an error in the code anyway. All your
function does is mask this error, so that it won't be caught by
the compiler.
return (*this);
}
inline le_ifstream & readfloat(void * _Ptr)
{
return (readint(_Ptr));

What? A float is not an int. I think I know what you're trying
to do, but be aware that it is very unportable, although it will
work on PC's and most mainstream Unix machines (but not on any
mainframes I know, and likely not on some embedded processors).

If you're going to do this, at the very least, document 1) the
external format of your floating point values (i.e. IEEE,
big endian or little endian, as the case may be), and put a big
comment somewhere to the effect that the code only works on
implementations using IEEE floating point.
le_ifstream & readint(void * _Ptr)

Why void* as argument? Shouldn't it be either int* or int32_t*,
or their unsigned equivalent? (Given the code which follows,
the actual pointer must be either int32_t* or uint32_t*. On
most common implementations, these are in fact int and unsigned
int, however, and some implementations don't implement
which is part of C99 said:
{
char _Buf[4]; static int _One = 1;
iostream::read(_Buf, 4);
assert(good());

Now comes the fun.

First, you do know that good() can fail even if the input
succeeded, don't you? (Obviously not---I'm being facetious.)
The names of the state functions in ios are very poorly chosen.
What you want to test here isn't good, but simply the stream
itself, e.g.:

if ( ! *this ) {
// error...
}

Secondly (and possibly a very real source of your problem): this
sort of input (given what you do next) only works if the file
was opened in binary mode. Otherwise, anything but a printable
character or a very few specific control characters are
undefined behavior. (In practice, the mode doesn't matter under
Unix, but under Windows, a binary value of 0x1A will be
interpreted as end of file, and a binary value 0x0D will be
deleted if it is followed by a binary value 0x0A. Other OS's
may have even more exotic conventions.)
*((unsigned int *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned int
*)&_Buf)) : ((_Buf[3] << 24) + (_Buf[2] << 16) + (_Buf[1] << 8) + _Buf[0]);
return (*this);

What the? Why the conditional? Especially as the true case
results in undefined behavior, and may cause a core dump?

My version of this function:

le_ifstream&
le_ifstream::readint(
int32_t* dest )
{
char buffer[ 4 ] ;
istream::read( buffer, 4 ) ;
if ( *this ) {
*dest = ((buffer[ 3 ] & 0xFF) << 24)
| ((buffer[ 2 ] & 0xFF) << 16)
| ((buffer[ 1 ] & 0xFF) << 8)
| ((buffer[ 0 ] & 0xFF) ) ;
}
return *this ;
}

(except that I'd probably read and write in big endian order,
since that is more or less the network standard).

Note the &0xFF -- this is necessary to mask out an extended sign
bit. (Alternatively, you can declare the buffer unsigned char.
In which case, you'll need a reinterpret_cast in the call to
istream::read().)
le_ifstream & readshort(void * _Ptr)
{
char _Buf[2]; static int _One = 1;
iostream::read(_Buf, 2);
assert(good());
*((unsigned short *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned short
*)&_Buf)) : ((_Buf[1] << 8) + _Buf[0]);
return (*this);
}
};

See above. All of the same comments apply.
 
M

mc

Thanks James.

Let me add some reasons to some of the whys. First off, what I provided is
just a small snapshot of a very large file which parses an old data storage
file. I did not include the documentation with this post as it's north of
hundreds of pages. So the fact the the float is read as an int is very well
detailed in said docs. But that's beyond the point.

Then, as I mentioned unclearly, the heart of the problem for me lies in the
constructor: how can I create an le_ifstream off of an existing ifstream??
The ifstream points to a file opened in binary mode that actually contains
many files stuck one after the other. Each time a file of a different type
is encountered, it's parsed using an ifstream cast/converted into the
relevant derived class, e.g. le_ifstream. How I read the data is
irrelevant. How I instanciate the derived class is.

Regards,

MC


I've been trying to get it right for the last few days and
wasn't 100% successful. I trying to add functionality to the
ifstream class creating an le_ifstream derived class (see at
the end of this message).

That's usually not a very good idea, although it depends on the
type of functionality you're adding.
I also want to have some cast of casting from ifstream it
le_ifstream. Here's what I'm trying to do:
void foo(ifstream & ifs)
{
le_ifstream le_ifs(ifs); // build/cast the istream to an
le_ifstream
}
So far, I could only make it work if deriving from iostream
(see below). However, I see weird behavior (e.g. seek pointer
get out of wack). I read from a file 200 Kb in size and after
3 calls to readshort(), the assert(good()) fails (gcount()
returns 0), while this works with C calls or if I read the
data directly using the ifstream class.

If you're creating an istream, why derive from iostream?
{
public :
inline le_ifstream(ifstream & _Istr) : iostream(_Istr.rdbuf())

Just a note: names starting with an underscore folllowed by a
capital letter are reserved for the implementation. Using them
in your code is undefined behavior.
le_ifstream & read(void * _Ptr, size_t _Count)
{
iostream::read((char *)_Ptr, _Count);
assert(gcount() == _Count);

That's rather brutal error handling:). (But maybe you just did
it for test purposes.)

In this case, it would be more idiomatic to just test the flux
itself:

if ( ! *this ) {
// error...
}

Also, I'm not sure I see the reason for this function. Anytime
the pointer passed to istream::read does not point to an array
of char, there's probably an error in the code anyway. All your
function does is mask this error, so that it won't be caught by
the compiler.
return (*this);
}
inline le_ifstream & readfloat(void * _Ptr)
{
return (readint(_Ptr));

What? A float is not an int. I think I know what you're trying
to do, but be aware that it is very unportable, although it will
work on PC's and most mainstream Unix machines (but not on any
mainframes I know, and likely not on some embedded processors).

If you're going to do this, at the very least, document 1) the
external format of your floating point values (i.e. IEEE,
big endian or little endian, as the case may be), and put a big
comment somewhere to the effect that the code only works on
implementations using IEEE floating point.
le_ifstream & readint(void * _Ptr)

Why void* as argument? Shouldn't it be either int* or int32_t*,
or their unsigned equivalent? (Given the code which follows,
the actual pointer must be either int32_t* or uint32_t*. On
most common implementations, these are in fact int and unsigned
int, however, and some implementations don't implement
which is part of C99 said:
{
char _Buf[4]; static int _One = 1;
iostream::read(_Buf, 4);
assert(good());

Now comes the fun.

First, you do know that good() can fail even if the input
succeeded, don't you? (Obviously not---I'm being facetious.)
The names of the state functions in ios are very poorly chosen.
What you want to test here isn't good, but simply the stream
itself, e.g.:

if ( ! *this ) {
// error...
}

Secondly (and possibly a very real source of your problem): this
sort of input (given what you do next) only works if the file
was opened in binary mode. Otherwise, anything but a printable
character or a very few specific control characters are
undefined behavior. (In practice, the mode doesn't matter under
Unix, but under Windows, a binary value of 0x1A will be
interpreted as end of file, and a binary value 0x0D will be
deleted if it is followed by a binary value 0x0A. Other OS's
may have even more exotic conventions.)
*((unsigned int *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned int
*)&_Buf)) : ((_Buf[3] << 24) + (_Buf[2] << 16) + (_Buf[1] << 8) +
_Buf[0]);
return (*this);

What the? Why the conditional? Especially as the true case
results in undefined behavior, and may cause a core dump?

My version of this function:

le_ifstream&
le_ifstream::readint(
int32_t* dest )
{
char buffer[ 4 ] ;
istream::read( buffer, 4 ) ;
if ( *this ) {
*dest = ((buffer[ 3 ] & 0xFF) << 24)
| ((buffer[ 2 ] & 0xFF) << 16)
| ((buffer[ 1 ] & 0xFF) << 8)
| ((buffer[ 0 ] & 0xFF) ) ;
}
return *this ;
}

(except that I'd probably read and write in big endian order,
since that is more or less the network standard).

Note the &0xFF -- this is necessary to mask out an extended sign
bit. (Alternatively, you can declare the buffer unsigned char.
In which case, you'll need a reinterpret_cast in the call to
istream::read().)
le_ifstream & readshort(void * _Ptr)
{
char _Buf[2]; static int _One = 1;
iostream::read(_Buf, 2);
assert(good());
*((unsigned short *)_Ptr) = (*((char *)&_One) != 0) ? (*((unsigned short
*)&_Buf)) : ((_Buf[1] << 8) + _Buf[0]);
return (*this);
}
};

See above. All of the same comments apply.
 
J

James Kanze

Let me add some reasons to some of the whys. First off, what
I provided is just a small snapshot of a very large file which
parses an old data storage file. I did not include the
documentation with this post as it's north of hundreds of
pages. So the fact the the float is read as an int is very
well detailed in said docs. But that's beyond the point.

OK. If the actual format is well documented, that's half the
battle won already.

Note, however, that reading a float as an int is still very
implementation dependent, since the actual internal format of a
float varies between machines. The documentation should specify
the format of the float in the file, either by reference to some
known format (e.g. IEEE), or by explicitely specifying how the
data is laid out (high bit sign, then 11 bits exposant, in
excess 1023, then mantissa, base 2...---most people just use an
established format).
Then, as I mentioned unclearly, the heart of the problem for
me lies in the constructor: how can I create an le_ifstream
off of an existing ifstream??

In a certain sense, you can't. iostream's aren't copiable.
The usual solution binary I/O is to create a new stream type,
iMyFormatStream, which derives from std::ios (for the error
handling), and define the usual >> and << operators on it (but
in a way that they do binary I/O, in your format, and not text
I/O). The basic abstraction of an istream or an ostream is
text, and deriving from it for binary I/O is a potential source
of errors.

All that you use from an existing, open stream, would be the
streambuf. And even then, if the file is pure binary, it's
generally preferable to provide your own derived classes, along
the model of ifstream, and take care of the open yourself. That
way, you can ensure that the filebuf is opened in binary mode,
and that it is imbued with the "C" locale. (Another thing you
have to watch out for---some locales might do code translation.)
The ifstream points to a file opened in binary mode that
actually contains many files stuck one after the other. Each
time a file of a different type is encountered, it's parsed
using an ifstream cast/converted into the relevant derived
class, e.g. le_ifstream. How I read the data is irrelevant.

As long as it is correct. The code you posted had several
serious errors, which would cause it to fail in some specific
cases.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top