reading files in blocks

Urs Thuermann

I want to read blocks of a file in a loop and I am looking for the
most elegant way to do this. Assume there is a function
process_data() that I don't want to call with zero-length data. In C
I write

    while ((nbytes = read(fd, buffer, sizeof(buffer))) > 0) {
        process_data(buffer, nbytes);
    }

AFAICS, in C++ I should use ifstream::read() for reading from the
file, but that method does not return the number of bytes successfully
read. Unfortunately, like feof(fp) in C, the method ifstream::eof()
does signal the EOF state of the file one byte too late, i.e. only
*after* trying to read one more byte after the last one has been read,
instead of before. Therefore, I cannot write

    while (!file.eof()) {
        file.read(buffer, sizeof(buffer));
        nbytes = file.gcount();
        process_data(buffer, nbytes);
    }

but instead one must write

    while (!file.eof()) {
        file.read(buffer, sizeof(buffer));
        nbytes = file.gcount();
        if (nbytes == 0)                  // or
            break;                        //   if (nbytes > 0)
        process_data(buffer, nbytes);     //       process_data(buffer, nbytes);
    }

which I find ugly because of the two loop termination conditions in
the first case (left) and because of the two almost identical
conditions and the additional indentation in the second case (right).

Currently, I have the following, which comes closest to what I've
written hundreds of times in C, but does not look quite as clean and
simple:

    while (file.read(buffer, sizeof(buffer)), (nbytes = file.gcount()) > 0) {
        process_data(buffer, nbytes);
    }

So, my question is if there is another simple and elegant way to do this.


urs
 
u2

> I want to read blocks of a file in a loop and I am looking for the
> most elegant way to do this.  Assume there is a function
> process_data() that I don't want to call with zero-length data.  In C
> I write
>
>     while ((nbytes = read(fd, buffer, sizeof(buffer))) > 0) {
>         process_data(buffer, nbytes);
>     }
>
> AFAICS, in C++ I should use ifstream::read()

In C++, the while loop above is fine. It would be natural, if you
processed "raw" data.
 
James Kanze

> I want to read blocks of a file in a loop and I am looking for the
> most elegant way to do this. Assume there is a function
> process_data() that I don't want to call with zero-length data. In C
> I write
>
>     while ((nbytes = read(fd, buffer, sizeof(buffer))) > 0) {
>         process_data(buffer, nbytes);
>     }

Not in standard C: read is a POSIX function. Still, for raw
data, I tend to use the system-specific low-level accesses,
rather than iostream (or FILE* in C). Both FILE* and iostream
are really designed for streaming text. Both have added support
for random access and "binary", but in both cases, the support is
just that: added on.

Another alternative with iostream is to get the streambuf, and
call sgetn on it. (Note that doing so will *not* set any of the
error bits in the istream.)

> AFAICS, in C++ I should use ifstream::read() for reading from the
> file, but that method does not return the number of bytes successfully
> read.

To get the number of bytes read by the last unformatted read,
use gcount. Don't forget that not reading the asked for number
of bytes is treated as an error condition. To read the entire
file, where the last block may not be complete:

    while (in.read(...) || in.gcount() != 0) {
        // process in.gcount() bytes...
    }
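
For illustration, here is a minimal sketch of that loop filled in. The helper name block_sizes and the block_size parameter are my own assumptions, and recording each block's size stands in for calling process_data(). Note that because || short-circuits, gcount() is not evaluated in the condition when read() succeeds, so the sketch queries it again inside the body.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: reads `in` in blocks of at most `block_size` bytes
// and returns the size of every non-empty block it saw.
std::vector<std::streamsize> block_sizes(std::istream& in, std::size_t block_size)
{
    assert(block_size > 0);
    std::vector<char> buffer(block_size);
    const std::streamsize want = static_cast<std::streamsize>(buffer.size());
    std::vector<std::streamsize> sizes;

    // read() reports failure on a short final block, so gcount() is checked
    // as well; the loop ends once a read delivers no bytes at all.
    while (in.read(buffer.data(), want) || in.gcount() != 0) {
        // Stands in for process_data(buffer.data(), in.gcount()).
        sizes.push_back(in.gcount());
    }
    return sizes;
}
```

With a std::istringstream holding "abcdefghij" and a block size of 4, this should yield blocks of 4, 4 and 2 bytes; an empty stream yields no blocks.
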

> Unfortunately, like feof(fp) in C, the method ifstream::eof()
> does signal the EOF state of the file one byte too late, i.e. only
> *after* trying to read one more byte after the last one has been read,
> instead of before.

I think you're confusing ios::eof() with something else. Until
a read has failed ios::eof() is more or less useless; you can't
count on its state one way or another.
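
A small sketch of why eof() says little before a read has failed. The names ReadState, after_block_read and after_int_extraction are made up for this illustration, and a std::istringstream stands in for the file.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <sstream>
#include <string>

// Stream state observed right after one input operation.
struct ReadState {
    bool ok;   // the operation itself succeeded, i.e. !fail()
    bool eof;  // eofbit is set
};

// Hypothetical helper: read exactly n bytes with istream::read.
ReadState after_block_read(const std::string& content, std::size_t n)
{
    char buffer[64];
    assert(n <= sizeof(buffer));
    std::istringstream in(content);
    in.read(buffer, static_cast<std::streamsize>(n));
    return ReadState{!in.fail(), in.eof()};
}

// Hypothetical helper: extract one int with operator>>.
ReadState after_int_extraction(const std::string& content)
{
    std::istringstream in(content);
    int value = 0;
    in >> value;
    return ReadState{!in.fail(), in.eof()};
}
```

Reading all 4 bytes of "abcd" succeeds and leaves eof() false. Asking for 5 bytes fails and sets eof(). Extracting an int from "42" succeeds and still sets eof(), because the extraction had to look past the last digit. So eof() can be either true or false after a successful read.
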

> Therefore, I cannot write
>
>     while (!file.eof()) {
>         file.read(buffer, sizeof(buffer));
>         nbytes = file.gcount();
>         process_data(buffer, nbytes);
>     }
>
> but instead one must write
>
>     while (!file.eof()) {
>         file.read(buffer, sizeof(buffer));
>         nbytes = file.gcount();
>         if (nbytes == 0)                  // or
>             break;                        //   if (nbytes > 0)
>         process_data(buffer, nbytes);     //       process_data(buffer, nbytes);
>     }
>
> which I find ugly because of the two loop termination conditions in
> the first case (left) and because of the two almost identical
> conditions and the additional indentation in the second case (right).

The standard pattern (both in C and in C++) is to read, then
check whether it has succeeded. In C++, there is an implicit
conversion of the stream object to something which can be used
as a boolean, and all input functions return a reference to the
stream object, so you can just write:

while (file >> target) ...

In the case of istream::read, you have to take the additional
actions I explained above, because istream::read considers
reading anything less than the number of bytes requested an
error.
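
As a sketch of the read-then-check pattern for formatted input (the helper name read_all_ints is an assumption of mine, not something from this thread):

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <vector>

// Hypothetical helper: collects every int that can be extracted from `in`.
std::vector<int> read_all_ints(std::istream& in)
{
    std::vector<int> values;
    int target;
    // operator>> returns the stream, which converts to true only if the
    // extraction just performed succeeded.
    while (in >> target) {
        values.push_back(target);
    }
    // The loop can only end after a failed extraction.
    assert(in.fail());
    return values;
}
```

On the input "1 2 3" this collects 1, 2 and 3. On "4 5 x 6" it stops at the "x" and collects only 4 and 5, so the caller may still want to check eof() afterwards to tell end of input from bad input.
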

> Currently, I have the following, which comes closest to what I've
> written hundreds of times in C, but does not look quite as clean and
> simple:
>
>     while (file.read(buffer, sizeof(buffer)), (nbytes = file.gcount()) > 0) {
>         process_data(buffer, nbytes);
>     }
>
> So, my question is if there is another simple and elegant way
> to do this.

Replace the comma operator with ||, and you're good. This is
the idiomatic solution. Alternatively, with streambuf:

    std::streamsize byteCount = file.rdbuf()->sgetn(buffer, sizeof(buffer));
    while (byteCount != 0) {
        processData(buffer, byteCount);
        byteCount = file.rdbuf()->sgetn(buffer, sizeof(buffer));
    }
    file.setstate(std::ios::eofbit | std::ios::failbit);

(The last line is only necessary if there is a possibility of
some other code using the istream later.)
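
A self-contained sketch of the streambuf variant, under the assumption that summing the byte counts is an acceptable stand-in for processData(); the helper name drain_with_sgetn and the block_size parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <ios>
#include <istream>
#include <sstream>
#include <streambuf>
#include <vector>

// Hypothetical helper: drains `in` through its streambuf and returns the
// total number of bytes seen.
std::streamsize drain_with_sgetn(std::istream& in, std::size_t block_size)
{
    assert(block_size > 0 && in.rdbuf() != nullptr);
    std::vector<char> buffer(block_size);
    const std::streamsize want = static_cast<std::streamsize>(buffer.size());
    std::streamsize total = 0;

    // sgetn returns the number of bytes actually copied; 0 means no more data.
    std::streamsize n;
    while ((n = in.rdbuf()->sgetn(buffer.data(), want)) != 0) {
        total += n;  // stands in for processData(buffer.data(), n)
    }

    // sgetn bypasses the istream, so its state bits are set by hand here.
    in.setstate(std::ios::eofbit | std::ios::failbit);
    return total;
}
```

For a std::istringstream holding "hello world", this should return 11 with any positive block size, and the stream reports eof() and fail() afterwards only because of the explicit setstate call.
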

Or you could just create the filebuf directly, and skip the
istream completely.
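
A rough sketch of the filebuf-only route. The helper name count_bytes_with_filebuf, its parameters, and the choice of returning -1 when the file cannot be opened are all assumptions made for the example; counting bytes stands in for the real processing.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <ios>
#include <vector>

// Hypothetical helper: counts the bytes of the file at `path` using a
// std::filebuf directly; returns -1 if the file cannot be opened.
std::streamsize count_bytes_with_filebuf(const char* path, std::size_t block_size)
{
    assert(block_size > 0);
    std::filebuf file;
    if (!file.open(path, std::ios::in | std::ios::binary)) {
        return -1;
    }

    std::vector<char> buffer(block_size);
    const std::streamsize want = static_cast<std::streamsize>(buffer.size());
    std::streamsize total = 0;

    // Same loop shape as the POSIX read() version: stop on a zero-length read.
    std::streamsize n;
    while ((n = file.sgetn(buffer.data(), want)) != 0) {
        total += n;  // stands in for process_data(buffer.data(), n)
    }
    return total;  // the filebuf destructor closes the file
}
```

Note that with this approach there are no stream state bits at all, so a zero return from sgetn does not distinguish end of file from a read error.
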
 
red floyd

Or, if OP thinks the two-test while loop is ugly, he can always
abstract it away:

    int read(std::istream& f, void* buffer, int bufsize)
    {
        int nbytes = bufsize;
        if (f.read(static_cast<char*>(buffer), bufsize) ||
            (nbytes = static_cast<int>(f.gcount())) > 0)
            return nbytes;      // full block, or a short final block
        else if (f.eof())
            return 0;           // end of file, nothing read
        else
            return -1;          // read error
    }

    // ...

    int nbytes;
    while ((nbytes = read(f, buffer, sizeof(buffer))) > 0)
        process_data(buffer, nbytes);
 
