clearing read stream buffers

JG

Does anyone know a standard (or supported on Linux, Mac, Win32) way to
clear a read stream buffer (standard ANSI C file stream)?

I would even settle for a platform specific way of doing it.

And no, I know I can use direct low level I/O or non-buffered to do
reads, but for my app, I need the buffering. I can implement myself,
but this is not optimal.

Example, I open a read only file using fopen(). I periodically know
from external methods, that the underlying file has been updated.
Before I do my next read on the stream, I want to flush the read
buffers and force it to read from disk (or kernel cache).

fflush seems to work on Linux, but not sure it is standard.

I don't see any function to do this, but it seems very needed.

I can also trigger a flush of the buffers by fseeking to a much
different area of the file. Does any fseek, even offsetting by 1 index,
always force a flushing of the buffers?
 
Luke Wu

JG said:
Does anyone know a standard (or supported on Linux, Mac, Win32) way to
clear a read stream buffer (standard ANSI C file stream)?

I would even settle for a platform specific way of doing it.

And no, I know I can use direct low level I/O or non-buffered to do
reads, but for my app, I need the buffering. I can implement myself,
but this is not optimal.

once you've determined that there is still a bunch of characters
sitting in the input stream (should include '\n' at least), then you
can use:

while( (c = fgetc(FOOFILE) != '\n' )
;
 
jschultz

Luke said:
once you've determined that there is still a bunch of characters
sitting in the input stream (should include '\n' at least), then you
can use:

while( (c = fgetc(FOOFILE) != '\n' )
;

He is not asking the typical "How do I forget whatever is in stdin?"
question.

He is asking with an actual file on disk (not a pipe/terminal/etc) that
he performs a read upon, how can he portably and efficiently FORCE an
open read stream to re-read data he wants from disk. The problem being
that a write from another thread/stream/fd/process has changed the
underlying file and he wants to ensure that the input stream doesn't
just re-give him what is in its buffers from an earlier read.
 
Lawrence Kirby

Does anyone know a standard (or supported on Linux, Mac, Win32) way to
clear a read stream buffer (standard ANSI C file stream)?

I would even settle for a platform specific way of doing it.

And no, I know I can use direct low level I/O or non-buffered to do
reads, but for my app, I need the buffering. I can implement myself,
but this is not optimal.

Example, I open a read only file using fopen(). I periodically know
from external methods, that the underlying file has been updated.
Before I do my next read on the stream, I want to flush the read
buffers and force it to read from disk (or kernel cache).

The only sure way I can think of is to close and reopen the stream, using,
say, freopen().

fflush seems to work on Linux, but not sure it is standard.

It isn't. In standard C applying fflush() to an input stream invokes
undefined behaviour.

I don't see any function to do this, but it seems very needed.

I know what you mean, although surprisingly I don't remember needing it.

I can also trigger a flush of the buffers by fseeking to a much
different area of the file. Does any fseek, even offsetting by 1 index,
always force a flushing of the buffers?

I hope not, that would be inefficient if the data at the new position was
already in the buffer. This approach seems to be your best bet however
short of closing and reopening the file. Maybe a double seek, one to some
far off position and then to the position you want. Even that isn't 100%
guaranteed though.

Lawrence
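
A minimal sketch of the close-and-reopen approach described above, assuming the caller still has the file's path and the byte offset it wants to resume from. The helper name reopen_at and the choice of "rb" mode are illustrative assumptions, not something from the original post.

```c
#include <stdio.h>

/*
 * Reopen `fp` on `path` and seek to byte offset `pos`, so that the next
 * read is served from the file rather than from data buffered earlier.
 * Returns the stream on success. Returns NULL on failure, in which case
 * the stream has been closed and must not be used again.
 */
FILE *reopen_at(FILE *fp, const char *path, long pos)
{
    fp = freopen(path, "rb", fp);   /* closes the old stream first */
    if (fp == NULL)
        return NULL;
    if (fseek(fp, pos, SEEK_SET) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

A caller would typically record the position with ftell() before the reopen and pass it in as `pos`.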
 
jschultz

One way I was thinking of doing this is:

if (setvbuf(read_stream, NULL, _IONBF, 0) != 0 ||
    setvbuf(read_stream, NULL, _IOFBF, 0) != 0) {
    return -1;
}

/* do read */

Any comments from the peanut gallery about how advisable / guaranteed
this is to work?
 
Ben Pfaff

jschultz said:
One way I was thinking of doing this is:

if (setvbuf(read_stream, NULL, _IONBF, 0) != 0 ||
setvbuf(read_stream, NULL, _IOFBF, 0) != 0) {
return -1;
}

From C99:

The setvbuf function may be used only after the stream
pointed to by stream has been associated with an open file
and before any other operation (other than an unsuccessful
call to setvbuf) is performed on the stream.

Thus, calling setvbuf() twice is always undefined, regardless of
when you do it. But you're only allowed to do it at all just
after the file is opened.
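
For contrast, here is a rough sketch of the one place the quoted text does allow setvbuf(): immediately after the stream is opened and before any read. The helper name open_buffered and its size parameter are hypothetical, chosen only for illustration.

```c
#include <stdio.h>

/*
 * Open `path` for reading and request full buffering with a buffer of
 * roughly `size` bytes. setvbuf() is called right after fopen(), before
 * any other operation on the stream. Returns NULL on failure.
 */
FILE *open_buffered(const char *path, size_t size)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    /* NULL buffer: let the library allocate its own storage. */
    if (setvbuf(fp, NULL, _IOFBF, size) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

This only sets the buffering mode once; it does nothing to discard data that is buffered later.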
 
infobahn

Luke said:
once you've determined that there is still a bunch of characters
sitting in the input stream (should include '\n' at least), then you
can use:

while( (c = fgetc(FOOFILE) != '\n' )
;

You missed a ), and forgot to check for EOF.

while( (c = fgetc(FOOFILE)) != '\n' && c != EOF)
;
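
Wrapped up as a function, the corrected loop might look like the sketch below. The name discard_line is an assumption for illustration. Note that `c` has to be an int rather than a char so that EOF can be told apart from real characters.

```c
#include <stdio.h>

/*
 * Read and discard characters from `fp` up to and including the next
 * newline. Returns '\n' if a newline was consumed, or EOF if end of
 * file or a read error was reached first.
 */
int discard_line(FILE *fp)
{
    int c;  /* int, not char, so EOF is distinguishable */

    while ((c = fgetc(fp)) != '\n' && c != EOF)
        ;
    return c;
}
```

This skips the rest of the current line; it does not make the stream forget data it has already buffered from a file.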
 
Alan Balmer

He is not asking the typical "How do I forget whatever is in stdin?"
question.

He is asking with an actual file on disk (not a pipe/terminal/etc) that
he performs a read upon, how can he portably and efficiently FORCE an
open read stream to re-read data he wants from disk. The problem being
that a write from another thread/stream/fd/process has changed the
underlying file and he wants to ensure that the input stream doesn't
just re-give him what is in its buffers from an earlier read.

fclose(), fopen()?
 
jschultz

Ok, so it seems the only portable and guaranteed way to force a C input
stream to re-read from disk is to close/re-open/re-seek the file?
Doesn't that seem like a lot of overkill for a simple/common idea?
 
jschultz

I mean performance overkill specifically. Because our program needs to
do these kinds of reads very quickly and opening/closing/seeking and
THEN reading seems like a ton more work than it should be.
 
Eric Sosman

jschultz said:
Ok, so it seems the only portable and guaranteed way to force a C input
stream to re-read from disk is to close/re-open/re-seek the file?
Doesn't that seem like a lot of overkill for a simple/common idea?

The C Standard tries to steer clear of describing how
multiple programs can run concurrently in the same environment,
of how they can synchronize their operations, and of what
effects one program can have on another. This is a Good Thing,
because a programming language standard that tried to address
such matters would limit its applicability rather severely.
If the Standard adopted in 1989 had embodied the Windows 3.1
process model, for example, how useful would that Standard be
for Unix, VMS, OS/400, MVS -- or even current Windows versions?

In short, the idea is less "simple" than it may appear to
someone who's struggling with one particular issue. (And in
my three decades of writing C I can't recall needing to do
what you describe, so "common" may be hard to defend, too.)

.... and in a follow-up:
I mean performance overkill specifically. Because our program needs to
do these kinds of reads very quickly and opening/closing/seeking and
THEN reading seems like a ton more work than it should be.

When you measured the speed of fclose()/fopen()/fseek(),
how slow did you find it to be? ("It is a capital offense
to theorize before one has data." -- S. Holmes)
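
One rough way to get that data is a small timing loop like the sketch below. The function name time_reopen_cycles and its parameters are made up for illustration. It uses clock(), which reports processor time rather than wall-clock time, so treat the result as a first estimate only.

```c
#include <stdio.h>
#include <time.h>

/*
 * Time `iterations` rounds of fopen/fseek/one-byte read/fclose on `path`,
 * seeking to byte offset `pos` each time. Returns the processor time
 * used, in seconds, or -1.0 if any step fails.
 */
double time_reopen_cycles(const char *path, long pos, int iterations)
{
    clock_t start = clock();

    for (int i = 0; i < iterations; ++i) {
        FILE *fp = fopen(path, "rb");
        if (fp == NULL)
            return -1.0;
        if (fseek(fp, pos, SEEK_SET) != 0 || fgetc(fp) == EOF) {
            fclose(fp);
            return -1.0;
        }
        if (fclose(fp) != 0)
            return -1.0;
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Dividing the result by the iteration count gives an approximate per-cycle cost to compare against the program's actual read rate.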
 
Chris Croughton

Example, I open a read only file using fopen(). I periodically know
from external methods, that the underlying file has been updated.
Before I do my next read on the stream, I want to flush the read
buffers and force it to read from disk (or kernel cache).

fflush seems to work on Linux, but not sure it is standard.

It isn't, fflush is only specified on output.

I don't see any function to do this, but it seems very needed.

fseek to the end? Or, if you want to read the newly added data, fseek
to the last read position (possibly with an fseek to the start first).

I can also trigger a flush of the buffers by fseeking to a much
different area of the file. Does any fseek, even offsetting by 1 index,
always force a flushing of the buffers?

It's implementation dependent, the C standard doesn't say (or know)
anything about underlying buffers. At least one implementation I've
used would treat a backwards seek within a buffer as just adjusting a
pointer. The only thing which is guaranteed to work is closing the file
and opening it again (and seeking to the last read position in your
case).

However, I'd guess that doing fseek to the end of the file should always
cause any /new/ data to be read in (not necessarily anything changed in
data already read), so if the external changes are only appending to the
file I'd guess that it should work. But really it's an implementation
question, I don't think there's any portable solution apart from closing
and opening the file each time.

Chris C
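
For reference, the seek-away-and-back idea could be sketched as follows. The helper name double_seek is hypothetical. As said above, whether this actually discards the buffered data is implementation dependent, so it is not a portable substitute for closing and reopening the file.

```c
#include <stdio.h>

/*
 * Seek to the end of the stream and then back to byte offset `pos`.
 * Some implementations drop their read buffer when the position moves
 * far enough; the C standard does not promise that.
 * Returns 0 on success, -1 if either seek fails.
 */
int double_seek(FILE *fp, long pos)
{
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    if (fseek(fp, pos, SEEK_SET) != 0)
        return -1;
    return 0;
}
```

The only thing this guarantees is the final file position; any re-reading from disk has to be verified on each target platform.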
 
Mark McIntyre

I mean performance overkill specifically. Because our program needs to
do these kinds of reads very quickly and opening/closing/seeking and
THEN reading seems like a ton more work than it should be.

Which is why specific implementations may or may not offer a facility to do
this more efficiently, but in a nonstandard way. You really need to ask
experts in your platform, not generalists in C.
 
jschultz

I was just asking if there was a standard way to force a C input stream
to forget what's in its buffers without destroying the stream itself.
Apparently there isn't. That seems like a rather silly hole to me as
it would be VERY simple for a stream to forget what it has already read
in and then re-read on the next read call.
 
Chris Torek

Ok, so it seems the only portable and guaranteed way to force a C input
stream to re-read from disk is to close/re-open/re-seek the file?

Pretty much, yes.

Doesn't that seem like a lot of overkill for a simple/common idea?

It is actually quite uncommon, and sometimes not all that simple.
In particular, you first have to discover that the file has changed,
and that will take code that lies outside the purview of the C
standards (at which point you can perhaps abandon stdio entirely
and simply use open/read/lseek/close, if those are the underlying
primitives).

I included an fpurge() function in the 4.xBSD stdio, but it does
not seem to have caught on. The specification for fpurge() is,
roughly speaking, "forget all buffered data recorded in this stdio
`FILE *' thing". An fpurge() followed by an fseek() to the desired
location would do the trick (although I am not 100% sure that my
seek-optimization code would actually work right with the purge in
this case :) ).

Note that the 4.xBSD fpurge() also works on output files, discarding
as-yet-unwritten-to-the-underlying-system data.
 
Randy Howard

From C99:

The setvbuf function may be used only after the stream
pointed to by stream has been associated with an open file
and before any other operation (other than an unsuccessful
                                ^^^^^^^^^^^^^^^^^^^^^^^^^^
call to setvbuf) is performed on the stream.
^^^^

Thus, calling setvbuf() twice is always undefined, regardless of
when you do it.

Unless it fails the first time and you retry? I don't see this
as very likely, but ...
 
Chris Croughton

I was just asking if there was a standard way to force a C input stream
to forget what's in its buffers without destroying the stream itself.
Apparently there isn't. That seems like a rather silly hole to me as
it would be VERY simple for a stream to forget what it has already read
in and then re-read on the next read call.

Not if the cards have already fallen into the hopper the first time they
were read! Not everything is a disk file...

Chris C
 
Eric Sosman

jschultz said:
I was just asking if there was a standard way to force a C input stream
to forget what's in its buffers without destroying the stream itself.
Apparently there isn't. That seems like a rather silly hole to me as
it would be VERY simple for a stream to forget what it has already read
in and then re-read on the next read call.

"Very simple?" Let's see you do it when reading
from the keyboard. Or from a socket, or from a sensor
connected to a serial port, or from /dev/random, or ...

You've been given a solution that works for streams
attached to reopenable data sources (seekability is helpful,
but not essential). Use it, and stop whining!
 
jschultz

I imagine on a non-random seek stream (socket, serial port, terminal,
/dev/random, etc.) that such a function would discard whatever it has
already read in -- say, for example, if you are using line buffering
instead of block buffering this could be useful.

To make file streams consistent with "real" streams, the file position
should probably be advanced by how ever many bytes were discarded and
on such streams could be ascertained with a call to ftell or fgetpos.

Anyway, immature and snide remarks aside, it would be very easy to do this
and would prevent surprising results from pedagogical code like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define RECORD_SIZE 100

int main(int argc, char **argv)
{
    FILE * writer;
    FILE * reader;
    char write_buf[RECORD_SIZE];
    char read_buf[RECORD_SIZE];
    int i;

    srand(time(NULL));

    if (argc != 2) {
        exit(fprintf(stderr, "Usage: %s <file_out>\r\n", argv[0]));
    }

    /* two independent streams on the same file */
    if ((writer = fopen(argv[1], "r+")) == NULL) {
        exit(fprintf(stderr, "fopen '%s' w/ r+ failed!\r\n", argv[1]));
    }

    if ((reader = fopen(argv[1], "r")) == NULL) {
        exit(fprintf(stderr, "fopen '%s' w/ r failed!\r\n", argv[1]));
    }

    while (1) {

        /* fill one record with random bytes */
        for (i = 0; i < RECORD_SIZE; ++i) {
            write_buf[i] = (char) rand();
        }

        if (fwrite(write_buf, RECORD_SIZE, 1, writer) != 1 ||
            fflush(writer) != 0) {
            exit(fprintf(stderr, "fwrite/fflush failed!\r\n"));
        }

        /* for random seek media -- insert fpurge/fseek call here */

        if (fread(read_buf, RECORD_SIZE, 1, reader) != 1) {
            exit(fprintf(stderr, "fread failed!\r\n"));
        }

        /* the reader should see exactly what was just written */
        if (memcmp(read_buf, write_buf, RECORD_SIZE) != 0) {
            abort();
        }
    }

    return 0;
}

The output file must exist before running. If the output file is <=
RECORD_SIZE bytes long then the program will run. Otherwise it will
immediately fail due to cache incoherence caused by the C library's
read-ahead caching of "junk" data from the file's previous contents.

That would be fine, IF the C standard offered a way around this
problem without the expensive operation of tearing down and rebuilding
the stream. The fact that it doesn't is surprising to me.
 
