Function to determine the number of chars in a FILE buffer

tm

Is there a portable solution to find out how many chars are
buffered in FILE *aFile? This interests me for a file in read
mode and when the file is a FIFO (opened with popen()).

The background is:
I want to write the function inputReady(FILE *aFile) which should
return 1 when reading from aFile will not block. Assume that aFile
is opened with popen("someProgram", "r"). This way it is a FIFO
file. Depending on someProgram reading from aFile may block or may
succeed immediately. This should be determined with inputReady().
Currently I use the following definition of inputReady:

int inputReady (FILE *aFile)
{
  int file_no;
  int nfds;
  fd_set readfds;
  struct timeval timeout;
  int select_result;
  int result;

  /* inputReady */
  file_no = fileno(aFile);
  if (file_no != -1) {
    FD_ZERO(&readfds);
    FD_SET(file_no, &readfds);
    nfds = (int) file_no + 1;
    timeout.tv_sec = 0;
    timeout.tv_usec = 0;
    /* printf("select(%d, %d)\n", nfds, file_no); */
    select_result = select(nfds, &readfds, NULL, NULL, &timeout);
    /* printf("select_result: %d\n", select_result); */
    if (unlikely(select_result < 0)) {
      raise_error(FILE_ERROR);
      result = 0;
    } else {
      result = FD_ISSET(file_no, &readfds);
    } /* if */
  } else {
    raise_error(FILE_ERROR);
    result = 0;
  } /* if */
  /* printf("filInputReady --> %d\n", result); */
  return result;
} /* inputReady */

This function does not work correctly: It may return 0 although
a getc() would not block. This happens because of the characters
buffered by FILE *aFile. Since all the logic should be inside of
inputReady() using

setvbuf(aFile, NULL, _IONBF, 0)

to switch buffering off is not an option. Instead I would like
to use something like fbufsize(aFile) which should tell me how many
chars are buffered in aFile. On my Linux system the include
file "libio.h" contains the macro:

#define _IO_getc_unlocked(_fp) \
(_IO_BE ((_fp)->_IO_read_ptr >= (_fp)->_IO_read_end, 0) \
? __uflow (_fp) : *(unsigned char *) (_fp)->_IO_read_ptr++)

From this definition I can see that fbufsize(aFile) could be
defined as:

#define fbufsize(_fp) ((_fp)->_IO_read_end - (_fp)->_IO_read_ptr)

The problem is: This is highly unportable.

My question is: Is there a portable solution for my problem?

Many thanks in advance.


Greetings Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
 
James Kuyper

Is there a portable solution to find out how many chars are
buffered in FILE *aFile?

There's no portable way to find out how many chars; only a portable way
to set the number of chars: setvbuf(), but you're already aware of that
option.
... This interests me for a file in read
mode and when the file is a FIFO (opened with popen()).

A fully portable C program cannot call popen() and has no way to be sure
whether or not a stream that it opens is a FIFO. You'll get better
answers to your questions in a newsgroup associated with the library
that provides popen() (and select() and fileno(), and all of the other
implementation-specific features referred to in your program).
 
Ike Naar

int inputReady (FILE *aFile)
{
  int file_no;
  int nfds;
  fd_set readfds;
  struct timeval timeout;
  int select_result;
  int result;

  /* inputReady */
  file_no = fileno(aFile);
  if (file_no != -1) {
    FD_ZERO(&readfds);
    FD_SET(file_no, &readfds);
    nfds = (int) file_no + 1;
    timeout.tv_sec = 0;
    timeout.tv_usec = 0;
    /* printf("select(%d, %d)\n", nfds, file_no); */
    select_result = select(nfds, &readfds, NULL, NULL, &timeout);
    /* printf("select_result: %d\n", select_result); */
    if (unlikely(select_result < 0)) {
      raise_error(FILE_ERROR);
      result = 0;
    } else {
      result = FD_ISSET(file_no, &readfds);
    } /* if */
  } else {
    raise_error(FILE_ERROR);
    result = 0;
  } /* if */
  /* printf("filInputReady --> %d\n", result); */
  return result;
} /* inputReady */

This function does not work correctly: It may return 0 although
a getc() would not block.

Input may have arrived between the moment you called select()
and the moment you call getc().
 
Angel

On 08/31/2011 06:43 PM, tm wrote:

A fully portable C program cannot call popen() and has no way to be sure
whether or not a stream that it opens is a FIFO. You'll get better
answers to your questions in a newsgroup associated with the library
that provides popen() (and select() and fileno(), and all of the other
implementation-specific features referred to in your program).

Those are POSIX functions, I would guess he is working on some brand of
Unix and would thus be better off in comp.unix.programmer.
 
tm

There's no portable way to find out how many chars; only a portable way
to set the number of chars: setvbuf(), ...

I am not interested in the total size of the buffer. Instead I am
interested in the actual number of chars in the buffer. The actual
number of chars in a buffer of size 1024 can be any integer
between 0 and 1024. As I already said: This interests me for a
file in read mode and when the file is a FIFO (e.g. opened with
popen() or other code which creates a FILE * which is a FIFO).

E.g.: When my proposed function fbufsize(aFile) returns 5 reading
5 characters from aFile would just deliver the characters from the
buffer. So reading 5 characters would not block. Reading the 6th
character would request a new char from the underlying file
descriptor, so it could possibly block (when the file is a FIFO).

My proposed function inputReady() could contain something like:

if (fbufsize(aFile) >= 1) {
  return 1;
} else {
  /* Use code with fileno() and select() (Linux), respectively
     fileno(), fstat(), _get_osfhandle() and PeekNamedPipe()
     (Windows) to find out if reading from the underlying
     file descriptor would block. In this case return 0.
     Otherwise (reading 1 char will not block) return 1. */
}

Without fbufsize() even setting a file to no buffering with
setvbuf() will probably not lead to a solution: An ungetc() may
have happened and there is no way to find out if the next char
read with getc() will be the one provided with ungetc(). Since
a getc() after an ungetc() will not block inputReady() should
return 1 in this case.

A fully portable C program cannot call popen() and has no way to be sure
whether or not a stream that it opens is a FIFO.

I am already aware of that situation. :) At least Windows and Linux
(and probably also any other UNIX look-alikes) use FIFOs for files
opened with popen(). AFAIK DOS uses temporary files for popen() so my
function will not be useful for DOS. Btw.: popen() is just an
example. I am searching for a general solution for FIFOs.

You'll get better
answers to your questions in a newsgroup associated with the library
that provides popen() (and select() and fileno(), and all of the other
implementation-specific features referred to in your program).

Thank you for this suggestion. I added OS-specific newsgroups.

I already have a solution for the implementation specific
part of inputReady(). Under Linux (UNIX, etc.) I use fileno() and
select(). Under Windows I use fileno(), fstat(), _get_osfhandle()
and PeekNamedPipe(). The only thing is: Everything is hampered by
the buffering imposed by FILE * files. That's the reason I used
comp.lang.c for my question.

Many thanks in advance.


Greetings Thomas Mertes

 
tm

Those are POSIX functions, I would guess he is working on some brand of
Unix and would thus be better off in comp.unix.programmer.

I am writing code for Unix and Windows, but my question
is about a FILE *, which is standard C.

When you call getc(aFile) two things can happen:

1. The character is already in the FILE buffer.
In this case the char is returned from the FILE
buffer and the read position of the FILE buffer
is adjusted.
2. The FILE buffer is empty.
In this case a block of e.g. 1024 characters
is read from the underlying file descriptor
into the FILE buffer. Afterwards the first
character from the FILE buffer is returned.

What I need is:
Find out if the next getc() will use case 1 or case 2.

Historically getc() was defined as macro.

Essentially I need the condition used in the getc() macro.
I could take it from there, but I want to support
several compilers, libraries and operating systems.

That's the reason I ask for a portable solution, to
find out the number of chars currently in a FILE buffer.

Many thanks in advance.


Greetings Thomas Mertes

 
James Kuyper

I am writing code for Unix and Windows, but my question
is about a FILE *, which is standard C.

And the standard C answer is, it can't be done. The C standard library
deliberately hides such details from you. There may be Unix- or
Windows-specific solutions, but no portable C solution.

When you call getc(aFile) two things can happen:

1. The character is already in the FILE buffer.
In this case the char is returned from the FILE
buffer and the read position of the FILE buffer
is adjusted.
2. The FILE buffer is empty.
In this case a block of e.g. 1024 characters
is read from the underlying file descriptor
into the FILE buffer. Afterwards the first
character from the FILE buffer is returned.

What I need is:
Find out if the next getc() will use case 1 or case 2.

Historically getc() was defined as macro.

The fact that getc() is allowed to be a macro is the main purpose of
getc(); it's otherwise redundant with fgetc(). And it could easily be a
macro that expands into a call to fgetc().

Essentially I need the condition used in the getc() macro.

The problem is that there's no single condition used by that macro for
this purpose. The expansion of that macro is different on different
systems, so the solution to your problem will be different on different
systems.

I've looked at the details of how FILE is defined on various systems.
It's usually a structure containing one pointer to the buffer that it
maintains, and multiple pointers that point at various locations inside
that buffer. The names of those pointers generally make it clear how
they're used, and the quantity you want to calculate is the difference
of two of those pointers - but they'll have different names on different
systems, so you can't portably make use of that fact.

I could take it from there, but I want to support
several compilers, libraries and operating systems.

That's the reason I ask for a portable solution, to
find out the number of chars currently in a FILE buffer.

Then you're out of luck.
 
Jasen Betts

Without fbufsize() even setting a file to no buffering with
setvbuf() will probably not lead to a solution: An ungetc() may
have happened and there is no way to find out if the next char
read with getc() will be the one provided with ungetc(). Since
a getc() after an ungetc() will not block inputReady() should
return 1 in this case.

Basically, you can't do non-blocking I/O using the stdio library
in a way that's likely to continue to work.

Try something else (e.g. GLib):

http://developer.gnome.org/glib/stable/glib-IO-Channels.html
 
David Schwartz

I am already aware of that situation. :) At least Windows and Linux
(and probably also any other UNIX look-alikes) use FIFOs for files
opened with popen(). AFAIK DOS uses temporary files for popen() so my
function will not be useful for DOS. Btw.: popen() is just an
example. I am searching for a general solution for FIFOs.

That's impossible. This operation only makes sense if FIFOs and
buffers are implemented in a particular way, which no standard
requires. If you want a portable implementation, this is the best you
can do:

int inputReady(FILE *f) { return 0; }

If you can tell the difference between this function and your proposed
function, your code is not portable. (At least, I can't see any way a
portable program could tell the difference.)

For example, no rule prohibits an implementation from pushing a buffer
back to the FIFO and then allowing another listener connected to the
FIFO to consume it. You would need a rule prohibiting this behavior
for a portable implementation to be possible.

DS
 
tm

Basically, you can't do non-blocking I/O using the stdio library
in a way that's likely to continue to work.

Try something else (e.g. GLib):

http://developer.gnome.org/glib/stable/glib-IO-Channels.html

Thank you for this information.
The description of the IO Channels states that they were made to
integrate file descriptors, pipes, and sockets into the main event
loop. Since my runtime library (for Seed7) works without event loop
this might be a problem for me. The page states also that the IO
Channel support for Windows is only partially complete. This would
also be a problem for me, since I want to support Windows as well.

My runtime library (for Seed7) supports a 'file' interface together
with several implementations. One 'file' implementation is for
FILE * files and another is for sockets (there are other 'file'
implementations as well and user defined 'file' implementations
are also possible).

My intention was, to handle pipes as FILE * files. But this seems to
lead to problems, when I want to introduce the function 'inputReady'
(determine if reading is possible without blocking).

Maybe it is necessary to introduce a 'file' implementation for pipes
instead of using FILE * files for pipes.

My own pipe type would be based on file descriptors, respectively
file handles (under Windows). The question is:
Can and should this pipe type work buffered or unbuffered?

For one specific function (hasNext, which determines, if the next
character will be EOF) I need a lookahead of one character, so I
need at least a buffer of one character. A bigger buffer would save
system calls so it could increase performance. But I am not sure if
this is possible and if the effort pays off.
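
A minimal sketch of such a pipe type, assuming a POSIX system. Because
the buffer belongs to the reader itself, the number of buffered bytes is
always known, so the check from the pseudocode earlier in the thread
(look at the buffer first, then ask select()) becomes straightforward.
The names PipeReader, pipeInit, pipeBuffered, pipeInputReady and
pipeGetc are hypothetical and not part of Seed7 or any library. The
sketch relies on read() on a pipe returning whatever is currently
available instead of waiting for the full count.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical reader: a file descriptor plus a buffer we manage ourselves. */
typedef struct {
    int fd;
    unsigned char buf[1024];
    size_t pos;   /* index of the next unread byte */
    size_t end;   /* one past the last valid byte */
} PipeReader;

void pipeInit(PipeReader *r, int fd)
{
    assert(r != NULL);
    r->fd = fd;
    r->pos = 0;
    r->end = 0;
}

/* Bytes that can be delivered without touching the descriptor. */
size_t pipeBuffered(const PipeReader *r)
{
    return r->end - r->pos;
}

/* Returns 1 if pipeGetc() will not block, 0 if it would, -1 on error.
   End of file counts as "ready", since reading it does not block. */
int pipeInputReady(PipeReader *r)
{
    fd_set readfds;
    struct timeval timeout = {0, 0};   /* poll, do not wait */

    if (pipeBuffered(r) > 0) {
        return 1;                      /* served from our own buffer */
    }
    FD_ZERO(&readfds);
    FD_SET(r->fd, &readfds);
    if (select(r->fd + 1, &readfds, NULL, NULL, &timeout) < 0) {
        return -1;
    }
    return FD_ISSET(r->fd, &readfds) ? 1 : 0;
}

/* Returns the next byte, or -1 at end of file or on a read error.
   Blocks only when pipeInputReady() would have returned 0. */
int pipeGetc(PipeReader *r)
{
    if (r->pos == r->end) {
        ssize_t n = read(r->fd, r->buf, sizeof r->buf);
        if (n <= 0) {
            return -1;
        }
        r->pos = 0;
        r->end = (size_t) n;
    }
    return r->buf[r->pos++];
}
```

A caller would initialise the reader with the read end of a pipe and
call pipeInputReady() before each pipeGetc(). A one-character lookahead
for hasNext could be layered on top in the same way.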

Concerning this (writing my own pipe functions) I have questions:
Is it possible to peek from the file descriptor of a (Unix) pipe?
Is it possible to determine how many chars can be read, from the
file descriptor of a (Unix) pipe, without blocking?

Input buffering would need the number of chars available in a pipe.
Otherwise reading could block, when it is not expected.

AFAIK the Windows function PeekNamedPipe() can provide such an
information (number of chars available). But unless both Unix and
Windows can provide such information I will not implement a
pipe type with buffering.
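
For the Unix side, one commonly available but non-standard option is the
FIONREAD ioctl, which on Linux reports the number of bytes waiting in a
pipe. The sketch below assumes a platform where FIONREAD is declared by
<sys/ioctl.h> and works on pipes. On other systems the constant may live
in a different header or not be supported at all. The function names
pipeBytesAvailable and readAvailable are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns the number of bytes that can be read from fd without blocking,
   or -1 if the ioctl fails. Assumes FIONREAD is supported for this fd. */
int pipeBytesAvailable(int fd)
{
    int count = 0;

    assert(fd >= 0);
    if (ioctl(fd, FIONREAD, &count) < 0) {
        return -1;
    }
    return count;
}

/* Reads at most size bytes, but never more than are currently available,
   so the read() call itself does not block (assuming a single reader).
   Returns the number of bytes read, 0 if nothing is available right now,
   or -1 on error. Note that 0 here does not distinguish "empty" from EOF. */
ssize_t readAvailable(int fd, void *buf, size_t size)
{
    int avail = pipeBytesAvailable(fd);

    if (avail < 0) {
        return -1;
    }
    if (avail == 0) {
        return 0;
    }
    if ((size_t) avail < size) {
        size = (size_t) avail;
    }
    return read(fd, buf, size);
}
```

With something like this, the Unix code could mirror what
PeekNamedPipe() offers on Windows, at the cost of depending on an
interface that no C or POSIX standard guarantees.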

Many thanks in advance.


Greetings Thomas Mertes

 
Mark Storkamp

tm said:
Is there a portable solution to find out how many chars are
buffered in FILE *aFile? This interests me for a file in read
mode and when the file is a FIFO (opened with popen()).

The background is:
I want to write the function inputReady(FILE *aFile) which should
return 1 when reading from aFile will not block. Assume that aFile
is opened with popen("someProgram", "r"). This way it is a FIFO
file. Depending on someProgram reading from aFile may block or may
succeed immediately. This should be determined with inputReady().
Currently I use the following definition of inputReady:

It's been explained why this can't be done in standard portable C. But
if all you want is non-blocking I/O, you might be able to use a separate
thread just for input. Let the thread get blocked and have it set a
semaphore when it has data available to be read.
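
A rough sketch of that idea, assuming POSIX threads are available. It
uses a mutex and a condition variable in place of a semaphore, which is
a swapped-in detail and not what the post literally describes. The
thread holds at most one character of lookahead. InputWatcher,
watcherStart, watcherInputReady and watcherGetc are hypothetical names.
Once the watcher is started, the stream must only be read through it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper: a thread that blocks in getc() on behalf of the caller. */
typedef struct {
    FILE *stream;
    pthread_t thread;
    pthread_mutex_t lock;
    pthread_cond_t changed;   /* signalled whenever hasChar changes */
    int hasChar;              /* 1 when ch holds an unread result */
    int ch;                   /* a character, or EOF (which is sticky) */
} InputWatcher;

static void *watcherMain(void *arg)
{
    InputWatcher *w = arg;

    for (;;) {
        int c = getc(w->stream);          /* may block; only this thread waits */
        pthread_mutex_lock(&w->lock);
        w->ch = c;
        w->hasChar = 1;
        pthread_cond_broadcast(&w->changed);
        /* Wait until the consumer has taken the character. */
        while (c != EOF && w->hasChar) {
            pthread_cond_wait(&w->changed, &w->lock);
        }
        pthread_mutex_unlock(&w->lock);
        if (c == EOF) {
            return NULL;                  /* EOF stays available forever */
        }
    }
}

/* Returns 0 on success, otherwise the error code from pthread_create(). */
int watcherStart(InputWatcher *w, FILE *stream)
{
    assert(w != NULL && stream != NULL);
    w->stream = stream;
    w->hasChar = 0;
    w->ch = 0;
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->changed, NULL);
    return pthread_create(&w->thread, NULL, watcherMain, w);
}

/* Returns 1 if watcherGetc() will return immediately, 0 otherwise. */
int watcherInputReady(InputWatcher *w)
{
    int ready;

    pthread_mutex_lock(&w->lock);
    ready = w->hasChar;
    pthread_mutex_unlock(&w->lock);
    return ready;
}

/* Returns the next character or EOF; blocks only if none is ready yet. */
int watcherGetc(InputWatcher *w)
{
    int c;

    pthread_mutex_lock(&w->lock);
    while (!w->hasChar) {
        pthread_cond_wait(&w->changed, &w->lock);
    }
    c = w->ch;
    if (c != EOF) {
        w->hasChar = 0;                   /* let the thread fetch the next one */
        pthread_cond_broadcast(&w->changed);
    }
    pthread_mutex_unlock(&w->lock);
    return c;
}
```

The price of this design is one thread per stream and a lock round trip
per character. A larger hand-off buffer would reduce that overhead but
is left out to keep the sketch short.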
 
Keith Thompson

James Kuyper said:
On 09/01/2011 03:23 AM, tm wrote: [...]
Historically getc() was defined as macro.

The fact that getc() is allowed to be a macro is the main purpose of
getc(); it's otherwise redundant with fgetc(). And it could easily be a
macro that expands into a call to fgetc().
[...]

Any standard C library function can be defined as a macro.
What's special about getc() is that, if it's defined as a macro,
it's permitted to evaluate its argument (a FILE*) more than once.

In practice, it would rarely make sense to call fgetc() or getc()
with an argument such that evaluating it more than once would
cause problems; the special permission is necessary to allow a more
efficient implementation.
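
To illustrate what "evaluate its argument more than once" means, here is
a small sketch. SKETCH_GETC is a made-up macro and does not reproduce
any real library's getc(); it merely names its argument twice, the way a
macro implementation is permitted to. countedStream is a hypothetical
argument expression with a visible side effect.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical macro, NOT a real getc(): the argument fp appears twice,
   so an argument expression with side effects is evaluated twice. */
#define SKETCH_GETC(fp) (feof(fp) ? EOF : fgetc(fp))

static int evaluations = 0;   /* how often the argument expression ran */

/* An argument expression with a side effect: it counts its own calls. */
FILE *countedStream(FILE *f)
{
    assert(f != NULL);
    evaluations++;
    return f;
}

/* A true function call evaluates its argument exactly once. */
int readViaFunction(FILE *f)
{
    return fgetc(countedStream(f));
}

/* The macro expands to two uses of the argument, hence two evaluations. */
int readViaMacro(FILE *f)
{
    return SKETCH_GETC(countedStream(f));
}
```

Reading one character through readViaFunction() bumps the counter by
one, while readViaMacro() bumps it by two. With a plain FILE * variable
as the argument, as in almost all real code, the difference is
invisible.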
 
