Mark Hobley said:
I want to read a text file a line at a time from within a C program. Are there
some available functions or code already written that does this or do I need
to code from scratch?
If I am doing this from scratch, what is the best practice for allocating
a buffer size for the input line?
The simplest method is to start with a guess for the length of the
longest line and allocate that much. Then use fgets() to read in
a line and check whether it ends in a '\n'. If it does, everything
is fine; if it doesn't, the line was too long to fit into the buffer
you started off with. In that case you increase the size of the
buffer, e.g. by doubling it with realloc(), and read the rest of
the line by calling fgets() again (with the first argument pointing
at the position in the buffer where the previous call stopped, i.e.
at the terminating '\0'; keep in mind that realloc() may have moved
the buffer). Then repeat the test for the final '\n' and keep
enlarging the buffer as necessary. Unless you run out of memory you
end up with a buffer that contains the complete line.
The only special case you may have to consider is that the last
line of a file may not end with a '\n', in which case what fgets()
reads can't contain that character either. But the next attempt
to read, at the very end of the file, makes fgets() return NULL,
so it's possible to check for that condition.
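To make the three outcomes of a single fgets() call explicit, here is a minimal sketch. The function read_chunk() and the CHUNK_* codes are names made up for this illustration, they are not part of any library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical result codes, invented for this example. */
enum chunk_kind { CHUNK_EOF, CHUNK_LINE, CHUNK_PARTIAL, CHUNK_ERROR };

/* Reads at most size - 1 characters into buf and reports what was read. */
enum chunk_kind
read_chunk( FILE * fp,
            char * buf,
            int    size )
{
    size_t len;

    if ( ! fgets( buf, size, fp ) )    /* nothing was read at all */
        return ferror( fp ) ? CHUNK_ERROR : CHUNK_EOF;

    len = strlen( buf );
    if ( len > 0 && buf[ len - 1 ] == '\n' )
        return CHUNK_LINE;             /* complete line, '\n' included */

    /* No '\n': either the buffer was too small or this is an
       unterminated last line. The next call tells the two apart by
       delivering more data or by returning CHUNK_EOF. */

    return CHUNK_PARTIAL;
}
```

A CHUNK_PARTIAL result followed by CHUNK_EOF is exactly the "last line without a '\n'" case described above.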
I guess I could open the file, scan it once to determine the buffer size,
then rewind and start reading.
I guess reading the file twice just to find out the length of the
longest line is too much work.
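For comparison, the two-pass idea could look roughly like the sketch below. The helper longest_line() is a hypothetical name, and the sketch assumes the stream is seekable (a regular file, not a pipe or a terminal), since otherwise rewind() can't bring the data back.

```c
#include <stdio.h>

/* Hypothetical helper: returns the length of the longest line in fp,
   counting the '\n' if there is one, and rewinds the stream.
   Returns 0 for an empty file. */
size_t
longest_line( FILE * fp )
{
    size_t longest = 0;
    size_t cur     = 0;
    int    c;

    while ( ( c = getc( fp ) ) != EOF )
    {
        cur++;
        if ( c == '\n' )
        {
            if ( cur > longest )
                longest = cur;
            cur = 0;
        }
    }

    if ( cur > longest )               /* last line without a '\n' */
        longest = cur;

    rewind( fp );                      /* only works on seekable streams */
    return longest;
}
```

A buffer of longest_line( fp ) + 1 bytes is then large enough for fgets() to read every line in one call. The price is that the whole file is read twice, and the result is only valid as long as the file doesn't change between the two passes.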
Has this already been done or do I need to code this from
scratch?
Probably everyone faced with the problem of reading lines of
arbitrary length has written such a function at least once ;-)
Here's something I found looking through my files (although with
quite a number of changes to the original, so be wary, I may have
broken it!):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LEN_GUESS 128

int
read_line( FILE  * fp,
           char ** line )
{
    static char   *buf     = NULL;
    static size_t  buf_len = LEN_GUESS;
    char          *p       = buf;
    size_t         rem_len = buf_len;

    if ( ! fp || ! line )
        return -1;                 /* bad argument(s) */

    if ( ! buf && ! ( buf = p = malloc( buf_len ) ) )
        return -1;                 /* running out of memory */

    *buf = '\0';                   /* an immediate EOF yields "" */

    while ( 1 )
    {
        size_t  len;
        size_t  used;
        char   *tmp;

        if ( ! fgets( p, ( int ) rem_len, fp ) )
        {
            if ( ferror( fp ) )
                return -1;         /* read failure */
            break;                 /* EOF, keep what was read so far */
        }

        len = strlen( p );
        if ( len > 0 && p[ len - 1 ] == '\n')
            break;                 /* got a complete line */

        /* The line didn't fit, so double the buffer. realloc() may
           move the block, so 'p' must be recomputed from the new
           address instead of just being advanced. */

        used = ( size_t ) ( p - buf ) + len;
        if ( ! ( tmp = realloc( buf, 2 * buf_len ) ) )
            return -1;             /* running out of memory */

        buf      = tmp;
        buf_len *= 2;
        p        = buf + used;     /* points at the terminating '\0' */
        rem_len  = buf_len - used;
    }

    *line = buf;
    return feof( fp ) ? 1 : 0;     /* indicate if EOF has been reached */
}
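Note that it is, of course, not thread-safe, since it keeps its
buffer in static variables. And when you call it again the last
line returned will be overwritten, so copy it first if you still
need it. When you don't need to call the function anymore you
should free() the returned pointer; after that the function must
not be called again, because its static pointer still refers to
the freed memory.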
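One possible way around these restrictions is to let the caller own the buffer. The following is only a sketch under that assumption: struct line_buf and read_line_r() are hypothetical names, the initial size of 128 is as arbitrary as LEN_GUESS, and the return values differ from read_line() (1 for a line, 0 for end of file, -1 for an error).

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical caller-owned line buffer; initialise with { NULL, 0 }. */
struct line_buf
{
    char   *data;
    size_t  size;
};

/* Reads one line into lb->data, growing the buffer as needed.
   Returns 1 if a line (possibly an unterminated last one) was read,
   0 on end of file with nothing read, and -1 on error. */
int
read_line_r( FILE            * fp,
             struct line_buf * lb )
{
    size_t used = 0;

    if ( ! fp || ! lb )
        return -1;                     /* bad argument(s) */

    if ( ! lb->data )
    {
        if ( ! ( lb->data = malloc( 128 ) ) )
            return -1;                 /* running out of memory */
        lb->size = 128;
    }

    lb->data[ 0 ] = '\0';

    while ( fgets( lb->data + used, ( int ) ( lb->size - used ), fp ) )
    {
        char *tmp;

        used += strlen( lb->data + used );
        if ( used > 0 && lb->data[ used - 1 ] == '\n' )
            return 1;                  /* complete line */

        /* On failure the old block stays valid and can still be freed */

        if ( ! ( tmp = realloc( lb->data, 2 * lb->size ) ) )
            return -1;
        lb->data  = tmp;
        lb->size *= 2;
    }

    if ( ferror( fp ) )
        return -1;                     /* read failure */
    return used > 0 ? 1 : 0;           /* unterminated last line or EOF */
}
```

A caller would start with struct line_buf lb = { NULL, 0 }, call read_line_r( fp, &lb ) in a loop while it returns 1, and free( lb.data ) at the end. Since all state lives in the struct, different threads or different files can each use their own buffer.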
(My project is open source, so I can utilize GPL licensed code, if
necessary.) C89 compatible code is preferred.
Use it for whatever you want if it fits your needs (but better
check carefully that it works, it's not my tested version, I
just checked that it compiles!). And, of course, there are quite
a number of ways it could be improved; it's meant more to give
you a better idea of how it could be done.
Regards, Jens