Segfaulting when trying to create custom 'fgetline' function

J

Jeff Rodriguez

Main just loops over this while it's not null. The segfault occurs at
this line:
*line[ iteration ] = (char)ch;

Also, please don't just fix the code. I would like to know why exactly
this isn't working so I can avoid problems with it in the future.

If there are any references I should check out let me know.

Full highlighted code at:
http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
Line 62 is where segfault occurs.

This only happens if I use 3 characters. Using 4+ characters results in
a bus error.

GDB Output:
(gdb) run
Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
a
as
asd
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x804860f in fgetline ()
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) run
Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
asdf
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGBUS, Bus error.
0x804860f in fgetline ()
(gdb)



-- SNIP --
/* {{{ Code Fold: fgetline()
*/
unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 1024; /* Base buffer size
*/
unsigned int iteration = 0; /* Number of characters
read */
char *tmp = NULL; /* Temp ptr to catch
realloc() */
int ch; /* Currently read character
*/

/* {{{ Code Fold: Loop until we reach EOF or a newline
*/
while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
{
/* If there is no remainder for iteration / BUFFER_SIZE we've
reached */
/* a multiple of BUFFER_SIZE and it's time to reallocate the
buffer */
if ( !(iteration % BUFFER_SIZE) )
{
/* {{{ Code Fold: (Re)allocte memory
*/
tmp = realloc(
(*line),
(iteration / BUFFER_SIZE + 1)
* BUFFER_SIZE
* sizeof(char)
);
if ( tmp == NULL )
{
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}

/* If the address of our string changed, give *line that
address */

*line = tmp;

/* }}} Code Fold: (Re)allocte memory
*/
}

*line[ iteration ] = (char)ch;
iteration++;
}
/* }}} Code Fold: Loop until we reach EOF or a newline
*/

// Null terminate the array to make it a C string
line[ iteration + 1 ] = '\0';
if ( iteration == 0 )
{
return NULL;
}
else
{
return iteration;
}
}
/* }}} Code Fold: fgetline()
*/
-- SNIP --
 
A

Al Bowers

Jeff said:
Main just loops over this while it's not null. The segfault occurs at
this line:
*line[ iteration ] = (char)ch;

Also, please don't just fix the code. I would like to know why exactly
this isn't working so I can avoid problems with it in the future.

There are serveral errors.
If there are any references I should check out let me know.

Full highlighted code at:
http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
Line 62 is where segfault occurs.

This only happens if I use 3 characters. Using 4+ characters results in
a bus error.

GDB Output:
(gdb) run
Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
a
as
asd
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x804860f in fgetline ()
(gdb) kill
Kill the program being debugged? (y or n) y
(gdb) run
Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
asdf
(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGBUS, Bus error.
0x804860f in fgetline ()
(gdb)



-- SNIP --
/* {{{ Code Fold: fgetline() */
unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
unsigned int iteration = 0; /* Number of characters
read */
char *tmp = NULL; /* Temp ptr to catch
realloc() */
int ch; /* Currently read character
*/

/* {{{ Code Fold: Loop until we reach EOF or a newline */
while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
{
/* If there is no remainder for iteration / BUFFER_SIZE we've
reached */
/* a multiple of BUFFER_SIZE and it's time to reallocate the
buffer */
if ( !(iteration % BUFFER_SIZE) )
{
/* {{{ Code Fold: (Re)allocte memory */
tmp = realloc(
(*line),
(iteration / BUFFER_SIZE + 1)
* BUFFER_SIZE
* sizeof(char)
);

This did not cause the seg. fault but there is a bug here.
If you enter in BUFFER_SIZE characters and the the newline
character, you have allocated BUFFER_SIZE space but later
you try to write to BUFFER_SIZE+1 with the statement that
adds the '\0' character.

Change this to:
tmp = reaaloc( *line, iteration + BUFFER_SIZE +1);
if ( tmp == NULL )
{
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}

/* If the address of our string changed, give *line that
address */

*line = tmp;

/* }}} Code Fold: (Re)allocte memory */
}

*line[ iteration ] = (char)ch;

This is probably a cause of the seg. fault. Review the precedence
rules and you will see that you need to force the * operator.
(*line)[iteration] = ch;
iteration++;
}
/* }}} Code Fold: Loop until we reach EOF or a newline */

// Null terminate the array to make it a C string
line[ iteration + 1 ] = '\0';

(*line)[iteration] = '\0';

But still there is a problem. If you just type in the newline
character, iteration stays 0, no allocations are made and still
you attempt to assign the '\0' character.
if ( iteration == 0 )
{
return NULL;

You designed the function to return type unsigned. Why to you
return NULL?
}
else
{
return iteration;
}
}
/* }}} Code Fold: fgetline() */
-- SNIP --

With these corrects, I believe the function will work.

#include <stdlib.h>
#include <stdio.h>

unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 3;
unsigned int iteration = 0;
char *tmp = NULL;
int ch;

while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
{
if ( !(iteration % BUFFER_SIZE) )
{
tmp = realloc(*line,iteration+BUFFER_SIZE+1);
if ( tmp == NULL )
{
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}
*line = tmp;
}
(*line)[ iteration ] = (char)ch;
iteration++;
}
if ( iteration == 0 )
{
return 0;
}
else
{
(*line)[ iteration] = '\0';
return iteration;
}
}

int main (void)
{
while ( 1 )
{
char *line = NULL;
if ( fgetline(stdin, &line) )
{
printf("%s\n", line);
free(line);
line = NULL;
}
else
{
break;
}
}
return EXIT_SUCCESS;
}
 
S

Sheldon Simms

Also, please don't just fix the code. I would like to know why exactly
this isn't working so I can avoid problems with it in the future.

Ok. It's not working because you aren't paying attention to the
precedence of operators

....

Sorry, I'm going to have to fix the code. Ignore the rest
if you don't want to see your error.
/* {{{ Code Fold: fgetline() */
unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
unsigned int iteration = 0; /* Number of characters ... */
char *tmp = NULL; /* Temp ptr to catch ... */
int ch; /* Currently read character */

/* {{{ Code Fold: Loop until we reach EOF or a newline */
while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
{
/* If there is no remainder for iteration / BUFFER_SIZE ... */
/* a multiple of BUFFER_SIZE and it's time to reallocate the ... */

This also allocates a buffer the first time through the loop. I'm sure
you are aware of that, but your comment doesn't mention it.
if ( !(iteration % BUFFER_SIZE) )
{
/* {{{ Code Fold: (Re)allocte memory */
tmp = realloc(
(*line),
(iteration / BUFFER_SIZE + 1)
* BUFFER_SIZE
* sizeof(char)
);

sizeof(char) == 1 by definition, there's no need to use it here. If
you think you're preparing for some future change in type of the
buffers you are allocating then do it like this:

tmp = realloc((*line),
(iteration / BUFFER_SIZE + 1)
* BUFFER_SIZE * sizeof *tmp);
if ( tmp == NULL )
{
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}

/* If the address of our string changed, give *line that address */

*line = tmp;

/* }}} Code Fold: (Re)allocte memory */
}

*line[ iteration ] = (char)ch;

Here is the problem. This means

*(line[iteration]) = (char)ch;

So you are treating line as an array of pointers, selecting the
"iteration'th" pointer from that array and storing the char where
ever it points. What you meant to do was

(*line)[iteration] = (char)ch;

But you could have avoided the problem in the first place by
simply using the /other/ pointer to your buffer:

tmp[iteration] = (char)ch;
iteration++;
}
/* }}} Code Fold: Loop until we reach EOF or a newline */

/* Null terminate the array to make it a C string */
line[ iteration + 1 ] = '\0';
if ( iteration == 0 )
{
return NULL;

Why are you returning NULL from a function that is declared to
return unsigned int?
 
F

Floyd Davidson

Jeff Rodriguez said:
Main just loops over this while it's not null. The segfault
occurs at this line:
*line[ iteration ] = (char)ch;

Also, please don't just fix the code. I would like to know why
exactly this isn't working so I can avoid problems with it in
the future.

....

Your code is formatted in a particularly ugly style which makes
it very hard to read (and it gets worse when you post it with
software that folds long lines!). Plus it is impossible, from
the example given, to tell just how you are calling this
function. (That all sounds like a bunch of nitpicking, but it
really does make debugging easier when you can *see* what the
code is doing, in blocks the size of your monitor's screen.)
/* {{{ Code Fold: fgetline() */

Using a different indenting style would eliminate the need for
the "Code Fold" business...
unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 1024; /* Base buffer size
*/

While 1024 is probably a reasonable size for the ultimate
implementation, you might want to make that something fairly
small, say 10, for testing. That way you can easily check
the boundaries.
unsigned int iteration = 0; /* Number of
characters read */
char *tmp = NULL; /* Temp ptr to
catch realloc() */
int ch; /* Currently read
character */

/* {{{ Code Fold: Loop until we reach EOF or a newline */
while ( (ch = fgetc(fp)) != '\n' && !feof(fp))

Looping on feof() is not good. First, your loop will never
notice a read error, but also the feof() function doesn't
indicate that you are at the end of a file, it indicates
that you've tried to read past the end of a file, which comes
one iteration after you've reached the end.

while ((ch = fgetc(fp)) != '\n' && ch != EOF) { ... }

This will catch both an end of file condition and a read error.
If you want to know why the loop exited, test for ch == '\n',
feof(), and ferror() following the loop.
{
/* If there is no remainder for iteration / BUFFER_SIZE
we've reached */
/* a multiple of BUFFER_SIZE and it's time to reallocate
the buffer */
if ( !(iteration % BUFFER_SIZE) )
{
/* {{{ Code Fold: (Re)allocte memory */
tmp = realloc(
(*line),
(iteration / BUFFER_SIZE + 1)
* BUFFER_SIZE
* sizeof(char)

There is no point in multiplying by "sizeof char", because that
is guaranteed by the C Standard to be 1.
if ( tmp == NULL )
{
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}

/* If the address of our string changed, give *line
that address */

*line = tmp;

/* }}} Code Fold: (Re)allocte memory */
}

*line[ iteration ] = (char)ch;

The cast is unnecessary, but this isn't doing what you think it
is. You want to get the iteration element of the dereferenced
pointer from line, but instead you are getting the iteration
element of line and dereferencing that... which is why it
segfaults.

(*line)[iteration] = ch;

Note that that might be cleaner to just use the tmp variable,

tmp[iteration] = ch;
iteration++;

Note that you have just incremented the interation count to the
next uninitialized element in the allocated space...
}
/* }}} Code Fold: Loop until we reach EOF or a newline */

// Null terminate the array to make it a C string
line[ iteration + 1 ] = '\0';

.... and here you skip past that last unallocated element in your
memory space, and put a null character one past that.

And you've make a typo that changes what it means too!

(*line)[iteration] = '\0';

Again, the tmp variable is easier to see,

tmp[iteration] = '\0';
if ( iteration == 0 )
{
return NULL;

NULL is a pointer, this function is defined to return a type
unsigned int. Change that to 0 and enable enough compiler
warnings to let it tell you about such problems.

However, since iteration is 0, why not just eliminate the
entire if-else, and just return iteration.
}
else
{
return iteration;
}
}
/* }}} Code Fold: fgetline() */
-- SNIP --

Here is your code, reformatted (for example, with shorter
indents and with lines kept to less that 64 columns) with
corrections and with a main() so that it can be compiled.

#include <stdio.h>
#include <stdlib.h>

unsigned int fgetline(FILE *, char **);

int main(void)
{
char *txt;
int count;

while (1) {
txt = NULL;
count = fgetline(stdin, &txt);
printf("txt [%u]: <%s>\n", count, txt);
free(txt);
}

return EXIT_SUCCESS;
}

unsigned int
fgetline( FILE *fp, char **line )
{
int ch; /* current char read */
char *tmp = NULL; /* realloc() tmp ptr */
unsigned int iteration = 0; /* char read count */
const unsigned int BUFFER_SIZE = 10; /* Base buffer size */

/* Loop until we reach EOF or a newline */
while ( (ch = fgetc(fp)) != '\n' && !feof(fp)) {
/*
* If there is no remainder for iteration / BUFFER_SIZE
* we've reached a multiple of BUFFER_SIZE and it's time
* to reallocate the buffer
*/
if ( !(iteration % BUFFER_SIZE) ) {
/* realloc the input buffer */
tmp = realloc(
*line,
(iteration / BUFFER_SIZE + 1) * BUFFER_SIZE);

if (tmp == NULL) {
puts("Memory (re)allocation failed\n");
exit(EXIT_FAILURE);
}

*line = tmp; /* new buffer address */
}
tmp[iteration] = ch;
iteration++;
}
tmp[iteration] = '\0'; /* make it a string */
return iteration;
}
 
J

Jeff Rodriguez

Jeff Rodriguez wrote:

Alright, I've taken everyone's comments into consideration and modified
the code. I kept parts of the code folding, but the text " Code Fold"
has been removed to hopefully improve readability.

Now I'm basically submitting for review, and the use of anyone who finds it
useful. It seems relatively quick:

-- TEST --
jeff@andromeda% gcc -w -g -O fgetline.c ; time ./a.out < testfile
1000
1024
1025
451365
0.268u 0.061s 0:00.33 96.9% 23+1005k 0+0io 0pf+0w
-- TEST --

0.33 seconds for a 443k file is pretty good right?

HTMLized at: http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
-- SNIP --
/* {{{: Documentation */
/*============================================================================*/
/* FGETLINE UTILITY FUNCTION */
/*----------------------------------------------------------------------------*/
/* Function: unsigned int fgetline( FILE *fp, char **line ) */
/* */
/* Operations: Read a line delimited by '\n' from open file at fp */
/* */
/* Returns: Number of characters read, including trailing newline character */
/* -1 if 0 characters are read, or if EOF is found */
/* */
/* */
/* */
/* History: */
/* Date V.M Description Who */
/* -------- ---- -------------------------------------------------- --- */
/* 03-11-15 1.00 Initial implementation JAR */
/* */
/*----------------------------------------------------------------------------*/
/* Copyright (C) 2003 Jeff A. Rodriguez - All rights reserved. */
/*============================================================================*/
/* }}}: Documentation */
/* {{{: Includes */
#include <stdlib.h> /* malloc(), realloc() */
#include <stdio.h> /* fgetc() */
/* }}}: Includes */
/* {{{: fgetline() */
unsigned int fgetline( FILE *fp, char **line )
{
const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
unsigned int iteration = 0; /* Number of characters read */
char *tmp = NULL; /* Temp ptr to catch realloc() */
int ch; /* Currently read character */

/* {{{: Loop until we reach EOF or a newline */
while ((ch = fgetc(fp)) != '\n' && ch != EOF)
{
/* If there is no remainder for iteration / BUFFER_SIZE we've reached
a multiple of BUFFER_SIZE and it's time to reallocate the buffer */
if ( !(iteration % BUFFER_SIZE) )
{
/* {{{: (Re)allocate memory */
/* realloc() acts like malloc() here the first time through. Also
we make sure there's enough room for the \0 character at the end
to make this a string */
tmp = realloc(*line, (iteration / BUFFER_SIZE + 1) * BUFFER_SIZE);
if ( tmp == NULL )
{
puts("Memory (re)allocation failed");
exit(EXIT_FAILURE);
}
*line = tmp; // Make sure the address of *line is correct
/* }}}: (Re)allocate memory */
}

// Up the iteration count
(*line)[ iteration ] = ch; // Add character to the array
iteration++; // Up the iteration count
}
/* }}}: Loop until we reach EOF or a newline */
if ( iteration == 0 )
{
return -1;
}
else
{
(*line)[ iteration ] = '\0'; // Make the array a string
return iteration;
}
}
/* }}}: fgetline() */
/* {{{: main() */
int main (void)
{
while ( 1 )
{
char *line = NULL;
unsigned int character_count;
if ( (character_count = fgetline(stdin, &line)) != -1 )
{
printf("%u\n", character_count);
}
else
{
break;
}
}
return EXIT_SUCCESS;
}
/* }}}: main() */
-- SNIP --
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,567
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top