Segfaulting when trying to create custom 'fgetline' function

Discussion in 'C Programming' started by Jeff Rodriguez, Nov 17, 2003.

  1. Main just loops over this while it's not null. The segfault occurs at
    this line:
    *line[ iteration ] = (char)ch;

    Also, please don't just fix the code. I would like to know why exactly
    this isn't working so I can avoid problems with it in the future.

    If there are any references I should check out let me know.

    Full highlighted code at:
    http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
    Line 62 is where segfault occurs.

    This only happens if I use 3 characters. Using 4+ characters results in
    a bus error.

    GDB Output:
    (gdb) run
    Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
    a
    as
    asd
    (no debugging symbols found)...(no debugging symbols found)...
    Program received signal SIGSEGV, Segmentation fault.
    0x804860f in fgetline ()
    (gdb) kill
    Kill the program being debugged? (y or n) y
    (gdb) run
    Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
    asdf
    (no debugging symbols found)...(no debugging symbols found)...
    Program received signal SIGBUS, Bus error.
    0x804860f in fgetline ()
    (gdb)



    -- SNIP --
    /* {{{ Code Fold: fgetline()
    */
    unsigned int fgetline( FILE *fp, char **line )
    {
    const unsigned int BUFFER_SIZE = 1024; /* Base buffer size
    */
    unsigned int iteration = 0; /* Number of characters
    read */
    char *tmp = NULL; /* Temp ptr to catch
    realloc() */
    int ch; /* Currently read character
    */

    /* {{{ Code Fold: Loop until we reach EOF or a newline
    */
    while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
    {
    /* If there is no remainder for iteration / BUFFER_SIZE we've
    reached */
    /* a multiple of BUFFER_SIZE and it's time to reallocate the
    buffer */
    if ( !(iteration % BUFFER_SIZE) )
    {
    /* {{{ Code Fold: (Re)allocte memory
    */
    tmp = realloc(
    (*line),
    (iteration / BUFFER_SIZE + 1)
    * BUFFER_SIZE
    * sizeof(char)
    );
    if ( tmp == NULL )
    {
    puts("Memory (re)allocation failed\n");
    exit(EXIT_FAILURE);
    }

    /* If the address of our string changed, give *line that
    address */

    *line = tmp;

    /* }}} Code Fold: (Re)allocte memory
    */
    }

    *line[ iteration ] = (char)ch;
    iteration++;
    }
    /* }}} Code Fold: Loop until we reach EOF or a newline
    */

    // Null terminate the array to make it a C string
    line[ iteration + 1 ] = '\0';
    if ( iteration == 0 )
    {
    return NULL;
    }
    else
    {
    return iteration;
    }
    }
    /* }}} Code Fold: fgetline()
    */
    -- SNIP --
     
    Jeff Rodriguez, Nov 17, 2003
    #1
    1. Advertising

  2. Jeff Rodriguez

    Al Bowers Guest

    Jeff Rodriguez wrote:
    > Main just loops over this while it's not null. The segfault occurs at
    > this line:
    > *line[ iteration ] = (char)ch;
    >
    > Also, please don't just fix the code. I would like to know why exactly
    > this isn't working so I can avoid problems with it in the future.


    There are serveral errors.

    > If there are any references I should check out let me know.
    >
    > Full highlighted code at:
    > http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
    > Line 62 is where segfault occurs.
    >
    > This only happens if I use 3 characters. Using 4+ characters results in
    > a bus error.
    >
    > GDB Output:
    > (gdb) run
    > Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
    > a
    > as
    > asd
    > (no debugging symbols found)...(no debugging symbols found)...
    > Program received signal SIGSEGV, Segmentation fault.
    > 0x804860f in fgetline ()
    > (gdb) kill
    > Kill the program being debugged? (y or n) y
    > (gdb) run
    > Starting program: /usr/home/jeff/public_html/programming/fgetline/a.out
    > asdf
    > (no debugging symbols found)...(no debugging symbols found)...
    > Program received signal SIGBUS, Bus error.
    > 0x804860f in fgetline ()
    > (gdb)
    >
    >
    >
    > -- SNIP --
    > /* {{{ Code Fold: fgetline() */
    > unsigned int fgetline( FILE *fp, char **line )
    > {
    > const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
    > unsigned int iteration = 0; /* Number of characters
    > read */
    > char *tmp = NULL; /* Temp ptr to catch
    > realloc() */
    > int ch; /* Currently read character
    > */
    >
    > /* {{{ Code Fold: Loop until we reach EOF or a newline */
    > while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
    > {
    > /* If there is no remainder for iteration / BUFFER_SIZE we've
    > reached */
    > /* a multiple of BUFFER_SIZE and it's time to reallocate the
    > buffer */
    > if ( !(iteration % BUFFER_SIZE) )
    > {
    > /* {{{ Code Fold: (Re)allocte memory */
    > tmp = realloc(
    > (*line),
    > (iteration / BUFFER_SIZE + 1)
    > * BUFFER_SIZE
    > * sizeof(char)
    > );


    This did not cause the seg. fault but there is a bug here.
    If you enter in BUFFER_SIZE characters and the the newline
    character, you have allocated BUFFER_SIZE space but later
    you try to write to BUFFER_SIZE+1 with the statement that
    adds the '\0' character.

    Change this to:
    tmp = reaaloc( *line, iteration + BUFFER_SIZE +1);

    > if ( tmp == NULL )
    > {
    > puts("Memory (re)allocation failed\n");
    > exit(EXIT_FAILURE);
    > }
    >
    > /* If the address of our string changed, give *line that
    > address */
    >
    > *line = tmp;
    >
    > /* }}} Code Fold: (Re)allocte memory */
    > }
    >
    > *line[ iteration ] = (char)ch;


    This is probably a cause of the seg. fault. Review the precedence
    rules and you will see that you need to force the * operator.
    (*line)[iteration] = ch;

    > iteration++;
    > }
    > /* }}} Code Fold: Loop until we reach EOF or a newline */
    >
    > // Null terminate the array to make it a C string
    > line[ iteration + 1 ] = '\0';


    (*line)[iteration] = '\0';

    But still there is a problem. If you just type in the newline
    character, iteration stays 0, no allocations are made and still
    you attempt to assign the '\0' character.

    > if ( iteration == 0 )
    > {
    > return NULL;


    You designed the function to return type unsigned. Why to you
    return NULL?

    > }
    > else
    > {
    > return iteration;
    > }
    > }
    > /* }}} Code Fold: fgetline() */
    > -- SNIP --
    >


    With these corrects, I believe the function will work.

    #include <stdlib.h>
    #include <stdio.h>

    unsigned int fgetline( FILE *fp, char **line )
    {
    const unsigned int BUFFER_SIZE = 3;
    unsigned int iteration = 0;
    char *tmp = NULL;
    int ch;

    while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
    {
    if ( !(iteration % BUFFER_SIZE) )
    {
    tmp = realloc(*line,iteration+BUFFER_SIZE+1);
    if ( tmp == NULL )
    {
    puts("Memory (re)allocation failed\n");
    exit(EXIT_FAILURE);
    }
    *line = tmp;
    }
    (*line)[ iteration ] = (char)ch;
    iteration++;
    }
    if ( iteration == 0 )
    {
    return 0;
    }
    else
    {
    (*line)[ iteration] = '\0';
    return iteration;
    }
    }

    int main (void)
    {
    while ( 1 )
    {
    char *line = NULL;
    if ( fgetline(stdin, &line) )
    {
    printf("%s\n", line);
    free(line);
    line = NULL;
    }
    else
    {
    break;
    }
    }
    return EXIT_SUCCESS;
    }


    --
    Al Bowers
    Tampa, Fl USA
    mailto: (remove the x to send email)
    http://www.geocities.com/abowers822/
     
    Al Bowers, Nov 17, 2003
    #2
    1. Advertising

  3. On Sun, 16 Nov 2003 19:55:14 -0700, Jeff Rodriguez wrote:

    > Also, please don't just fix the code. I would like to know why exactly
    > this isn't working so I can avoid problems with it in the future.


    Ok. It's not working because you aren't paying attention to the
    precedence of operators

    ....

    Sorry, I'm going to have to fix the code. Ignore the rest
    if you don't want to see your error.

    >/* {{{ Code Fold: fgetline() */
    >unsigned int fgetline( FILE *fp, char **line )
    >{
    > const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
    > unsigned int iteration = 0; /* Number of characters ... */
    > char *tmp = NULL; /* Temp ptr to catch ... */
    > int ch; /* Currently read character */
    >
    > /* {{{ Code Fold: Loop until we reach EOF or a newline */
    > while ( (ch = fgetc(fp)) != '\n' && !feof(fp))
    > {
    > /* If there is no remainder for iteration / BUFFER_SIZE ... */
    > /* a multiple of BUFFER_SIZE and it's time to reallocate the ... */


    This also allocates a buffer the first time through the loop. I'm sure
    you are aware of that, but your comment doesn't mention it.

    > if ( !(iteration % BUFFER_SIZE) )
    > {
    > /* {{{ Code Fold: (Re)allocte memory */
    > tmp = realloc(
    > (*line),
    > (iteration / BUFFER_SIZE + 1)
    > * BUFFER_SIZE
    > * sizeof(char)
    > );


    sizeof(char) == 1 by definition, there's no need to use it here. If
    you think you're preparing for some future change in type of the
    buffers you are allocating then do it like this:

    tmp = realloc((*line),
    (iteration / BUFFER_SIZE + 1)
    * BUFFER_SIZE * sizeof *tmp);

    > if ( tmp == NULL )
    > {
    > puts("Memory (re)allocation failed\n");
    > exit(EXIT_FAILURE);
    > }
    >
    > /* If the address of our string changed, give *line that address */
    >
    > *line = tmp;
    >
    > /* }}} Code Fold: (Re)allocte memory */
    > }
    >
    > *line[ iteration ] = (char)ch;


    Here is the problem. This means

    *(line[iteration]) = (char)ch;

    So you are treating line as an array of pointers, selecting the
    "iteration'th" pointer from that array and storing the char where
    ever it points. What you meant to do was

    (*line)[iteration] = (char)ch;

    But you could have avoided the problem in the first place by
    simply using the /other/ pointer to your buffer:

    tmp[iteration] = (char)ch;

    > iteration++;
    > }
    > /* }}} Code Fold: Loop until we reach EOF or a newline */
    >
    > /* Null terminate the array to make it a C string */
    > line[ iteration + 1 ] = '\0';
    > if ( iteration == 0 )
    > {
    > return NULL;


    Why are you returning NULL from a function that is declared to
    return unsigned int?

    > }
    > else
    > {
    > return iteration;
    > }
    >}
     
    Sheldon Simms, Nov 17, 2003
    #3
  4. Jeff Rodriguez <> wrote:
    >Main just loops over this while it's not null. The segfault
    >occurs at this line:
    >*line[ iteration ] = (char)ch;
    >
    >Also, please don't just fix the code. I would like to know why
    >exactly this isn't working so I can avoid problems with it in
    >the future.


    ....

    Your code is formatted in a particularly ugly style which makes
    it very hard to read (and it gets worse when you post it with
    software that folds long lines!). Plus it is impossible, from
    the example given, to tell just how you are calling this
    function. (That all sounds like a bunch of nitpicking, but it
    really does make debugging easier when you can *see* what the
    code is doing, in blocks the size of your monitor's screen.)

    >/* {{{ Code Fold: fgetline() */


    Using a different indenting style would eliminate the need for
    the "Code Fold" business...

    >unsigned int fgetline( FILE *fp, char **line )
    > {
    > const unsigned int BUFFER_SIZE = 1024; /* Base buffer size
    > */


    While 1024 is probably a reasonable size for the ultimate
    implementation, you might want to make that something fairly
    small, say 10, for testing. That way you can easily check
    the boundaries.

    > unsigned int iteration = 0; /* Number of
    > characters read */
    > char *tmp = NULL; /* Temp ptr to
    > catch realloc() */
    > int ch; /* Currently read
    > character */
    >
    >/* {{{ Code Fold: Loop until we reach EOF or a newline */
    > while ( (ch = fgetc(fp)) != '\n' && !feof(fp))


    Looping on feof() is not good. First, your loop will never
    notice a read error, but also the feof() function doesn't
    indicate that you are at the end of a file, it indicates
    that you've tried to read past the end of a file, which comes
    one iteration after you've reached the end.

    while ((ch = fgetc(fp)) != '\n' && ch != EOF) { ... }

    This will catch both an end of file condition and a read error.
    If you want to know why the loop exited, test for ch == '\n',
    feof(), and ferror() following the loop.

    > {
    >/* If there is no remainder for iteration / BUFFER_SIZE
    >we've reached */
    >/* a multiple of BUFFER_SIZE and it's time to reallocate
    >the buffer */
    > if ( !(iteration % BUFFER_SIZE) )
    > {
    >/* {{{ Code Fold: (Re)allocte memory */
    > tmp = realloc(
    > (*line),
    > (iteration / BUFFER_SIZE + 1)
    > * BUFFER_SIZE
    > * sizeof(char)


    There is no point in multiplying by "sizeof char", because that
    is guaranteed by the C Standard to be 1.

    > );


    > if ( tmp == NULL )
    > {
    > puts("Memory (re)allocation failed\n");
    > exit(EXIT_FAILURE);
    > }
    >
    >/* If the address of our string changed, give *line
    >that address */
    >
    > *line = tmp;
    >
    >/* }}} Code Fold: (Re)allocte memory */
    > }
    >
    > *line[ iteration ] = (char)ch;


    The cast is unnecessary, but this isn't doing what you think it
    is. You want to get the iteration element of the dereferenced
    pointer from line, but instead you are getting the iteration
    element of line and dereferencing that... which is why it
    segfaults.

    (*line)[iteration] = ch;

    Note that that might be cleaner to just use the tmp variable,

    tmp[iteration] = ch;

    > iteration++;


    Note that you have just incremented the interation count to the
    next uninitialized element in the allocated space...

    > }
    >/* }}} Code Fold: Loop until we reach EOF or a newline */
    >
    > // Null terminate the array to make it a C string
    > line[ iteration + 1 ] = '\0';


    .... and here you skip past that last unallocated element in your
    memory space, and put a null character one past that.

    And you've make a typo that changes what it means too!

    (*line)[iteration] = '\0';

    Again, the tmp variable is easier to see,

    tmp[iteration] = '\0';

    > if ( iteration == 0 )
    > {
    > return NULL;


    NULL is a pointer, this function is defined to return a type
    unsigned int. Change that to 0 and enable enough compiler
    warnings to let it tell you about such problems.

    However, since iteration is 0, why not just eliminate the
    entire if-else, and just return iteration.

    > }
    > else
    > {
    > return iteration;
    > }
    > }
    >/* }}} Code Fold: fgetline() */
    >-- SNIP --


    Here is your code, reformatted (for example, with shorter
    indents and with lines kept to less that 64 columns) with
    corrections and with a main() so that it can be compiled.

    #include <stdio.h>
    #include <stdlib.h>

    unsigned int fgetline(FILE *, char **);

    int main(void)
    {
    char *txt;
    int count;

    while (1) {
    txt = NULL;
    count = fgetline(stdin, &txt);
    printf("txt [%u]: <%s>\n", count, txt);
    free(txt);
    }

    return EXIT_SUCCESS;
    }

    unsigned int
    fgetline( FILE *fp, char **line )
    {
    int ch; /* current char read */
    char *tmp = NULL; /* realloc() tmp ptr */
    unsigned int iteration = 0; /* char read count */
    const unsigned int BUFFER_SIZE = 10; /* Base buffer size */

    /* Loop until we reach EOF or a newline */
    while ( (ch = fgetc(fp)) != '\n' && !feof(fp)) {
    /*
    * If there is no remainder for iteration / BUFFER_SIZE
    * we've reached a multiple of BUFFER_SIZE and it's time
    * to reallocate the buffer
    */
    if ( !(iteration % BUFFER_SIZE) ) {
    /* realloc the input buffer */
    tmp = realloc(
    *line,
    (iteration / BUFFER_SIZE + 1) * BUFFER_SIZE);

    if (tmp == NULL) {
    puts("Memory (re)allocation failed\n");
    exit(EXIT_FAILURE);
    }

    *line = tmp; /* new buffer address */
    }
    tmp[iteration] = ch;
    iteration++;
    }
    tmp[iteration] = '\0'; /* make it a string */
    return iteration;
    }

    --
    Floyd L. Davidson <http://web.newsguy.com/floyd_davidson>
    Ukpeagvik (Barrow, Alaska)
     
    Floyd Davidson, Nov 17, 2003
    #4
  5. Jeff Rodriguez wrote:

    Alright, I've taken everyone's comments into consideration and modified
    the code. I kept parts of the code folding, but the text " Code Fold"
    has been removed to hopefully improve readability.

    Now I'm basically submitting for review, and the use of anyone who finds it
    useful. It seems relatively quick:

    -- TEST --
    jeff@andromeda% gcc -w -g -O fgetline.c ; time ./a.out < testfile
    1000
    1024
    1025
    451365
    0.268u 0.061s 0:00.33 96.9% 23+1005k 0+0io 0pf+0w
    -- TEST --

    0.33 seconds for a 443k file is pretty good right?

    HTMLized at: http://gurugeek.com/jeff/programming/fgetline/fgetline.c.html
    -- SNIP --
    /* {{{: Documentation */
    /*============================================================================*/
    /* FGETLINE UTILITY FUNCTION */
    /*----------------------------------------------------------------------------*/
    /* Function: unsigned int fgetline( FILE *fp, char **line ) */
    /* */
    /* Operations: Read a line delimited by '\n' from open file at fp */
    /* */
    /* Returns: Number of characters read, including trailing newline character */
    /* -1 if 0 characters are read, or if EOF is found */
    /* */
    /* */
    /* */
    /* History: */
    /* Date V.M Description Who */
    /* -------- ---- -------------------------------------------------- --- */
    /* 03-11-15 1.00 Initial implementation JAR */
    /* */
    /*----------------------------------------------------------------------------*/
    /* Copyright (C) 2003 Jeff A. Rodriguez - All rights reserved. */
    /*============================================================================*/
    /* }}}: Documentation */
    /* {{{: Includes */
    #include <stdlib.h> /* malloc(), realloc() */
    #include <stdio.h> /* fgetc() */
    /* }}}: Includes */
    /* {{{: fgetline() */
    unsigned int fgetline( FILE *fp, char **line )
    {
    const unsigned int BUFFER_SIZE = 1024; /* Base buffer size */
    unsigned int iteration = 0; /* Number of characters read */
    char *tmp = NULL; /* Temp ptr to catch realloc() */
    int ch; /* Currently read character */

    /* {{{: Loop until we reach EOF or a newline */
    while ((ch = fgetc(fp)) != '\n' && ch != EOF)
    {
    /* If there is no remainder for iteration / BUFFER_SIZE we've reached
    a multiple of BUFFER_SIZE and it's time to reallocate the buffer */
    if ( !(iteration % BUFFER_SIZE) )
    {
    /* {{{: (Re)allocate memory */
    /* realloc() acts like malloc() here the first time through. Also
    we make sure there's enough room for the \0 character at the end
    to make this a string */
    tmp = realloc(*line, (iteration / BUFFER_SIZE + 1) * BUFFER_SIZE);
    if ( tmp == NULL )
    {
    puts("Memory (re)allocation failed");
    exit(EXIT_FAILURE);
    }
    *line = tmp; // Make sure the address of *line is correct
    /* }}}: (Re)allocate memory */
    }

    // Up the iteration count
    (*line)[ iteration ] = ch; // Add character to the array
    iteration++; // Up the iteration count
    }
    /* }}}: Loop until we reach EOF or a newline */
    if ( iteration == 0 )
    {
    return -1;
    }
    else
    {
    (*line)[ iteration ] = '\0'; // Make the array a string
    return iteration;
    }
    }
    /* }}}: fgetline() */
    /* {{{: main() */
    int main (void)
    {
    while ( 1 )
    {
    char *line = NULL;
    unsigned int character_count;
    if ( (character_count = fgetline(stdin, &line)) != -1 )
    {
    printf("%u\n", character_count);
    }
    else
    {
    break;
    }
    }
    return EXIT_SUCCESS;
    }
    /* }}}: main() */
    -- SNIP --
     
    Jeff Rodriguez, Nov 17, 2003
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Geoffrey Clements

    Eric3 segfaulting

    Geoffrey Clements, Jul 5, 2003, in forum: Python
    Replies:
    1
    Views:
    322
    Geoffrey Clements
    Jul 5, 2003
  2. Fernando Perez
    Replies:
    3
    Views:
    343
    Fernando Perez
    Jan 25, 2005
  3. Antoninus Twink

    Bug/Gross InEfficiency in HeathField's fgetline program

    Antoninus Twink, Oct 7, 2007, in forum: C Programming
    Replies:
    436
    Views:
    6,308
    user923005
    Nov 13, 2007
  4. Richard Harter

    Designing fgetline - a perspective

    Richard Harter, Oct 12, 2007, in forum: C Programming
    Replies:
    48
    Views:
    1,266
    David Thompson
    Nov 25, 2007
  5. Dik T. Winter
    Replies:
    2
    Views:
    373
    Charlie Gordon
    Nov 7, 2007
Loading...

Share This Page