newbie-question: container for a "string" of variable length

Discussion in 'C Programming' started by merman, Oct 18, 2004.

  1. merman

    merman Guest

    Hi,

    The problem:

    One function in my program reads line by line from a file (with fgets).
    Then I chomp (like Perl) every newline at the end of line:

    line[strlen(line)-1] = '\0';

    The result is a "string" of variable length (it depends on the file size).

    Is there a data structure that grows dynamically to store this
    "string" of variable length? How can I solve my problem?

    Please keep the solution simple. I'm a newbie;-).

    Thanks for help.

    o-o

    Thomas
     
    merman, Oct 18, 2004
    #1

  2. merman wrote:
    > Hi,
    >
    > The problem:
    >
    > One function in my program reads line by line from a file (with fgets).
    > Then I chomp (like Perl) every newline at the end of line:
    >
    > line[strlen(line)-1] = '\0';


    The above is dangerous. Try something like
    {
    char *nl;
    if ((nl = strchr(line,'\n'))) *nl = 0;
    }

    >
    > The result is a "string" of variable length (it depends from file-size).
    >
    > Is there a data-structure which dynamically grows for saving this
    > "string" of variable length? How can solve my problem?


    malloc and strcpy (or strncpy) are your friends.
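
    A minimal sketch of the malloc-plus-strcpy approach, assuming the line
    has already been read into a fixed buffer and chomped. The helper name
    save_line is hypothetical, not something from the original post.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* save_line (hypothetical name): return a heap copy of line that is
   exactly as large as the string needs, or NULL if malloc fails. */
char *save_line(const char *line)
{
    char *copy;

    assert(line != NULL);
    copy = malloc(strlen(line) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL) {
        strcpy(copy, line);            /* safe: copy is big enough */
    }
    return copy;                       /* the caller must free() this */
}
```

    The copy only needs strlen(line) + 1 bytes, so short lines cost little
    memory even when the read buffer itself is large.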
     
    Martin Ambuhl, Oct 18, 2004
    #2

  3. merman

    pete Guest

    merman wrote:
    >
    > Hi,
    >
    > The problem:
    >
    > One function in my program reads line by line from
    > a file (with fgets).
    > Then I chomp (like Perl) every newline at the end of line:
    >
    > line[strlen(line)-1] = '\0';
    >
    > The result is a "string" of variable length
    > (it depends from file-size).
    >
    > Is there a data-structure which dynamically grows for saving this
    > "string" of variable length? How can solve my problem?
    >
    > Please keep the solution simple. I'm a newbie;-).


    It seems like a job for a linked list.
    I don't know how simple that is for you.
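
    For reference, a linked list of lines could look roughly like the
    sketch below. The names line_node, line_list, list_append and
    list_free are made up for illustration; each node owns its own copy
    of the text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct line_node {
    char *text;                 /* heap copy of one line */
    struct line_node *next;     /* next line, or NULL at the end */
};

struct line_list {
    struct line_node *head;     /* first line, or NULL if empty */
    struct line_node *tail;     /* last line, or NULL if empty */
};

/* Append a copy of text to the list. Returns 1 on success, 0 on failure. */
int list_append(struct line_list *list, const char *text)
{
    struct line_node *node;

    assert(list != NULL && text != NULL);
    node = malloc(sizeof *node);
    if (node == NULL) {
        return 0;
    }
    node->text = malloc(strlen(text) + 1);
    if (node->text == NULL) {
        free(node);
        return 0;
    }
    strcpy(node->text, text);
    node->next = NULL;
    if (list->tail != NULL) {
        list->tail->next = node;    /* link after the current last node */
    } else {
        list->head = node;          /* first node in an empty list */
    }
    list->tail = node;
    return 1;
}

/* Release every node and its text, leaving the list empty. */
void list_free(struct line_list *list)
{
    struct line_node *node;

    assert(list != NULL);
    node = list->head;
    while (node != NULL) {
        struct line_node *next = node->next;
        free(node->text);
        free(node);
        node = next;
    }
    list->head = list->tail = NULL;
}
```

    Start with both pointers set to NULL, call list_append once per line
    read, and call list_free when the lines are no longer needed.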

    It makes things simpler if you have a hard coded line length limit.
    I use these functions to get the next nonblank line from a text file:

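    #include <stdio.h>
    #include <ctype.h>

    #define LINE_LEN 65
    #define str(s) # s
    #define xstr(s) str(s)

    int blank(char *line);  /* declared before its first use below */

    /* Reads the next nonblank line into line; returns 1 or EOF. */
    int nonblank_line(FILE *fd, char *line)
    {
        int rc;

        do {
            /* Store up to LINE_LEN characters, discard the rest of the line. */
            rc = fscanf(fd, "%" xstr(LINE_LEN) "[^\n]%*[^\n]", line);
            if (!feof(fd)) {
                getc(fd);   /* consume the newline itself */
            }
        } while (rc == 0 || (rc == 1 && blank(line)));
        return rc;
    }

    /* Returns nonzero if line contains only whitespace. */
    int blank(char *line)
    {
        /* The cast keeps isspace defined for negative char values. */
        while (isspace((unsigned char)*line)) {
            ++line;
        }
        return *line == '\0';
    }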

    line should be declared this way in the calling function:
    char line[LINE_LEN + 1];
    nonblank_line has two possible return values, EOF and 1.

    --
    pete
     
    pete, Oct 18, 2004
    #3
  4. Martin Ambuhl wrote:
    > merman wrote:
    >
    >> Hi,
    >>
    >> The problem:
    >>
    >> One function in my program reads line by line from a file (with
    >> fgets). Then I chomp (like Perl) every newline at the end of line:
    >>
    >> line[strlen(line)-1] = '\0';

    >
    >
    > The above is dangerous. Try something like
    > {
    > char *nl;
    > if ((nl = strchr(line,'\n'))) *nl = 0;
    > }

    Why would it be dangerous ?
    At any rate, man fgets
    "fgets() reads in at most one less than size characters ..."
    ....
    "A '\0' is stored after the last character in the buffer."
     
    Nils O. Selåsdal, Oct 18, 2004
    #4
  5. merman

    Al Bowers Guest

    merman wrote:
    > Hi,
    >
    > The problem:
    >
    > One function in my program reads line by line from a file (with fgets).
    > Then I chomp (like Perl) every newline at the end of line:
    >
    > line[strlen(line)-1] = '\0';


    This would be bad if strlen returned 0: strlen(line)-1 would then
    wrap around to a huge index, and the write would land outside the array.
    Use function strrchr.

    #include <string.h>
    char *s1;
    if((s1 = strrchr(line,'\n'))!= NULL) *s1 = '\0';


    >
    > The result is a "string" of variable length (it depends from file-size).
    >
    > Is there a data-structure which dynamically grows for saving this
    > "string" of variable length? How can solve my problem?
    >
    > Please keep the solution simple. I'm a newbie;-).
    >


    Design a function that uses realloc to dynamically allocate
    the storage you need.

    A simple definition and usage are listed below.

    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    #define BLOCK 16

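    char *fgetline(FILE *fp)
    {
        char *s, *tmp, buf[BLOCK];
        size_t count;

        /* Read the line in BLOCK-sized pieces, growing s by BLOCK each time. */
        for(count = 0, s = NULL; (fgets(buf,sizeof buf,fp));count++)
        {
            if((tmp = realloc(s,(count+1)*BLOCK)) == NULL)
            {
                free(s);
                return NULL;
            }
            s = tmp;
            if(count == 0) *s = '\0';
            strcat(s,buf);
            if((tmp = strrchr(s,'\n')) != NULL)
            {
                *tmp = '\0';   /* whole line read: strip the newline */
                break;
            }
        }
        return s;   /* NULL if nothing could be read; the caller frees it */
    }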

    int main(void)
    {
        char *mystring;

        printf("Enter a long sentence: ");
        fflush(stdout);
        mystring = fgetline(stdin);
        if(mystring) printf("mystring = \"%s\"\n", mystring);
        else puts("Failure with function fgetline");
        free(mystring);
        printf("\nLet's try another. Enter another sentence: ");
        fflush(stdout);
        mystring = fgetline(stdin);
        if(mystring) printf("mystring = \"%s\"\n", mystring);
        else puts("Failure with function fgetline");
        free(mystring);
        return 0;
    }


    --
    Al Bowers
    Tampa, Fl USA
    mailto: (remove the x to send email)
    http://www.geocities.com/abowers822/
     
    Al Bowers, Oct 18, 2004
    #5
  6. merman

    Paul Hsieh Guest

    merman wrote:
    > The problem:
    >
    > One function in my program reads line by line from a file (with fgets).
    > Then I chomp (like Perl) every newline at the end of line:
    >
    > line[strlen(line)-1] = '\0';


    fgets() will not store a '\n' if the buffer fills up first or if the
    file ends without a terminating '\n'. The assignment would also be
    kind of disappointing if strlen(line) were equal to 0.

    > The result is a "string" of variable length (it depends from file-size).
    >
    > Is there a data-structure which dynamically grows for saving this
    > "string" of variable length? How can solve my problem?


    The C language by itself is kind of a useless language for the
    behavior you want. This is asked very frequently here, but of
    course it's not addressed by the FAQ for this group.

    There are two main solutions that I can recommend. If you are
    concerned solely with the problem of variable length input, then I
    have written an article on the subject here:

    http://www.pobox.com/~qed/userInput.html

    If you have more general variable length string needs then you can use
    my "Better String Library" solution to deal with them:

    http://bstring.sf.net/

    The basic idea is that C by itself *always* requires that you know the
    length of the input before you read. But you can read input in fixed
    length sections, so you can use a strategy of allocating increasing
    amounts of memory interleaved with fetching blocks of input in an
    iterative manner. I personally endorse exponentially increasing
    successive block sizes (as can be seen in both solutions above) for
    speed and heap-pressure reasons.
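
    A rough sketch of that strategy, using fgets and a buffer that doubles
    whenever it fills up. This is not the code from either link; the name
    read_line and the starting capacity of 16 are assumptions made for
    illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* read_line (hypothetical name): read one line of any length from fp.
   Returns a malloc'd string without the trailing '\n', or NULL at end
   of input or on allocation failure. The caller must free() the result. */
char *read_line(FILE *fp)
{
    size_t cap = 16;    /* starting capacity, an arbitrary choice */
    size_t len = 0;     /* characters stored so far */
    char *buf, *tmp;

    assert(fp != NULL);
    buf = malloc(cap);
    if (buf == NULL) {
        return NULL;
    }
    /* Each pass appends the next piece of the line at buf + len.
       The int cast assumes lines far shorter than INT_MAX. */
    while (fgets(buf + len, (int)(cap - len), fp) != NULL) {
        len += strlen(buf + len);
        if (len > 0 && buf[len - 1] == '\n') {
            buf[len - 1] = '\0';        /* complete line: drop the newline */
            return buf;
        }
        if (len + 1 == cap) {           /* buffer is full: double it */
            tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (len == 0) {                     /* nothing was read at all */
        free(buf);
        return NULL;
    }
    return buf;                         /* last line had no trailing '\n' */
}
```

    Doubling means a line of n characters needs only about log2(n)
    reallocations, instead of one per fixed-size block.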

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
     
    Paul Hsieh, Oct 18, 2004
    #6
  7. "Nils O. Selåsdal" writes:
    > Martin Ambuhl wrote:
    >> merman wrote:
    >>
    >>> Hi,
    >>>
    >>> The problem:
    >>>
    >>> One function in my program reads line by line from a file (with
    >>> fgets). Then I chomp (like Perl) every newline at the end of line:
    >>>
    >>> line[strlen(line)-1] = '\0';

    >>
    >>
    >> The above is dangerous. Try something like
    >> {
    >> char *nl;
    >> if ((nl = strchr(line,'\n'))) *nl = 0;
    >> }

    > Why would it be dangerous ?


    It's dangerous because if the input line is too long for the provided
    buffer, fgets() will give you a partial line whose last character is
    not a newline. Setting that character to '\0' can write over
    significant data.

    > At any rate, man fgets
    > "fgets() reads in at most one less than size characters ..."
    > ...
    > "A '\0' is stored after the last character in the buffer."


    Yes, the line is guaranteed to be terminated by a '\0' character. The
    point of the assignment is to replace the '\n' character preceding the
    '\0' with a '\0', making the string 1 character shorter. The danger
    is that the character preceding the '\0' may not be a '\n'.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Oct 18, 2004
    #7
  8. merman

    Eric Sosman Guest

    Nils O. Selåsdal wrote:
    > Martin Ambuhl wrote:
    >
    >>merman wrote:
    >>
    >>
    >>>Hi,
    >>>
    >>>The problem:
    >>>
    >>>One function in my program reads line by line from a file (with
    >>>fgets). Then I chomp (like Perl) every newline at the end of line:
    >>>
    >>>line[strlen(line)-1] = '\0';

    >>
    >>
    >>The above is dangerous. Try something like
    >> {
    >> char *nl;
    >> if ((nl = strchr(line,'\n'))) *nl = 0;
    >> }

    >
    > Why would it be dangerous ?


    Because if `line' is too short, fgets() will stop
    storing characters in it before it gets to the '\n':

    char line[10];
    fgets (line, sizeof line, stream);

    ... and the input is "supercalifragilisticexpialidocious\n".
    Chopping the final stored character without first making
    sure it's actually the '\n' gives you "supercal," wipes out
    the following 'i' irretrievably, and leaves you with no
    clue that the line isn't finished yet.

    Some implementations may permit the very last line in
    a file to omit its terminating '\n' altogether. This can
    make trouble for the blind chop even if `line' is big enough:

    char line[10737];
    fgets (line, sizeof line, stream);

    ... and the input is "supercalifragilisticexpialidocious"
    without a newline and followed by end-of-input. In this
    case, you'll get "supercalifragilisticexpialidociou" and
    lose the final 's'.
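
    A small sketch of a chop that checks first. The name chomp is borrowed
    from Perl and is not a standard C function; the return value tells
    the caller whether a newline was really there.

```c
#include <assert.h>
#include <string.h>

/* chomp (hypothetical helper): remove one trailing '\n' if present.
   Returns 1 if a newline was removed, 0 if the string was left alone. */
int chomp(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
        return 1;
    }
    return 0;   /* empty string, partial line, or last line without '\n' */
}
```

    A return value of 0 after fgets() means either that the buffer was too
    small for the line or that the input ended without a newline, so no
    character is lost in either case.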

    --
     
    Eric Sosman, Oct 18, 2004
    #8
  9. merman

    merman Guest

    Hi,

    thanks for all the help. Wow - so many opinions;-).

    I think learning C needs a lot of time.

    Best regards

    o-o

    Thomas
     
    merman, Oct 18, 2004
    #9
  10. Nils O. Selåsdal wrote:

    > Martin Ambuhl wrote:
    >
    >> merman wrote:
    >>
    >>> Hi,
    >>>
    >>> The problem:
    >>>
    >>> One function in my program reads line by line from a file (with
    >>> fgets). Then I chomp (like Perl) every newline at the end of line:
    >>>
    >>> line[strlen(line)-1] = '\0';

    >>
    >>
    >>
    >> The above is dangerous. Try something like
    >> {
    >> char *nl;
    >> if ((nl = strchr(line,'\n'))) *nl = 0;
    >> }

    >
    > Why would it be dangerous ?
    > At any rate, man fgets
    > "fgets() reads in at most one less than size characters ..."
    > ...
    > "A '\0' is stored after the last character in the buffer."


    Because you have no guarantee of a '\n' in the string, so setting the
    last character of the string to 0 may not do what you want.
     
    Martin Ambuhl, Oct 18, 2004
    #10
  11. merman

    CBFalconer Guest

    Martin Ambuhl wrote:
    > merman wrote:
    >>
    >> One function in my program reads line by line from a file (with
    >> fgets). Then I chomp (like Perl) every newline at the end of line:
    >>
    >> line[strlen(line)-1] = '\0';

    >
    > The above is dangerous. Try something like
    > {
    > char *nl;
    > if ((nl = strchr(line,'\n'))) *nl = 0;
    > }
    >
    >> The result is a "string" of variable length (it depends from
    >> file-size).
    >>
    >> Is there a data-structure which dynamically grows for saving
    >> this "string" of variable length? How can solve my problem?

    >
    > malloc and strcpy (or strncpy) are your friends.


    Or use the techniques in ggets, avoiding the data copying. See:

    <http://cbfalconer.home.att.net/download/ggets.zip>

    --
    "I support the Red Sox and any team that beats the Yankees"

    "Any baby snookums can be a Yankee fan, it takes real moral
    fiber to be a Red Sox fan"
     
    CBFalconer, Oct 18, 2004
    #11
  12. merman

    CBFalconer Guest

    Paul Hsieh wrote:
    >

    .... snip ...
    >
    > The basic idea is that C by itself *always* requires that you know
    > the length of the input before you read. But you can read input
    > in fixed length sections, so you can use a strategy of allocating
    > increasing amounts of memory interleaved with fetching blocks of
    > input in an iterative manner. I personally endorse exponentially
    > increasing successive block sizes (as can be seen in both solutions
    > above) for speed, and heap pressure reasons.


    No, it doesn't require preknowledge of input length. C input is in
    the form of streams, so you can use getc (and putc) and never need
    to know the input stream length. Since getc is often available as
    a macro doing such may not even represent any inefficiency.

    One large advantage of doing so is that, combined with ungetc, you
    have the consistent option of 1 char read-ahead, which in turn
    solves many parsing problems.
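
    As a small illustration of the one-character read-ahead, here is a
    sketch that reads a run of decimal digits and pushes the first
    non-digit back onto the stream. The function name read_number is made
    up for this example, and overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* read_number (hypothetical name): read decimal digits from fp into *out.
   The first character that is not a digit is pushed back with ungetc, so
   the next read sees it. Returns 1 if at least one digit was read. */
int read_number(FILE *fp, unsigned long *out)
{
    int ch;
    int digits = 0;
    unsigned long value = 0;

    assert(fp != NULL && out != NULL);
    while ((ch = getc(fp)) != EOF && isdigit(ch)) {
        value = value * 10 + (unsigned long)(ch - '0');
        digits++;
    }
    if (ch != EOF) {
        ungetc(ch, fp);     /* one character of read-ahead, given back */
    }
    if (digits == 0) {
        return 0;           /* no number here; *out is left untouched */
    }
    *out = value;
    return 1;
}
```

    No buffer and no length limit are involved: the function never needs
    to know how long the input is.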

    ISO Standard Pascal programmers have known this forever. Users of
    C'ified variations of Pascal, such as Borland and Turbo, do not.
    Yet properly used C can provide some of the advantages of Pascal.

    --
    "I support the Red Sox and any team that beats the Yankees"

    "Any baby snookums can be a Yankee fan, it takes real moral
    fiber to be a Red Sox fan"
     
    CBFalconer, Oct 18, 2004
    #12
  13. merman

    Richard Bos Guest

    Martin Ambuhl wrote:

    > merman wrote:
    > > Is there a data-structure which dynamically grows for saving this
    > > "string" of variable length? How can solve my problem?

    >
    > malloc and strcpy (or strncpy) are your friends.


    Actually, for a growing structure, I'd recommend realloc().
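
    One way this might look for an array of stored lines, as a sketch only.
    The names line_array and array_push are hypothetical, and the starting
    capacity of 8 is arbitrary. Note the temporary pointer: if realloc
    fails, the old block is still valid and must not be lost.

```c
#include <assert.h>
#include <stdlib.h>

struct line_array {
    char **items;   /* pointers to the stored lines; start with NULL */
    size_t count;   /* how many lines are stored */
    size_t cap;     /* how many pointers items has room for */
};

/* array_push (hypothetical name): add line to the end of the array,
   growing it with realloc when it is full. Returns 1 on success, 0 on
   failure. The array stores the pointer itself, not a copy of the text. */
int array_push(struct line_array *arr, char *line)
{
    assert(arr != NULL);
    if (arr->count == arr->cap) {
        size_t new_cap = arr->cap ? arr->cap * 2 : 8;
        char **tmp = realloc(arr->items, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return 0;       /* arr->items is unchanged and still usable */
        }
        arr->items = tmp;
        arr->cap = new_cap;
    }
    arr->items[arr->count++] = line;
    return 1;
}
```

    Because realloc(NULL, n) behaves like malloc(n), an array initialised
    to { NULL, 0, 0 } needs no separate first allocation.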

    Richard
     
    Richard Bos, Oct 19, 2004
    #13
