K&R2, exercise 4-2

Discussion in 'C Programming' started by arnuld, Apr 3, 2008.

  1. arnuld

    arnuld Guest

    STATEMENT: Write the function strindex(s, t), which returns the position of the
    rightmost occurence of t in s, or -1 if there is none.

    PURPOSE: this program prints the index of the rightmost match on the line.
    The match we have to find is a char array named pattern. We also print out
    the number of matches we have found. We will take the input from
    command-line.


    PROBLEM: Segmentation Fault

    The programe compiles fine but at run-time it segfaults :(


    here is my solution which is a little modified version of the example
    provided in the same section:

    /* Exercise # 4.1 */



    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    enum MAXSIZE { ARR_SIZE = 1000 };

    int getline( char current_line[], int max );
    int str_index( char current_line[], char search_for[] );

    char pattern[] = "term";

    int match_num;


    int main( void )
    {
    char current_line[ARR_SIZE];
    int matched_idx, idx;

    idx = 0;


    while( getline(current_line, ARR_SIZE) > 0 )
    {
    if( (matched_idx = str_index(current_line, pattern)) >= 0 )
    {
    printf("\n%d matches, \n%d is the last to match", match_num, matched_idx);
    }
    }

    return 0;

    }


    /* takes a line as input and returns the length of the line */
    int getline( char s[], int max )
    {
    int c, i;

    for( i = 0; ( (c = getchar()) != EOF && (c != '\n') && (--max > 0) ); ++i )
    {
    s = c;
    }

    if( c == '\n' )
    {
    s[i++] = '\n';
    }

    s = '\0';

    return i;
    }


    /* search for a pattern in the line, will save every index position of
    source string where the pattern starts to match. For string the index we
    use an array of ints. we then return the last element of array which is the
    index of the rightmost match. We also return the number of matches we have
    found using an int match_num.
    */
    int str_index( char s[], char p[] )
    {
    int i, j, k;
    int idx, last_match;
    int saved_pos[ARR_SIZE];
    memset( saved_pos, '\0', sizeof( saved_pos ));

    idx = 0;
    match_num = 0;


    for( i = 0; s != '\0'; ++i )
    {
    for( k = 0, j = i; p[k] != '\0' ; ++k, ++j )
    {
    if( s[j] != p[k] )
    {
    break;
    }
    }

    if( (k > 0) && p[k] == '\0' )
    {
    saved_pos[idx] = i;
    ++match_num;
    }
    }

    last_match = sizeof(saved_pos) - 2;
    /* it checks whether we have any match or not. If we do not have any match
    * then we have nothing in array */
    if( saved_pos[0] )
    {
    return saved_pos[last_match];
    }
    else
    {
    return -1;
    }
    }


    =========== OUTPUT =============
    /home/arnuld/programs/C $ gcc -ansi -pedantic -Wall -Wextra 4-1.c
    /home/arnuld/programs/C $ ./a.out
    terminal
    and no one
    and this
    term
    and Xterminal
    segmentation fault
    /home/arnuld/programs/C $
     
    arnuld, Apr 3, 2008
    #1
    1. Advertising

  2. arnuld

    arnuld Guest

    > On Thu, 03 Apr 2008 19:14:49 +0530, arnuld wrote:

    > last_match = sizeof(saved_pos) - 2;
    > /* it checks whether we have any match or not. If we do not have any
    > match
    > * then we have nothing in array */
    > if( saved_pos[0] )
    > {
    > return saved_pos[last_match];
    > }


    that was the source of trouble. I changed it to this:


    if( saved_pos[0] )
    {
    return saved_pos[match_num - 1];
    }



    and I got a new bug:


    [arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra 4-1.c
    [arnuld@raj C]$ ./a.out
    and no TERM
    term
    and term

    1 matches,
    4 is the last to match
    how about this tErm
    nope
    the terminal is the xterm and urxvt term

    3 matches,
    0 is the last to match[arnuld@raj C]$


    whats the problem now ?
     
    arnuld, Apr 3, 2008
    #2
    1. Advertising

  3. Richard Heathfield <> writes:

    > arnuld said:
    >
    > <snip>
    >
    >> PROBLEM: Segmentation Fault
    >>
    >> The programe compiles fine but at run-time it segfaults :(

    >
    > <snip>
    >
    >> int str_index( char s[], char p[] )
    >> {
    >> int i, j, k;
    >> int idx, last_match;
    >> int saved_pos[ARR_SIZE];

    >
    > <snip>
    >>
    >> last_match = sizeof(saved_pos) - 2;

    >
    > This doesn't do what you think. sizeof saved_pos gives the size, in bytes,
    > of the array. That's fine if ints are 1 byte in size, which they can be,
    > but it doesn't seem to be the case on your system.
    >
    > What you meant to write was:
    >
    > last_match = sizeof saved_pos / sizeof saved_pos[0] - 2;
    >
    > This fixes your problem.


    I am not sure if you are joking. It fixes the problem only for some
    very limited meanings of either "fix" or "problem".

    --
    Ben.
     
    Ben Bacarisse, Apr 3, 2008
    #3
  4. arnuld <> writes:

    >> On Thu, 03 Apr 2008 19:14:49 +0530, arnuld wrote:

    >
    >> last_match = sizeof(saved_pos) - 2;
    >> /* it checks whether we have any match or not. If we do not have any
    >> match
    >> * then we have nothing in array */
    >> if( saved_pos[0] )
    >> {
    >> return saved_pos[last_match];
    >> }

    >
    > that was the source of trouble. I changed it to this:
    >
    >
    > if( saved_pos[0] )
    > {
    > return saved_pos[match_num - 1];
    > }
    >
    >
    >
    > and I got a new bug:


    You have a few. I'll take only one that I can illustrate from the
    fragment above: you can't tell if there is or is not a match by
    testing saved_pos[0] because there match might be a first match at
    position zero. Now, as it happens, you store all the match positions
    in index 0 of this array because you don't increment idx (not shown)
    so this bug only shows up when the first and last match are at
    position zero.

    I would suggest you don't try to store the matches. Try to find a way
    that does not involve remembering them all, just to return the
    right-most one.

    --
    Ben.
     
    Ben Bacarisse, Apr 3, 2008
    #4
  5. arnuld

    arnuld Guest

    > On Thu, 03 Apr 2008 19:22:29 +0530, Richard Heathfield wrote:


    > This doesn't do what you think. sizeof saved_pos gives the size, in
    > bytes, of the array. That's fine if ints are 1 byte in size, which they
    > can be, but it doesn't seem to be the case on your system.


    your way/idea of telling the problem is very clear. I never expected that
    from the author of a book like "C Unleashed". NO, I have not read the book
    but heard about its toughness and harder ideas.


    > What you meant to write was:
    >
    > last_match = sizeof saved_pos / sizeof saved_pos[0] - 2;
    >
    > This fixes your problem.


    I got a new one:

    [arnuld@raj C]$ ./a.out
    term
    term
    Xterm

    1 matches,
    Position 1 is the last to match

    [arnuld@raj C]$ ./a.out
    this terminal

    1 matches,
    Position 1 is the last to match
    [arnuld@raj C]$


    look at the output. program is not reading the position 0 and always saying
    that position 1 is th elast to match. I think I have modified the file a
    little. you can check the whole source here:

    http://pastebin.ca/969121

    > And in case anyone's curious, the only use I made of gdb for finding
    > this bug was to get a backtrace, from which the problem was immediately
    > obvious.


    Oh... I never used gdb. I think I need to read the manual to start using
    it.
     
    arnuld, Apr 3, 2008
    #5
  6. arnuld

    arnuld Guest

    > On Thu, 03 Apr 2008 19:46:34 +0530, Ben Bacarisse wrote:

    > You have a few. I'll take only one that I can illustrate from the
    > fragment above: you can't tell if there is or is not a match by testing
    > saved_pos[0] because there match might be a first match at position
    > zero. Now, as it happens, you store all the match positions in index 0
    > of this array because you don't increment idx (not shown) so this bug
    > only shows up when the first and last match are at position zero.
    >
    > I would suggest you don't try to store the matches. Try to find a way
    > that does not involve remembering them all, just to return the
    > right-most one.



    HEY.. thanks :) , how about this, works perfectly fine. Tell me if I have
    any more bugs left:



    /* Exercise # 4.1 */



    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    enum MAXSIZE { ARR_SIZE = 1000 };

    int getline( char current_line[], int max );
    int str_index( char current_line[], char search_for[] );

    char pattern[] = "term";

    int match_num;


    int main( void )
    {
    char current_line[ARR_SIZE];
    int matched_idx, idx;

    idx = 0;


    while( getline(current_line, ARR_SIZE) > 0 )
    {
    if( (matched_idx = str_index(current_line, pattern)) >= 0 )
    {
    printf("\n%d matches -- ", match_num);
    if( match_num )
    {
    printf("Position %d is the last to match\n\n", matched_idx);
    }
    }
    }

    return 0;

    }


    /* takes a line as input and returns the length of the line */
    int getline( char s[], int max )
    {
    int c, i;

    for( i = 0; ( (c = getchar()) != EOF && (c != '\n') && (--max > 0) ); ++i )
    {
    s = c;
    }

    if( c == '\n' )
    {
    s[i++] = '\n';
    }

    s = '\0';

    return i;
    }


    /* search for a pattern in the line, will save every index position of source
    string where the pattern starts to match. For string the index we use an
    array of ints. we then return the last element of array which is the
    index of the rightmost match. We also return the number of matches we have
    found using an int match_num.
    */
    int str_index( char s[], char p[] )
    {
    int i, j, k;
    int idx, match_pos;


    idx = 0;
    match_num = 0;


    for( i = 0; s != '\0'; ++i )
    {
    for( k = 0, j = i; p[k] != '\0' ; ++k, ++j )
    {
    if( s[j] != p[k] )
    {
    break;
    }
    }

    if( (k > 0) && p[k] == '\0' )
    {
    ++match_num;
    match_pos = i;
    }
    }



    /* it checks whether we have any match or not. */
    if( match_num )
    {
    return match_pos;
    }
    else
    {
    return -1;
    }
    }


    ========== OUTPUT ============
    [arnuld@raj C]$ gcc -ansi -pedantic -Wall -Wextra -ggdb 4-1.c
    [arnuld@raj C]$ ./a.out
    and this
    and that
    9843788327#&&$^$^$^^$TER%%^
    TERM
    term

    1 matches -- Position 0 is the last to match

    term and term and term\0

    3 matches -- Position 18 is the last to match

    [arnuld@raj C]$
     
    arnuld, Apr 3, 2008
    #6
  7. Richard Heathfield <> writes:

    > Ben Bacarisse said:
    >
    >> Richard Heathfield <> writes:
    >>

    > <snip>
    >>>
    >>> What you meant to write was:
    >>>
    >>> last_match = sizeof saved_pos / sizeof saved_pos[0] - 2;
    >>>
    >>> This fixes your problem.

    >>
    >> <snip> It fixes the problem only for some
    >> very limited meanings of either "fix" or "problem".

    >
    > Well, okay, it fixes the segfault, anyway.


    Right. I thought that is maybe what you meant.

    --
    Ben.
     
    Ben Bacarisse, Apr 3, 2008
    #7
  8. arnuld <> writes:

    >> On Thu, 03 Apr 2008 19:46:34 +0530, Ben Bacarisse wrote:

    <snip>
    >> I would suggest you don't try to store the matches. Try to find a way
    >> that does not involve remembering them all, just to return the
    >> right-most one.

    >
    > HEY.. thanks :) , how about this, works perfectly fine. Tell me if I have
    > any more bugs left:


    The str_index function looks good to me but there is a bug in the input
    routine. If made a few other comments...

    > /* Exercise # 4.1 */
    >
    >
    >
    > #include <stdio.h>
    > #include <stdlib.h>
    > #include <string.h>
    >
    > enum MAXSIZE { ARR_SIZE = 1000 };
    >
    > int getline( char current_line[], int max );
    > int str_index( char current_line[], char search_for[] );
    >
    > char pattern[] = "term";
    >
    > int match_num;
    >
    >
    > int main( void )
    > {
    > char current_line[ARR_SIZE];
    > int matched_idx, idx;
    >
    > idx = 0;
    >
    >
    > while( getline(current_line, ARR_SIZE) > 0 )
    > {
    > if( (matched_idx = str_index(current_line, pattern)) >= 0 )
    > {
    > printf("\n%d matches -- ", match_num);


    It is more reliable to get into the habit of ending output with
    newlines rather than starting it with then. In your case...

    > if( match_num )
    > {
    > printf("Position %d is the last to match\n\n", matched_idx);
    > }


    .... when there is no match, the output is "left hanging". Output that
    does not end with a newline is guaranteed to appear on all systems.

    > }
    > }
    >
    > return 0;
    >
    > }
    >
    >
    > /* takes a line as input and returns the length of the line */
    > int getline( char s[], int max )
    > {
    > int c, i;
    >
    > for( i = 0; ( (c = getchar()) != EOF && (c != '\n') && (--max > 0) ); ++i )
    > {
    > s = c;
    > }


    For simple testing, I prefer to read a whole line, storing only those
    characters that fit but this way is fine. Except...

    > if( c == '\n' )
    > {
    > s[i++] = '\n';
    > }
    >
    > s = '\0';


    .... I think you have a bug here. The loop above can put max-1 chars
    into the buffer (max being what it was originally, of course) but it
    can end when c == '\n'. You then try to put the '\n' and the '\0' in
    the one remaining space!

    > return i;
    > }
    >
    >
    > /* search for a pattern in the line, will save every index position of source
    > string where the pattern starts to match. For string the index we use an
    > array of ints. we then return the last element of array which is the
    > index of the rightmost match. We also return the number of matches we have
    > found using an int match_num.
    > */
    > int str_index( char s[], char p[] )
    > {
    > int i, j, k;
    > int idx, match_pos;
    >
    >
    > idx = 0;


    No used.

    > match_num = 0;


    OK, maybe you do this for testing, but a utility function like this
    should not use a file-scope (AKA "global") variable like match_num.
    Of course, you code works if you just make match_num local to this
    function.

    > for( i = 0; s != '\0'; ++i )
    > {
    > for( k = 0, j = i; p[k] != '\0' ; ++k, ++j )
    > {
    > if( s[j] != p[k] )
    > {
    > break;
    > }
    > }
    >
    > if( (k > 0) && p[k] == '\0' )
    > {
    > ++match_num;
    > match_pos = i;
    > }
    > }
    >
    >
    >
    > /* it checks whether we have any match or not. */
    > if( match_num )
    > {
    > return match_pos;
    > }
    > else
    > {
    > return -1;
    > }
    > }


    Looks good to me.

    --
    Ben.
     
    Ben Bacarisse, Apr 3, 2008
    #8
  9. Ben Bacarisse <> writes:
    [...]
    > ... when there is no match, the output is "left hanging". Output that
    > does not end with a newline is guaranteed to appear on all systems.

    [...]

    You mean "is not guaranteed".

    --
    Keith Thompson (The_Other_Keith) <>
    Nokia
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Apr 3, 2008
    #9
  10. arnuld

    Default User Guest

    arnuld wrote:

    > > On Thu, 03 Apr 2008 19:22:29 +0530, Richard Heathfield wrote:

    >
    >
    > > This doesn't do what you think. sizeof saved_pos gives the size, in
    > > bytes, of the array. That's fine if ints are 1 byte in size, which
    > > they can be, but it doesn't seem to be the case on your system.

    >
    > your way/idea of telling the problem is very clear. I never expected
    > that from the author of a book like "C Unleashed". NO, I have not
    > read the book but heard about its toughness and harder ideas.


    Wow! Talk about your backhanded compliments.




    Brian
     
    Default User, Apr 3, 2008
    #10
  11. Keith Thompson <> writes:

    > Ben Bacarisse <> writes:
    > [...]
    >> ... when there is no match, the output is "left hanging". Output that
    >> does not end with a newline is guaranteed to appear on all systems.

    > [...]
    >
    > You mean "is not guaranteed".


    Yes, thanks. No point in my saying I'll proof read more since I've
    said that before and it is obviously not working!

    --
    Ben.
     
    Ben Bacarisse, Apr 3, 2008
    #11
  12. arnuld

    arnuld Guest

    > On Thu, 03 Apr 2008 21:17:30 +0530, Ben Bacarisse wrote:

    >> arnuld <> writes:


    >> if( match_num )
    >> {
    >> printf("Position %d is the last to match\n\n", matched_idx);
    >> }
    >> }


    > ... when there is no match, the output is "left hanging". Output that
    > does not end with a newline is guaranteed to appear on all systems.


    I really don't get it at all. If there is no match then condition will
    fail and hence othing to do.


    > For simple testing, I prefer to read a whole line, storing only those
    > characters that fit but this way is fine. Except...



    I am at chapter 4 where authors did not start discussing the "whole line"
    concept except using a rudimentary function named getline() to do that.


    >> if( c == '\n' )
    >> {
    >> s[i++] = '\n';
    >> }
    >> }
    >> s = '\0';


    > ... I think you have a bug here. The loop above can put max-1 chars
    > into the buffer (max being what it was originally, of course) but it can
    > end when c == '\n'. You then try to put the '\n' and the '\0' in the
    > one remaining space!


    sorry, this function is just a copy of K&R2 example, as I said earlier.
    so, something wrong with K&R2 ?


    >> match_num = 0;


    > OK, maybe you do this for testing, but a utility function like this
    > should not use a file-scope (AKA "global") variable like match_num. Of
    > course, you code works if you just make match_num local to this
    > function.


    I have used match_num in main() function and that is why I made it global.
    If it is local then main() is not going to access it.
     
    arnuld, Apr 4, 2008
    #12
  13. arnuld <> writes:

    >> On Thu, 03 Apr 2008 21:17:30 +0530, Ben Bacarisse wrote:

    >
    >>> arnuld <> writes:

    >
    >>> if( match_num )
    >>> {
    >>> printf("Position %d is the last to match\n\n", matched_idx);
    >>> }
    >>> }

    >
    >> ... when there is no match, the output is "left hanging". Output that
    >> does not end with a newline is guaranteed to appear on all systems.

    >
    > I really don't get it at all. If there is no match then condition will
    > fail and hence othing to do.


    No, you are right. I based my comment on an old version that
    sometimes did this:

    ./arnuld
    a term
    ^D
    1 matches,
    0 is the last to match$

    The $ is my system's prompt. Your new fixed version prints nothing if
    there is no match, but it takes a while to reason that out. You
    should probably remove the test "if (match_num)" since it suggests
    that a partial line can be printed. The test is redundant.

    >> For simple testing, I prefer to read a whole line, storing only those
    >> characters that fit but this way is fine. Except...

    >
    >
    > I am at chapter 4 where authors did not start discussing the "whole line"
    > concept except using a rudimentary function named getline() to do
    > that.


    I was suggesting an alternative, that is all. You do, essentially,
    this:

    while (there-is-room && next-char-is-not-newline)
    store-that-char;

    I often write:

    while (next-char-is-not-newline)
    if (there-is-room)
    store-that-char;

    so that a whole line is always read, even when there is not room for
    it all.

    >>> if( c == '\n' )
    >>> {
    >>> s[i++] = '\n';
    >>> }
    >>> }
    >>> s = '\0';

    >
    >> ... I think you have a bug here. The loop above can put max-1 chars
    >> into the buffer (max being what it was originally, of course) but it can
    >> end when c == '\n'. You then try to put the '\n' and the '\0' in the
    >> one remaining space!

    >
    > sorry, this function is just a copy of K&R2 example, as I said earlier.
    > so, something wrong with K&R2 ?


    I'd love to be the one to find a bug in K&R2 after all those eyes have
    looked at it but, no, K&R is correct. There are two getline functions
    in the book (I have K&R first edition so I won't give you page
    numbers). The first tests against i < lim-1 and the second has a test
    for --lim > 0 *before* reading the next character. Your code is
    slightly different and has the bug I described.

    The moral is that you can't just change

    for (i = 0; --max > 0 && (c = getchar()) != EOF && c != '\n'; ++i)

    into

    for (i = 0; (c = getchar()) != EOF && c != '\n' && --max > 0; ++i)

    with no consequences.

    Interestingly, both of K&R's getline functions exhibit undefined
    behaviour when called with a buffer size of 1. This is daft enough
    not to be a bug as such, but I would have preferred to avoid it in a
    teaching text.

    >>> match_num = 0;

    >
    >> OK, maybe you do this for testing, but a utility function like this
    >> should not use a file-scope (AKA "global") variable like match_num. Of
    >> course, you code works if you just make match_num local to this
    >> function.

    >
    > I have used match_num in main() function and that is why I made it global.
    > If it is local then main() is not going to access it.


    That is fine for now, but global variables are not a good way to
    communicate with general-purpose functions like str_index. When you
    have got further though the book, you'll find ways to get more
    information out of a function that means you don't need global
    variables like that.

    --
    Ben.
     
    Ben Bacarisse, Apr 4, 2008
    #13
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Kevin Spencer
    Replies:
    2
    Views:
    484
    John Saunders
    Aug 6, 2003
  2. lonelyplanet999

    Exercise needed for java 2 programmer test

    lonelyplanet999, Sep 30, 2003, in forum: Java
    Replies:
    1
    Views:
    4,217
    VisionSet
    Sep 30, 2003
  3. Xah Lee
    Replies:
    12
    Views:
    623
    Duncan Booth
    Jun 22, 2005
  4. Bruce .J Sam
    Replies:
    0
    Views:
    1,982
    Bruce .J Sam
    Jun 16, 2005
  5. Aries
    Replies:
    7
    Views:
    423
    Aries
    May 3, 2006
Loading...

Share This Page