Evaluation of C program

Discussion in 'C Programming' started by morphex@gmail.com, Feb 16, 2006.

  1. Guest

    Hi,

    I'm a python programmer that's started to play a bit with C as I'll
    probably have to make C extensions eventually.. I made this little
    program that I'd like to get feedback on, it's basically a find
    substring and return pointer to it function and tests for it..

    Could this be done better/differently? Is anything fundamentally
    wrong?

    ---

    #include <stdio.h>

    char *find_substring(char *substring, char *string) {
    int index_string, index_substring;
    for (index_string = 0; string[index_string] != 0; index_string++) {
    index_substring = 0;
    do {
    if (substring[index_substring] == 0)
    goto success;
    if (string[index_string + index_substring] !=
    substring[index_substring])
    goto next;
    } while (++index_substring);
    success:
    return &string[index_string];
    next: ;
    }
    return 0;
    }

    int main() {
    if(find_substring("test", "this is a test"))
    printf("Found test in string!\n");

    return 0;
    }

    ---

    Thanks,

    Morten
    , Feb 16, 2006
    #1
    1. Advertising

  2. In article <>,
    <> wrote:
    >I'm a python programmer that's started to play a bit with C as I'll
    >probably have to make C extensions eventually.. I made this little
    >program that I'd like to get feedback on, it's basically a find
    >substring and return pointer to it function and tests for it..


    >Could this be done better/differently? Is anything fundamentally
    >wrong?


    >#include <stdio.h>


    Your entire find_substring() function could be replaced with a
    single call to strstr();

    >char *find_substring(char *substring, char *string) {
    > int index_string, index_substring;
    > for (index_string = 0; string[index_string] != 0; index_string++) {


    strings can be longer than an 'int' can hold. Check out size_t

    > index_substring = 0;
    > do {
    > if (substring[index_substring] == 0)
    > goto success;


    goto should rarely be used. Consider using "break" here.

    > if (string[index_string + index_substring] !=
    >substring[index_substring])
    > goto next;


    goto should rarely be used. Consider using "continue" here.

    > } while (++index_substring);


    The only way that ++index_substring could be false in that test is
    if the int representation overflowed. This suggests that you are using
    the wrong control structure; consider using a for().

    > success:
    > return &string[index_string];
    > next: ;
    > }
    > return 0;


    Returning 0 is not wrong here, but it would be more readable
    if you were to use NULL instead of 0.

    >}


    >
    >int main() {


    int main(void) {

    > if(find_substring("test", "this is a test"))
    > printf("Found test in string!\n");


    If you do not find the substring, then you print nothing, and
    won't know what happened.
    >
    > return 0;


    Returning 0 indicates success, but it could be argued that if you
    do not find the substring then your program has failed and so
    returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
    in that instance.

    >}



    >Could this be done better/differently?


    Suppose you don't find a match the first iteration through
    because the 10th character in the trial substring does not match
    the 10th character in the input string. You then loop back
    and start testing from the substring again from the second
    character of the input string -- ignoring all the information
    you could have gleened by paying attention to what was in
    offsets 1 thru 9 that you have already looked at.

    Suppose for example that what you found at the 10th character at the
    input string was an 'X', and there are no occurances of 'X' in the
    trial substring. A moment's reflection would show you that in
    such an instance you would be able to skip testing for the
    substring as starting from 1 thru 9, since that 'X' will still be
    there at offset 9 blocking any possible match.

    What this tells you is that there are algorithms which can be
    much faster than your search algorithm. Indeed, there is an entire
    literature on search algorithms. Some are probably mentioned in the C
    FAQ or by googling for "string search algorithms".
    --
    If you lie to the compiler, it will get its revenge. -- Henry Spencer
    Walter Roberson, Feb 16, 2006
    #2
    1. Advertising

  3. >> #include <stdio.h>
    >
    > Your entire find_substring() function could be replaced with a
    > single call to strstr();


    Yep. This was an exercise so I didn't look, but good to know.

    >> char *find_substring(char *substring, char *string) {
    >> int index_string, index_substring;
    >> for (index_string = 0; string[index_string] != 0; index_string++) {

    >
    > strings can be longer than an 'int' can hold. Check out size_t


    Good point.

    >> index_substring = 0;
    >> do {
    >> if (substring[index_substring] == 0)
    >> goto success;

    >
    > goto should rarely be used. Consider using "break" here.
    >
    >> if (string[index_string + index_substring] !=
    >> substring[index_substring])
    >> goto next;

    >
    > goto should rarely be used. Consider using "continue" here.


    I don't see the harm in using goto.. it does the job. :)

    >> } while (++index_substring);

    >
    > The only way that ++index_substring could be false in that test is
    > if the int representation overflowed. This suggests that you are using
    > the wrong control structure; consider using a for().


    Well, it will always goto using one of the two statements above it, so
    it works. But it should be a long, yes.

    >> success:
    >> return &string[index_string];
    >> next: ;
    >> }
    >> return 0;

    >
    > Returning 0 is not wrong here, but it would be more readable
    > if you were to use NULL instead of 0.


    Yep.

    >> }

    >
    >> int main() {

    >
    > int main(void) {
    >
    >> if(find_substring("test", "this is a test"))
    >> printf("Found test in string!\n");

    >
    > If you do not find the substring, then you print nothing, and
    > won't know what happened.


    Yeah..

    >> return 0;

    >
    > Returning 0 indicates success, but it could be argued that if you
    > do not find the substring then your program has failed and so
    > returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
    > in that instance.
    >
    >> }

    >
    >
    >> Could this be done better/differently?

    >
    > Suppose you don't find a match the first iteration through
    > because the 10th character in the trial substring does not match
    > the 10th character in the input string. You then loop back
    > and start testing from the substring again from the second
    > character of the input string -- ignoring all the information
    > you could have gleened by paying attention to what was in
    > offsets 1 thru 9 that you have already looked at.
    >
    > Suppose for example that what you found at the 10th character at the
    > input string was an 'X', and there are no occurances of 'X' in the
    > trial substring. A moment's reflection would show you that in
    > such an instance you would be able to skip testing for the
    > substring as starting from 1 thru 9, since that 'X' will still be
    > there at offset 9 blocking any possible match.
    >
    > What this tells you is that there are algorithms which can be
    > much faster than your search algorithm. Indeed, there is an entire
    > literature on search algorithms. Some are probably mentioned in the C
    > FAQ or by googling for "string search algorithms".


    OK. This was a bit beyond what I was looking to figure out, but good to
    know I guess.

    Thanks. :)

    -Morten
    Morten W. Petersen, Feb 16, 2006
    #3
  4. "Walter Roberson" <-cnrc.gc.ca> wrote in message
    news:dt2hs2$hbg$...
    > In article <>,
    > <> wrote:
    >>I'm a python programmer that's started to play a bit with C as I'll
    >>probably have to make C extensions eventually.. I made this little
    >>program that I'd like to get feedback on, it's basically a find
    >>substring and return pointer to it function and tests for it..

    >
    >>Could this be done better/differently? Is anything fundamentally
    >>wrong?

    >
    >>#include <stdio.h>

    >
    > Your entire find_substring() function could be replaced with a
    > single call to strstr();
    >
    >>char *find_substring(char *substring, char *string) {
    >> int index_string, index_substring;
    >> for (index_string = 0; string[index_string] != 0; index_string++) {

    >
    > strings can be longer than an 'int' can hold. Check out size_t
    >
    >> index_substring = 0;
    >> do {
    >> if (substring[index_substring] == 0)
    >> goto success;

    >
    > goto should rarely be used. Consider using "break" here.
    >
    >> if (string[index_string + index_substring] !=
    >>substring[index_substring])
    >> goto next;

    >
    > goto should rarely be used. Consider using "continue" here.
    >
    >> } while (++index_substring);

    >
    > The only way that ++index_substring could be false in that test is
    > if the int representation overflowed. This suggests that you are using
    > the wrong control structure; consider using a for().
    >
    >> success:
    >> return &string[index_string];
    >> next: ;
    >> }
    >> return 0;

    >
    > Returning 0 is not wrong here, but it would be more readable
    > if you were to use NULL instead of 0.
    >
    >>}

    >
    >>
    >>int main() {

    >
    > int main(void) {
    >
    >> if(find_substring("test", "this is a test"))
    >> printf("Found test in string!\n");

    >
    > If you do not find the substring, then you print nothing, and
    > won't know what happened.
    >>
    >> return 0;

    >
    > Returning 0 indicates success, but it could be argued that if you
    > do not find the substring then your program has failed and so
    > returning EXIT_FAILURE (from <stdlib.h>) would be more appropriate
    > in that instance.
    >
    >>}

    >
    >
    >>Could this be done better/differently?

    >
    > Suppose you don't find a match the first iteration through
    > because the 10th character in the trial substring does not match
    > the 10th character in the input string. You then loop back
    > and start testing from the substring again from the second
    > character of the input string -- ignoring all the information
    > you could have gleened by paying attention to what was in
    > offsets 1 thru 9 that you have already looked at.
    >
    > Suppose for example that what you found at the 10th character at the
    > input string was an 'X', and there are no occurances of 'X' in the
    > trial substring. A moment's reflection would show you that in
    > such an instance you would be able to skip testing for the
    > substring as starting from 1 thru 9, since that 'X' will still be
    > there at offset 9 blocking any possible match.
    >

    Not True!

    Suppose string = "XXXXXXXXXXY"
    and substring = "XXXXXXXXXY"

    They do not match beginning from index 0, since 'X' != 'Y'
    But you cannot skip to index 9, since they DO match statring at index 1.

    > What this tells you is that there are algorithms which can be
    > much faster than your search algorithm. Indeed, there is an entire
    > literature on search algorithms. Some are probably mentioned in the C
    > FAQ or by googling for "string search algorithms".
    > --
    > If you lie to the compiler, it will get its revenge. -- Henry Spencer


    --
    Fred L. Kleinschmidt
    Boeing Associate Technical Fellow
    Technical Architect, Software Reuse Project
    Fred Kleinschmidt, Feb 16, 2006
    #4
  5. In article <>,
    Fred Kleinschmidt <> wrote:

    >"Walter Roberson" <-cnrc.gc.ca> wrote in message
    >news:dt2hs2$hbg$...


    >> Suppose for example that what you found at the 10th character at the
    >> input string was an 'X', and there are no occurances of 'X' in the
    >> trial substring. A moment's reflection would show you that in
    >> such an instance you would be able to skip testing for the
    >> substring as starting from 1 thru 9, since that 'X' will still be
    >> there at offset 9 blocking any possible match.


    > Not True!


    >Suppose string = "XXXXXXXXXXY"
    >and substring = "XXXXXXXXXY"


    >They do not match beginning from index 0, since 'X' != 'Y'
    >But you cannot skip to index 9, since they DO match statring at index 1.


    But that violates the proposition "and there are no occurances
    of 'X' in the trial substring".
    --
    I was very young in those days, but I was also rather dim.
    -- Christopher Priest
    Walter Roberson, Feb 17, 2006
    #5
  6. wrote:
    > I'm a python programmer that's started to play a bit with C as I'll
    > probably have to make C extensions eventually.. I made this little
    > program that I'd like to get feedback on, it's basically a find
    > substring and return pointer to it function and tests for it..
    >
    > Could this be done better/differently? Is anything fundamentally
    > wrong?
    >
    > ---
    >
    > #include <stdio.h>
    >
    > char *find_substring(char *substring, char *string) {
    > int index_string, index_substring;
    > for (index_string = 0; string[index_string] != 0; index_string++) {
    > index_substring = 0;
    > do {
    > if (substring[index_substring] == 0)
    > goto success;
    > if (string[index_string + index_substring] !=
    > substring[index_substring])
    > goto next;
    > } while (++index_substring);
    > success:
    > return &string[index_string];
    > next: ;
    > }
    > return 0;
    > }
    >
    > int main() {
    > if(find_substring("test", "this is a test"))
    > printf("Found test in string!\n");
    >
    > return 0;
    > }


    Here is how I would do it.


    --- Source Text ---

    #include <stdio.h>
    #include <string.h>


    /* Returns the starting index of pattern in s or -1 if not found. */

    int position(const char *pattern, const char *s)
    {
    int j, k, plen, slen, res;

    plen = strlen(pattern);
    slen = strlen(s);
    res = -1;
    j = 0;
    while ((res < 0) && (j + plen < slen)) {
    k = 0;
    while ((k < plen) && (pattern[k] == s[j + k])) { k++; }
    if (k == plen) { res = j; }
    j++;
    }
    return res;
    }


    int main(void)
    {
    char s[] = "Hello there!";
    char pattern[] = "there";
    int pos;

    pos = position(pattern, s);
    if (pos < 0) {
    printf("\"%s\" does not contain \"%s\".\n", s, pattern);
    } else {
    printf("\"%s\" contains \"%s\" starting at index %d.\n",
    s, pattern, pos);
    }
    return 0;
    }

    --- End Of Source Text ---


    August

    --
    I am the "ILOVEGNU" signature virus. Just copy me to your
    signature. This email was infected under the terms of the GNU
    General Public License.
    August Karlstrom, Feb 17, 2006
    #6
  7. > Here is how I would do it.
    >
    >
    > --- Source Text ---
    >
    > #include <stdio.h>
    > #include <string.h>
    >
    >
    > /* Returns the starting index of pattern in s or -1 if not found. */
    >
    > int position(const char *pattern, const char *s)
    > {
    > int j, k, plen, slen, res;
    >
    > plen = strlen(pattern);
    > slen = strlen(s);
    > res = -1;
    > j = 0;
    > while ((res < 0) && (j + plen < slen)) {
    > k = 0;
    > while ((k < plen) && (pattern[k] == s[j + k])) { k++; }
    > if (k == plen) { res = j; }
    > j++;
    > }
    > return res;
    > }


    This was a nice example. Thanks for posting it. :)

    -Morten
    Morten W. Petersen, Feb 17, 2006
    #7
  8. Guest

    Walter Roberson wrote:
    > In article <>,
    > <> wrote:
    > >I'm a python programmer that's started to play a bit with C as I'll
    > >probably have to make C extensions eventually.. I made this little
    > >program that I'd like to get feedback on, it's basically a find
    > >substring and return pointer to it function and tests for it..

    >
    > >Could this be done better/differently? Is anything fundamentally
    > >wrong?


    > Suppose you don't find a match the first iteration through
    > because the 10th character in the trial substring does not match
    > the 10th character in the input string. You then loop back
    > and start testing from the substring again from the second
    > character of the input string -- ignoring all the information
    > you could have gleened by paying attention to what was in
    > offsets 1 thru 9 that you have already looked at.
    >
    > Suppose for example that what you found at the 10th character at the
    > input string was an 'X', and there are no occurances of 'X' in the
    > trial substring. A moment's reflection would show you that in
    > such an instance you would be able to skip testing for the
    > substring as starting from 1 thru 9, since that 'X' will still be
    > there at offset 9 blocking any possible match.
    >
    > What this tells you is that there are algorithms which can be
    > much faster than your search algorithm. Indeed, there is an entire
    > literature on search algorithms. Some are probably mentioned in the C
    > FAQ or by googling for "string search algorithms".


    Boyer Horspool Moore comes to mind. I implemented it with
    a circular buffer; it's fast.

    Stijn
    , Feb 17, 2006
    #8
  9. CBFalconer Guest

    wrote:
    > Walter Roberson wrote:
    >> <> wrote:

    >
    >>> I'm a python programmer that's started to play a bit with C as
    >>> I'll probably have to make C extensions eventually.. I made
    >>> this little program that I'd like to get feedback on, it's
    >>> basically a find substring and return pointer to it function
    >>> and tests for it..

    >>
    >>> Could this be done better/differently? Is anything fundamentally
    >>> wrong?

    >
    >> Suppose you don't find a match the first iteration through
    >> because the 10th character in the trial substring does not match
    >> the 10th character in the input string. You then loop back
    >> and start testing from the substring again from the second
    >> character of the input string -- ignoring all the information
    >> you could have gleened by paying attention to what was in
    >> offsets 1 thru 9 that you have already looked at.
    >>
    >> Suppose for example that what you found at the 10th character at the
    >> input string was an 'X', and there are no occurances of 'X' in the
    >> trial substring. A moment's reflection would show you that in
    >> such an instance you would be able to skip testing for the
    >> substring as starting from 1 thru 9, since that 'X' will still be
    >> there at offset 9 blocking any possible match.
    >>
    >> What this tells you is that there are algorithms which can be
    >> much faster than your search algorithm. Indeed, there is an entire
    >> literature on search algorithms. Some are probably mentioned in the C
    >> FAQ or by googling for "string search algorithms".

    >
    > Boyer Horspool Moore comes to mind. I implemented it with
    > a circular buffer; it's fast.


    Knuth-Morris-Pratt has the distinct advantage of allowing operation
    on streams and totally avoiding any need for look ahead or look
    back. The following is from a post I made here almost two years
    ago, and illustrates the use of KMP on a file stream.

    /*
    Leor Zolman wrote:
    > On 25 Feb 2004 07:34:40 -0800, (spike) wrote:
    >
    >> Im trying to write a program that should read through a binary
    >> file searching for the character sequence "\name\"
    >>
    >> Then it should read the characters following the "\name\"
    >> sequence until a NULL character is encountered.
    >>
    >> But when my program runs it gets a SIGSEGV (Segmentation
    >> vioalation) signal.
    >>
    >> Whats wrong? And is there a better way than mine to solve
    >>this task (most likely)

    >
    > I think so. Here's a version I just threw together:

    */

    /* And heres another throw -- binfsrch.c by CBF */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <ctype.h>
    #include <assert.h>

    /* The difference between a binary and a text file, on read,
    is the conversion of end-of-line delimiters. What those
    delimiters are does not affect the action. In some cases
    the presence of 0x1a EOF markers (MsDos) does.

    This is a version of Knuth-Morris-Pratt algorithm. The
    point of using this is to avoid any backtracking in file
    reading, and thus avoiding any use of buffer arrays.
    */

    size_t chrcount; /* debuggery, count of input chars, zeroed */

    /* --------------------- */

    /* Almost straight out of Sedgewick */
    /* The next array indicates what index in id should next be
    compared to the current char. Once the (lgh - 1)th char
    has been successfully compared, the id has been found.
    The array is formed by comparing id to itself. */
    void initnext(int *next, const char *id, int lgh)
    {
    int i, j;

    assert(lgh > 0);
    next[0] = -1; i = 0; j = -1;
    while (i < lgh) {
    while ((j >= 0) && (id != id[j])) j = next[j];
    i++; j++;
    next = j;
    }
    #if (0)
    for (i = 0; i < lgh; i++)
    printf("id[%d] = '%c' next[%d] = %d\n",
    i, id, i, next);
    #endif
    } /* initnext */

    /* --------------------- */

    /* reads f without rewinding until either EOF or *marker
    has been found. Returns EOF if not found. At exit the
    last matching char has been read, and no further. */
    int kmpffind(const char *marker, int lgh, int *next, FILE *f)
    {
    int j; /* char position in marker to check */
    int ch; /* current char */

    assert(lgh > 0);
    j = 0;
    while ((j < lgh) && (EOF != (ch = getc(f)))) {
    chrcount++;
    while ((j >= 0) && (ch != marker[j])) j = next[j];
    j++;
    }
    return ch;
    } /* kmpffind */

    /* --------------------- */

    /* Find marker in f, display following printing chars
    up to some non printing character or EOF */
    int binfsrch(const char *marker, FILE *f)
    {
    int *next;
    int lgh;
    int ch;
    int items; /* count of markers found */

    lgh = strlen(marker);
    if (!(next = malloc(lgh * sizeof *next))) {
    puts("No memory");
    exit(EXIT_FAILURE);
    }
    else {
    initnext(next, marker, lgh);
    items = 0;
    while (EOF != kmpffind(marker, lgh, next, f)) {
    /* found, take appropriate action */
    items++;
    printf("%d %s : \"", items, marker);
    while (isprint(ch = getc(f))) {
    chrcount++;
    putchar(ch);
    }
    puts("\"");
    if (EOF == ch) break;
    else chrcount++;
    }
    free(next);
    return items;
    }
    } /* binfsrch */

    /* --------------------- */

    int main(int argc, char **argv)
    {
    FILE *f;

    f = stdin;
    if (3 == argc) {
    if (!(f = fopen(argv[2], "rb"))) {
    printf("Can't open %s\n", argv[2]);
    exit(EXIT_FAILURE);
    }
    argc--;
    }
    if (2 != argc) {
    puts("Usage: binfsrch name [binaryfile]");
    puts(" (file defaults to stdin text mode)");
    }
    else if (binfsrch(argv[1], f)) {
    printf("\"%s\" : found\n", argv[1]);
    }
    else printf("\"%s\" : not found\n", argv[1]);
    printf("%lu chars\n", (unsigned long)chrcount);
    return 0;
    } /* main binfsrch */

    --
    "If you want to post a followup via groups.google.com, don't use
    the broken "Reply" link at the bottom of the article. Click on
    "show options" at the top of the article, then click on the
    "Reply" at the bottom of the article headers." - Keith Thompson
    More details at: <http://cfaj.freeshell.org/google/>
    Also see <http://www.safalra.com/special/googlegroupsreply/>
    CBFalconer, Feb 17, 2006
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Yoav Brav

    Expression Evaluation DotNet

    Yoav Brav, Jul 25, 2003, in forum: ASP .Net
    Replies:
    1
    Views:
    488
    Daniel Bass
    Jul 25, 2003
  2. Pieter Provoost

    first program evaluation

    Pieter Provoost, May 31, 2004, in forum: C++
    Replies:
    30
    Views:
    852
    Daniel T.
    Jun 5, 2004
  3. Ilias Lazaridis
    Replies:
    2
    Views:
    381
    Ilias Lazaridis
    Apr 24, 2005
  4. Ilias Lazaridis
    Replies:
    74
    Views:
    722
    Ilias Lazaridis
    Apr 4, 2005
  5. Ilias Lazaridis
    Replies:
    18
    Views:
    321
    Bill Guindon
    Apr 9, 2005
Loading...

Share This Page