automated test for sub-string search algorithms...

Discussion in 'C Programming' started by Chris M. Thomasson, Feb 20, 2010.

  1. This code includes my sub-string search algorithm, the automated test, and a
    stress test for my algorithm:


    http://clc.pastebin.com/f5cb3bbd2


    This testing framework helped me find several bugs in my algorithm. I have
    fixed them and so far, so good... I will continue to blast the shi% out of
    my algorithm with this test and report any errors here.


    The automated test randomly generates a comparand string, then it builds a
    source string and inserts the comparand at random positions within it. It
    records all the offsets, and the count of matches. Now, you can use this
    information to verify that your search algorithm is getting everything
    correct. Please refer to the `strstr_test_create()' function which generates
    the random data. Then refer to the `strstr_test_xstrstr()' function which
    actually verifies that my algorithm is getting things right.


    AFAICT, it's a fairly decent way to blast your sub-string search algorithm
    with tremendous forms of random data.


    Enjoy!




    BTW, if you have any improvements in mind, please share all of them.

    ;^)
     
    Chris M. Thomasson, Feb 20, 2010
    #1
    1. Advertising

  2. "Chris M. Thomasson" <> wrote in message
    news:BaZfn.2632$...
    > This code includes my sub-string search algorithm, the automated test, and
    > a stress test for my algorithm:
    >
    >
    > http://clc.pastebin.com/f5cb3bbd2

    [...]

    There is a little bug on `line 68':


    printf("source(%u): %s\n",
    (unsigned long)src_size,
    src);


    The first formatter is not compatible with the damn argument! Here, let me
    fix that:


    http://clc.pastebin.com/f32655568


    Sorry about that crap. Anyway, I changed a couple of things. One, I funnel
    all debug messages through a function macro `DBG_PRINTF()' which
    automatically disables output if `NDEBUG' is defined. Two, I made a little
    tweak to the search algorithm itself. It now searches from left-to-right
    after it determines that a rightmost character matches the rightmost
    character from the comparand.


    IMHO, this simple little test program comes in handy when you are trying to
    debug a sub-string search algorithm. I can play around with all sorts of
    improvements and see exactly where each one of them fail. One thing I am
    exploring is reading 32/64-bits worth of characters at a time. This allows
    me reduce the total number of comparisons...
     
    Chris M. Thomasson, Feb 21, 2010
    #2
    1. Advertising

  3. "Chris M. Thomasson" <> wrote in message
    news:BaZfn.2632$...
    > This code includes my sub-string search algorithm, the automated test, and
    > a stress test for my algorithm:
    >
    >
    > http://clc.pastebin.com/f5cb3bbd2
    >
    >
    > This testing framework helped me find several bugs in my algorithm. I have
    > fixed them and so far, so good... I will continue to blast the shi% out of
    > my algorithm with this test and report any errors here.


    Here is latest version of my sub-string search algorithm:


    http://clc.pastebin.com/f5c1d3e5a


    This one allows search to skip many more characters than before. I don't
    think its like Boyer-Moore because I am not using two tables. Humm...


    Any thoughts?
     
    Chris M. Thomasson, Feb 22, 2010
    #3
  4. "Chris M. Thomasson" <> wrote in message
    news:uLugn.5580$...
    > "Chris M. Thomasson" <> wrote in message
    > news:BaZfn.2632$...
    >> This code includes my sub-string search algorithm, the automated test,
    >> and a stress test for my algorithm:
    >>
    >>
    >> http://clc.pastebin.com/f5cb3bbd2
    >>
    >>
    >> This testing framework helped me find several bugs in my algorithm. I
    >> have fixed them and so far, so good... I will continue to blast the shi%
    >> out of my algorithm with this test and report any errors here.

    >
    > Here is latest version of my sub-string search algorithm:
    >
    >
    > http://clc.pastebin.com/f5c1d3e5a
    >
    >
    > This one allows search to skip many more characters than before. I don't
    > think its like Boyer-Moore because I am not using two tables. Humm...
    >
    >
    > Any thoughts?


    I would say it's close to the following algorithm:


    http://en.wikipedia.org/wiki/Boyer-Moore-Horspool_algorithm


    I think I am initializing the hash table a little bit differently. For
    instance, my bad character hash hit is zero, while the Horspool algorithm
    would have the length of the comparand stored in the table.
     
    Chris M. Thomasson, Feb 23, 2010
    #4
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Eduardo
    Replies:
    0
    Views:
    420
    Eduardo
    Nov 7, 2005
  2. Ravi Shankar Nair

    Automated Test Data for XML files

    Ravi Shankar Nair, Nov 16, 2005, in forum: Java
    Replies:
    2
    Views:
    696
    Ravi Shankar Nair
    Nov 17, 2005
  3. Ben
    Replies:
    2
    Views:
    937
  4. Chris M. Thomasson

    automated sub-string search/replace algorithm...

    Chris M. Thomasson, Feb 28, 2010, in forum: C Programming
    Replies:
    40
    Views:
    1,199
    Tim Rentsch
    Mar 26, 2010
  5. Lawrence D'Oliveiro

    Death To Sub-Sub-Sub-Directories!

    Lawrence D'Oliveiro, May 5, 2011, in forum: Java
    Replies:
    92
    Views:
    2,128
    Lawrence D'Oliveiro
    May 20, 2011
Loading...

Share This Page