opensolaris C library vs. GNU C Library

Discussion in 'C Programming' started by smnoff, Jun 8, 2006.

  1. smnoff

    smnoff Guest

    I have taken a look at the source code for a few string functions like
    memcpy() and strstr() in OpenSolaris and GNU and I find them somewhat
    different.

    The GNU version of strstr() seems like a mess and very cluttered. Not sure
    how fast it is but the OpenSolaris version of strstr() seems
    straightforward.

    Any reason why this is?

    Also any comments, recommendation, stuff a newbie should know between the
    two libraries,
    OpenSolaris and GNU?


    Open Solaris
    http://cvs.opensolaris.org/source/xref/on/usr/src/common/util/string.c

    Free Software Foundation (GNU)
    http://wdiff.progiciels-bpi.ca/showfile.html?name=dist/lib/strstr.c
    smnoff, Jun 8, 2006
    #1
    1. Advertising

  2. smnoff

    tedu Guest

    smnoff wrote:
    > The GNU version of strstr() seems like a mess and very cluttered. Not sure
    > how fast it is but the OpenSolaris version of strstr() seems
    > straightforward.
    >
    > Any reason why this is?


    * I deliberately chose not to comment it. You should have at least
    * as much fun trying to understand it, as I had to write it :).

    i think that pretty much says it all.

    > Also any comments, recommendation, stuff a newbie should know between the
    > two libraries,
    > OpenSolaris and GNU?


    i'd recommend against adopting a similar philosophy when writing your
    own code.
    tedu, Jun 8, 2006
    #2
    1. Advertising

  3. smnoff

    jacob navia Guest

    tedu a écrit :
    > smnoff wrote:
    >
    >>The GNU version of strstr() seems like a mess and very cluttered. Not sure
    >>how fast it is but the OpenSolaris version of strstr() seems
    >>straightforward.
    >>
    >>Any reason why this is?

    >
    >
    > * I deliberately chose not to comment it. You should have at least
    > * as much fun trying to understand it, as I had to write it :).
    >
    > i think that pretty much says it all.



    Yes. You are right. Most of the GNU code is like that.

    Messy, and without a single comment, beides the license
    stuff.

    Open source means nothing at all since to understand
    the stuff you have to spend more time than if you start
    from scratch.

    I wrote this:

    extern int _stdcall GetTickCount(void);

    int main(void)
    {
    char buf[65535];
    char *p="Search";
    int t,i;

    memset(buf,'A',sizeof(buf)-1);
    buf[sizeof(buf)-1] = 0;
    strcpy(buf+sizeof(buf)-sizeof("Search")-1,p);
    t = GetTickCount() ;
    for (i=0; i<10000;i++) {
    GNUstrstr(buf,"Search");
    }
    t = GetTickCount() - t;
    printf("GNU=%d ms\n",t);
    t = GetTickCount() ;
    for (i=0; i<10000;i++) {
    Sunstrstr(buf,"Seach");
    }
    t = GetTickCount() - t;
    printf("Sun = %d ms\n",t);
    }

    The output is:
    GNU=1313 ms
    Sun = 1640 ms

    If we change the strcpy line to more realistic test
    conditions:
    memcpy(buf+1000,p,6);

    results change to:

    GNU=16 ms
    Sun = 31 ms

    This means that both algorithms converge with large strings, what is not
    surprising since input/output from/to main memory becomes the dominant
    factor of the computation, and both algorithms are then the same.

    With shorter strings GNU is twice as fast, but since the strings are
    shorter the absolute difference is less (only 15ms after 10 thousand
    runs!) so it is probably not worth the effort of understanding
    the GNU version, full of goto's and a very complex control flow.

    Surely I would not love being the maintainer of that stuff.

    The size of the GNU function is 260 bytes, the Sun version is
    only 110 bytes.

    The GNU version would be important when strstr makes more than 90%
    of the run time of your program. For all other situations Sun's version
    is the same

    jacob
    jacob navia, Jun 8, 2006
    #3
  4. smnoff

    jacob navia Guest

    jacob navia a écrit :
    >
    > I wrote this:
    >
    > extern int _stdcall GetTickCount(void);
    >
    > int main(void)
    > {
    > char buf[65535];
    > char *p="Search";
    > int t,i;
    >
    > memset(buf,'A',sizeof(buf)-1);
    > buf[sizeof(buf)-1] = 0;

    // Prepare a string of 64K having the searched
    // for string at the end of the string, i.e.
    // almost at the end of the 64K
    > strcpy(buf+sizeof(buf)-sizeof("Search")-1,p);
    > t = GetTickCount() ;
    > for (i=0; i<10000;i++) {
    > GNUstrstr(buf,"Search");
    > }
    > t = GetTickCount() - t;
    > printf("GNU=%d ms\n",t);
    > t = GetTickCount() ;
    > for (i=0; i<10000;i++) {
    > Sunstrstr(buf,"Seach");
    > }
    > t = GetTickCount() - t;
    > printf("Sun = %d ms\n",t);
    > }



    Excuse me. I was complaining about lack of comments in GNU stuff
    and wrote a piece of code without any comment !!!!!

    I hope is clearer now.

    jacob
    jacob navia, Jun 8, 2006
    #4
  5. On Fri, 09 Jun 2006 00:16:34 +0200, jacob navia wrote:
    > The GNU version would be important when strstr makes more than 90% of the
    > run time of your program. For all other situations Sun's version is the
    > same
    >


    Except, with a poor library every time you re-implement and optimize one
    function, another one will rise to the top. I'd rather a well honed
    library where I didn't have to worry about such things and could focus on
    my own code. So, if this difference were isolated you have a point, if not
    shame on Sun.

    Practically speaking, you want simple code and simple algorithms on the
    edges of your software, where you spend the vast majority of your time
    changing behavior.
    William Ahern, Jun 9, 2006
    #5
  6. On Fri, 09 Jun 2006 00:16:34 +0200, jacob navia
    <> wrote:
    <snip: compare GNU vs Solaris ststr>
    > char buf[65535];
    > char *p="Search";
    > int t,i;
    >
    > memset(buf,'A',sizeof(buf)-1);
    > buf[sizeof(buf)-1] = 0;
    > strcpy(buf+sizeof(buf)-sizeof("Search")-1,p);


    If you made it char p [] = "Search" could use sizeof (p).

    > t = GetTickCount() ;
    > for (i=0; i<10000;i++) {
    > GNUstrstr(buf,"Search");
    > }
    > t = GetTickCount() - t;
    > printf("GNU=%d ms\n",t);
    > t = GetTickCount() ;
    > for (i=0; i<10000;i++) {
    > Sunstrstr(buf,"Seach");
    > }
    > t = GetTickCount() - t;
    > printf("Sun = %d ms\n",t);


    I hope you actually searched for Search not Seach in the Solaris case
    to be (completely) fair. Or to avoid such typos, use p in both cases.


    - David.Thompson1 at worldnet.att.net
    Dave Thompson, Jun 19, 2006
    #6
  7. smnoff

    Guest

    jacob navia wrote:
    > This means that both algorithms converge with large strings, what is not
    > surprising since input/output from/to main memory becomes the dominant
    > factor of the computation, and both algorithms are then the same.


    This will probably not be true until you start swapping to disk.
    Memory is still O(n), but is mitigated by bandwidth capabilities of
    your hardware while the O(n) algorithm of the strstr() itself is still
    dependent on the efficiency of the algorithm.

    > With shorter strings GNU is twice as fast, but since the strings are
    > shorter the absolute difference is less (only 15ms after 10 thousand
    > runs!) so it is probably not worth the effort of understanding
    > the GNU version, full of goto's and a very complex control flow.


    Because you're going to get so much more understanding out of "you
    can't shift a negative number on some platforms". It *IS* worth a
    minute or two of study, because the author didn't comment it, yet it
    achieves better performance. If you figure it out, you may be able to
    achieve that performance without the non-maintainability penality. You
    instead prefer to stay in the dark for lack of wanting to look at a 100
    lines of code?

    > Surely I would not love being the maintainer of that stuff.


    In the case of strstr(), or other well defined library functions, its
    actually quite easy to maintain it, regardless of how convoluted it is.
    Just write tests for it.

    > The size of the GNU function is 260 bytes, the Sun version is
    > only 110 bytes.
    >
    > The GNU version would be important when strstr makes more than 90%
    > of the run time of your program. For all other situations Sun's version
    > is the same


    Yes, but *understanding* the code (for its performance) is the most
    important alternative of all. Doing so allows you do this:

    char * QEDstrstr (const char * d0, const char * d1) {
    int i, j;
    char c0, c1;

    /* peel off case: strstr (*, "") */
    if ('\0' == (c0 = d1[0])) return (char *) d0;

    /* peel off case: strstr (*, one-character-string) */
    if ('\0' == (c1 = d1[1])) {
    char * d = (char *) d0;
    for (; c0 != *d; d++) {
    if ('\0' == *d) return NULL;
    }
    return d;
    }

    for (i = 0;; i++) {

    /* Unrolled (once) first character test */
    loop0:;
    if (c0 != d0) {
    if ('\0' == d0) return NULL;
    if (c0 != d0[i+1]) {
    if ('\0' == d0[i+1]) return NULL;
    i+=2;
    goto loop0;
    }
    i++;
    }

    /* Second character test */
    if (c1 != d0[i+1]) continue;

    /* Check beyond first two characters */
    for (j = 2; d1[j] == d0[i+j]; j++) {
    if ('\0' == d1[j]) return (char *) (d0 + i);
    }
    if ('\0' == d1[j]) return (char *) (d0 + i);
    }
    }

    This achieves very close to the GNUstrstr performance (from my own
    testing, on multiple compilers), while being somewhat more readable.
    The trick, of course, was to unroll (the core reason why the GNU
    version is faster), and peel off one case (which is just a standard
    thing I do, as necessary). Its commented, and therefore even more
    maintainable than the Sun version.

    Now, would you say, this sort of optimization was not worth it, because
    I cannot possibly optimize for multiple platforms/compilers at once
    (even though I did just that)? That the micro-technique (unrolling) is
    not worth it because you're hoping that your compiler will do it for
    you (only one I tried succeeded in doing that)?

    --
    Paul Hsieh
    http://www.pobox.com/~qed/
    http://bstring.sf.net/
    , Jun 19, 2006
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Peter

    1 day gnu, whole life gnu?

    Peter, Jan 10, 2005, in forum: Java
    Replies:
    3
    Views:
    326
    John C. Bollinger
    Jan 10, 2005
  2. Peter
    Replies:
    17
    Views:
    594
    Chris Smith
    Jan 13, 2005
  3. Servertec
    Replies:
    0
    Views:
    360
    Servertec
    Sep 5, 2005
  4. Markus Elfring
    Replies:
    2
    Views:
    358
    Markus Elfring
    Feb 23, 2005
  5. Eric Wong
    Replies:
    0
    Views:
    462
    Eric Wong
    Jun 14, 2011
Loading...

Share This Page