strcmp() vs. std::string::operator==

Discussion in 'C++' started by jl_post@hotmail.com, Oct 4, 2005.

  1. Guest

    Hi,

    I recently wrote two benchmark programs that compared if two strings
    were equal: one was a C program that used C char arrays with strcmp(),
    and the other was a C++ program that used std::strings with
    operator==().

    In both programs, the first string consisted of one million
    characters (all the letter 'a'). The second string was always one
    character longer than the first string (with the letter 'a' for all the
    characters).

    In the first program (which was written in C), the comparison was
    done like this:

    int same = strcmp(string1, string2) ? 0 : 1;

    In the second program (which was written in C++), the comparison was
    done like this:

    bool same = (string1 == string2);

    These comparisons were performed in a loop that looped a large number
    of times. Then the programs were timed to see which ran faster.
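
    (The benchmark source itself was not posted. What follows is only a
    minimal sketch of the kind of timing loop described above. It assumes
    a C++11 compiler for <chrono>, and the function names and parameters
    are hypothetical.)

```cpp
#include <chrono>
#include <cstring>
#include <string>

// Hypothetical helper: runs `iterations` strcmp() equality tests and returns
// the elapsed wall-clock seconds. `equal_count` receives how many tests
// reported equality, so the result of each comparison is actually used.
double time_strcmp(const std::string& a, const std::string& b,
                   int iterations, int& equal_count)
{
    const char* pa = a.c_str();
    const char* pb = b.c_str();
    equal_count = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        if (std::strcmp(pa, pb) == 0) {
            ++equal_count;
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Hypothetical helper: same loop, but using std::string's operator==.
double time_string_eq(const std::string& a, const std::string& b,
                      int iterations, int& equal_count)
{
    equal_count = 0;
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        if (a == b) {
            ++equal_count;
        }
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

    One caveat with a loop like this: an optimizing compiler may notice
    that the operands never change and hoist the comparison out of the
    loop, so the timings are only meaningful if that is ruled out.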

    I expected the C++ code to run much, much faster than the C code,
    since I would think that the std::string::operator==() method would
    first check to see if string1.length() == string2.length() and return
    false (since two strings of unequal length cannot be equivalent). This
    should run much faster than C's strcmp() function, which has to run
    until it finds the first unequal character (which, in this case, will
    be the one-million-and-one-th character).
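
    To make that expectation concrete, here is a small sketch that counts
    how many character positions each approach has to inspect. Both
    helper functions are hypothetical illustrations, not library code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: number of character positions a strcmp()-style scan
// must inspect before it can report a result. C strings carry no stored
// length, so the scan runs until a mismatch or the terminating '\0'.
std::size_t positions_scanned_strcmp_style(const char* a, const char* b)
{
    std::size_t i = 0;
    while (a[i] == b[i] && a[i] != '\0') {
        ++i;
    }
    return i + 1;  // positions 0..i were inspected
}

// Hypothetical helper: the same count for an equality test that compares
// the stored sizes first and only then walks the characters.
std::size_t positions_scanned_length_first(const std::string& a,
                                           const std::string& b)
{
    if (a.size() != b.size()) {
        return 0;  // decided from the sizes alone
    }
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] != b[i]) {
            return i + 1;  // stopped at the first mismatch
        }
    }
    return a.size();  // every character matched
}
```

    For strings of 1,000,000 and 1,000,001 'a' characters, the first
    function returns 1,000,001 and the second returns 0.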

    However, the results surprised me. The C++ code did indeed run
    faster, but not by much. Apparently it's not checking the length of
    the strings in the operator==() method.

    In fact, by experimenting with different lengths of the strings, I
    found that the C code (using strcmp()) would narrowly beat out the C++
    code (using std::string::operator==()) if the string sizes were 1000
    characters or less, but the C++ code would consistently run faster (but
    not by much) if the string sizes were 10,000 characters or longer.

    Please correct me if I'm wrong, but wouldn't it make more sense for
    the std::string::operator==() method to check the sizes of the strings
    before it proceeds to compare every character? I would think that it
    would save a lot of processor time when comparing text from large files
    with identical headers.

    If you think I'm missing something obvious by making this argument,
    please don't hesitate to educate me.

    Thanks for any input.

    -- Jean-Luc
     
    , Oct 4, 2005
    #1

  2. mlimber Guest

    wrote:
    > Hi,
    >
    > I recently wrote two benchmark programs that compared if two strings
    > were equal: one was a C program that used C char arrays with strcmp(),
    > and the other was a C++ program that used std::strings with
    > operator==().
    >
    > In both programs, the first string consisted of one million
    > characters (all the letter 'a'). The second string was always one
    > character longer than the first string (with the letter 'a' for all the
    > characters).
    >
    > In the first program (which was written in C), the comparison was
    > done like this:
    >
    > int same = strcmp(string1, string2) ? 0 : 1;
    >
    > In the second program (which was written in C++), the comparison was
    > done like this:
    >
    > bool same = (string1 == string2);
    >
    > These comparisons were performed in a loop that looped a large number
    > of times. Then the programs were timed to see which ran faster.
    >
    > I expected the C++ code to run much, much faster than the C code,
    > since I would think that the std::string::operator==() method would
    > first check to see if string1.length() == string2.length() and return
    > false (since two strings of unequal length cannot be equivalent). This
    > should run much faster than C's strcmp() function, which has to run
    > until it finds the first unequal character (which, in this case, will
    > be the one-million-and-one-th character).
    >
    > However, the results surprised me. The C++ code did indeed run
    > faster, but not by much. Apparently it's not checking the length of
    > the strings in the operator==() method.
    >
    > In fact, by experimenting with different lengths of the strings, I
    > found that the C code (using strcmp()) would narrowly beat out the C++
    > code (using std::string::operator==()) if the string sizes were 1000
    > characters or less, but the C++ code would consistently run faster (but
    > not by much) if the string sizes were 10,000 characters or longer.
    >
    > Please correct me if I'm wrong, but wouldn't it make more sense for
    > the std::string::operator==() method to check the sizes of the strings
    > before it proceeds to compare every character? I would think that it
    > would save a lot of processor time when comparing text from large files
    > with identical headers.
    >
    > If you think I'm missing something obvious by making this argument,
    > please don't hesitate to educate me.
    >
    > Thanks for any input.
    >
    > -- Jean-Luc


    I think this is dependent on your standard library implementation. I
    see that STL-port does length checking first.

    Cheers! --M
     
    mlimber, Oct 4, 2005
    #2

  3. wrote:

    > If you think I'm missing something obvious by making this argument,
    > please don't hesitate to educate me.


    You forgot to post the code for that benchmark.

    --
    Salu2
     
    Julián Albo, Oct 4, 2005
    #3
  4. Guest

    mlimber wrote:
    >
    > I think this is dependent on your standard
    > library implementation. I see that STL-port
    > does length checking first.



    Hey, thanks! I went to http://www.stlport.org/ and looked around at
    the source code. Sure enough, the operator==() method (in a file named
    "_string.h" of STLport) does look like it checks the length before
    comparing:

    {
      return __x.size() == __y.size()
          && _Traits::compare(__x.data(),
                              __y.data(),
                              __x.size()) == 0;
    }

    Of course, I still wonder why other implementations don't do the
    same. I would think that the cost of comparing the size of the strings
    is negligible, so I don't see any reason why it shouldn't be done.
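
    For an implementation that does not check the sizes first, the same
    idea can be sketched as a free function in user code. The name
    equal_length_first is hypothetical; the function only mirrors the
    shape of the STLport snippet above using std::char_traits<char>.

```cpp
#include <string>

// Hypothetical wrapper (not part of any library): compares the stored sizes
// first and only compares the characters when the sizes match.
inline bool equal_length_first(const std::string& x, const std::string& y)
{
    return x.size() == y.size()
        && std::char_traits<char>::compare(x.data(), y.data(), x.size()) == 0;
}
```

    With this wrapper, two strings of different lengths are rejected
    without any of their characters being examined, whichever way the
    library's own operator==() happens to be written.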

    Thanks again.

    -- Jean-Luc
     
    , Oct 4, 2005
    #4
