matching assembly speed for small string comparison

Discussion in 'C Programming' started by qak, Aug 31, 2013.

  1. qak

    qak Guest

    Equivalent to assemply:
    cmp [esi], 'Test'
    In C:
    // slow and long
    if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
    // have to look up Ascii table, hard to change
    if (*(int*)p == 0x54657374)...
    Please have a better solution, thanks.
     
    qak, Aug 31, 2013
    #1
    1. Advertisements

  2. qak

    Eric Sosman Guest

    #include <string.h>
    char *p = "Testosterone";
    if (memcmp(p, "Test", 4) == 0) ...

    (At first I wrote a strcmp() call, then realized you'd asked
    for something different.)
     
    Eric Sosman, Aug 31, 2013
    #2
    1. Advertisements

  3. I'd second memcmp. Try it and measure the speed (not of the test --
    that will be quite hard and not very informative -- but of your
    program). A lot of maintainable C code has been sacrificed on the alter
    of efficiency.

    If you still want to avoid the standard idiom (memcmp) then you can
    avoid the gruesome constant simply by using a string literal:

    if (*(int32_t *)p == *(int32_t *)"Test") ...

    Using int32_t makes the intent a little clearer.
     
    Ben Bacarisse, Aug 31, 2013
    #3
  4. On many systems, int requires alignment (frequently a multiple of 4).
    String literals and character arrays do not have a similar
    requirement. Any attempt to convert a char* to an int* when the value
    of the char* does not meet the alignment for an int* will invoke
    undefined behavior.
     
    Barry Schwarz, Aug 31, 2013
    #4
  5. How do you insure that the value of p and the address of the string
    literal are suitably aligned for int. If either is not, the statement
    invokes undefined behavior.
     
    Barry Schwarz, Aug 31, 2013
    #5
  6. qak

    James Kuyper Guest

    By using _Alignas(int32_t), which is new in C2011. This will require
    using the string literal to initialize an array, and having control of
    the same kind over the object pointed at by p.
     
    James Kuyper, Aug 31, 2013
    #6
  7. qak

    Eric Sosman Guest

    Let's also note that even if the unaligned comparison
    actually compares, it almost certainly does *not* do what
    the O.P. wanted.

    (Hint: Is the O.P.'s machine likely to be Big-Endian?)
     
    Eric Sosman, Aug 31, 2013
    #7
  8. No, indeed you can't (as you know). The presence of the assembler,
    though, suggests that the architecture (and the poster) won't care.

    You can, however, get round half of the problem using a compound literal
    that is a union. I won't show any code because it does not solve the
    problem for 'p' and, as I said, memcmp is the first port of call for this.
     
    Ben Bacarisse, Aug 31, 2013
    #8
  9. qak

    Willem Guest

    qak wrote:
    ) Equivalent to assemply:
    ) cmp [esi], 'Test'
    ) In C:
    ) // slow and long
    ) if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...

    Have you checked if the above code doesn't get optimized into a single
    compare?

    ) // have to look up Ascii table, hard to change
    ) if (*(int*)p == 0x54657374)...
    ) Please have a better solution, thanks.

    The 'look up in ascii table' can be turned into a macro, you know.
    There are alignment issues and integer-size issues, but those can be
    overcome with suitable types and macros.


    SaSW, Willem
    --
    Disclaimer: I am in no way responsible for any of the statements
    made in the above text. For all I know I might be
    drugged or something..
    No I'm not paranoid. You all think I'm paranoid, don't you !
    #EOT
     
    Willem, Aug 31, 2013
    #9
  10. In the early days of C, as I understand, compilers allowed multiple
    character character constants, such that it would be:

    if(*(int*)p=='Test')

    As far as I remember, that was for 16 bit machines, so only two
    characters, but it could extend to four.

    -- glen
     
    glen herrmannsfeldt, Aug 31, 2013
    #10
  11. (snip)
    Some years ago, I tried to use string constants and macros for
    switch/case, which failed as they are not constant expressions.

    It should work with character constants, though.

    -- glen
     
    glen herrmannsfeldt, Aug 31, 2013
    #11
  12. qak

    Eric Sosman Guest

    Multi-character constants are still permitted, but they're
    as useless today as they've always been. 6.4.4.4p10:

    "The value of an integer character constant
    containing more than one character (e.g., 'ab'),
    [...] is implementation-defined."

    That is, *if* you just happen to know what the compiler of the
    moment does with such a construct and *if* it just happens to
    do what you want, then 'ab' or 'Test' or 'Stupidity alert!' may
    may be just the thing you should write.

    ... and the *next* thing you'll write will be your signature
    on the Involuntary Separation Agreement with your former employer.
     
    Eric Sosman, Aug 31, 2013
    #12
  13. It's in the standard, but the result is, as you'd expect, implementation
    defined.

    I think it's quite common to support some reasonable interpretation of
    multi-character constants up to the width of int (the type of a '...'
    constant), but in every case that I can remember (only a handful, mind
    you) the resulting constant is "big-endian", i.e. '1234' has the value
    ('1' << 24) + ('2' << 16) + ('3' << 8) + '4'. That probably won't do
    for the OP, quite apart from the other issues.

    I didn't make enough if it my post, by memcmp is the way to go --
    nothing else comes close in terms of correctness, portability and
    clarity. Other tricks might need to be used in special circumstances,
    but that will be the result of timing tests.
     
    Ben Bacarisse, Aug 31, 2013
    #13
  14. qak

    Les Cargill Guest


    I've practically stopped using "switch" in favor of "arrays of code":
    array set codearray {
    one { puts ein }
    two { puts zwei }
    }

    set tag one

    eval $codearray($tag)
     
    Les Cargill, Aug 31, 2013
    #14
  15. qak

    Les Cargill Guest


    Thought this was the Tcl group - you can't use strings as switch indices
    in 'C'....
     
    Les Cargill, Aug 31, 2013
    #15
  16. qak

    Tim Rentsch Guest

    A wonderful little example. Of the proposed solutions, none
    exactly match the semantics of "slow and long", and all have
    undefined behavior or potential undefined behavior of one
    kind or another:

    1. As others have pointed out, converting a (char *) pointer
    to (int *) may have undefined behavior because of alignment
    problems.

    2. Assuming there are no alignment problems, accessing parts
    of character arrays as (non-character) integer types is still
    undefined behavior, because such accesses do not meet the
    conditions required by effective type rules.

    3. The safest solutions proposed so far are those using
    memcmp(). However, these too have undefined behavior if
    the area pointed to by 'p' doesn't have at least four
    characters remaining. (This problem also applies to
    all cases where integer comparison is done.)

    Two alternate approaches:

    Solution A:
    Assuming p points to a string (or the beginning of a string)
    that might be less than four characters long, the easiest
    safe test is with strncmp(), namely,

    strncmp( p, "Test", 4 ) == 0

    When p is known to point to a string, strncmp() is safer
    than memcmp() as it won't read past the null terminating
    byte of the string (and memcmp() may). (In the examples
    I looked at the generated code for this looked okay,
    meaning mainly that there was no call to the library
    function - the compiler was smart enough to do things
    inline, which wasn't always true of memcmp().)

    Solution B:
    If you can count on p pointing to an area of at least four
    characters, and want to use the integer comparison trick,
    this can be done safely with memcpy() and a union type,
    illustrated here with a stand-alone function:

    int
    is_Test( char p[4] ){
    union {
    char s[ sizeof(uint32_t) ];
    uint32_t u;
    } it, test = { "Test" };
    memcpy( it.s, p, 4 );
    return it.u == test.u;
    }

    I should add that under these conditions the 'memcmp()'
    comparison could be used instead. The code shown here may be
    faster than memcmp(), but that depends on a lot of different
    things, and should be measured if it's important to be sure.
    (The examples I looked at seemed to have better generated
    code for this approach than memcmp() did, but I didn't take
    any measurements.)

    Note the declaration of 'p' in the function parameter list
    documents 'p' as being at least four (char) elements long.

    If neither of the pre-conditions under (A) or (B) is met,
    there is no safe way to match "slow and long" that is better
    than what you have written already under "slow and long".
     
    Tim Rentsch, Aug 31, 2013
    #16
  17. And that's going to be faster than memcmp() or strncmp()?
     
    Keith Thompson, Aug 31, 2013
    #17
  18. I'd expect the byte ordering of a string literal to match the byte
    ordering of whatever 'Test' means in the assembly instruction:

    cmp [esi], 'Test'

    I'm not sure which CPU the OP is using, but if it's x86 then alignment
    likely isn't going to be a problem (though of course the code will be
    non-portable).

    qak: If performance is really that critical, you might consider using
    inline assembly if your compiler supports it.

    If you *think* performance is that critical, how do you know? Have you
    measured it? Have you verified that the time you'll spend figuring this
    out will pay off in faster execution time? Perhaps it will, but many
    claims about performance requirements turn out to be overstated.
     
    Keith Thompson, Aug 31, 2013
    #18
  19. qak

    James Kuyper Guest

    If you can arrange for the memory to be correctly aligned from the
    beginning, it's probably no slower, and may be faster. If you have to
    copy the pointed-at object into a correctly aligned buffer, it's
    definitely slower.
     
    James Kuyper, Aug 31, 2013
    #19
  20. qak

    Tim Rentsch Guest

    The use of && operators makes this unlikely, because of
    short-circuiting semantics.

    Using & rather than && would make it somewhat more likely,
    but neither resulted in a single compare instruction with
    the two compilers I tested (both versions of gcc).
     
    Tim Rentsch, Aug 31, 2013
    #20
    1. Advertisements

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments (here). After that, you can post your question and our members will help you out.