matching assembly speed for small string comparison

Q

qak

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.
 
E

Eric Sosman

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

#include <string.h>
char *p = "Testosterone";
if (memcmp(p, "Test", 4) == 0) ...

(At first I wrote a strcmp() call, then realized you'd asked
for something different.)
 
B

Ben Bacarisse

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

I'd second memcmp. Try it and measure the speed (not of the test --
that will be quite hard and not very informative -- but of your
program). A lot of maintainable C code has been sacrificed on the alter
of efficiency.

If you still want to avoid the standard idiom (memcmp) then you can
avoid the gruesome constant simply by using a string literal:

if (*(int32_t *)p == *(int32_t *)"Test") ...

Using int32_t makes the intent a little clearer.
 
B

Barry Schwarz

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

On many systems, int requires alignment (frequently a multiple of 4).
String literals and character arrays do not have a similar
requirement. Any attempt to convert a char* to an int* when the value
of the char* does not meet the alignment for an int* will invoke
undefined behavior.
 
B

Barry Schwarz

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

I'd second memcmp. Try it and measure the speed (not of the test --
that will be quite hard and not very informative -- but of your
program). A lot of maintainable C code has been sacrificed on the alter
of efficiency.

If you still want to avoid the standard idiom (memcmp) then you can
avoid the gruesome constant simply by using a string literal:

if (*(int32_t *)p == *(int32_t *)"Test") ...

Using int32_t makes the intent a little clearer.

How do you insure that the value of p and the address of the string
literal are suitably aligned for int. If either is not, the statement
invokes undefined behavior.
 
J

James Kuyper

On Sat, 31 Aug 2013 14:41:24 +0100, Ben Bacarisse


How do you insure that the value of p and the address of the string
literal are suitably aligned for int. If either is not, the statement
invokes undefined behavior.

By using _Alignas(int32_t), which is new in C2011. This will require
using the string literal to initialize an array, and having control of
the same kind over the object pointed at by p.
 
E

Eric Sosman

Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

On many systems, int requires alignment (frequently a multiple of 4).
String literals and character arrays do not have a similar
requirement. Any attempt to convert a char* to an int* when the value
of the char* does not meet the alignment for an int* will invoke
undefined behavior.

Let's also note that even if the unaligned comparison
actually compares, it almost certainly does *not* do what
the O.P. wanted.

(Hint: Is the O.P.'s machine likely to be Big-Endian?)
 
B

Ben Bacarisse

Barry Schwarz said:
qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

I'd second memcmp. Try it and measure the speed (not of the test --
that will be quite hard and not very informative -- but of your
program). A lot of maintainable C code has been sacrificed on the alter
of efficiency.

If you still want to avoid the standard idiom (memcmp) then you can
avoid the gruesome constant simply by using a string literal:

if (*(int32_t *)p == *(int32_t *)"Test") ...

Using int32_t makes the intent a little clearer.

How do you insure that the value of p and the address of the string
literal are suitably aligned for int. If either is not, the statement
invokes undefined behavior.

No, indeed you can't (as you know). The presence of the assembler,
though, suggests that the architecture (and the poster) won't care.

You can, however, get round half of the problem using a compound literal
that is a union. I won't show any code because it does not solve the
problem for 'p' and, as I said, memcmp is the first port of call for this.
 
W

Willem

qak wrote:
) Equivalent to assemply:
) cmp [esi], 'Test'
) In C:
) // slow and long
) if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...

Have you checked if the above code doesn't get optimized into a single
compare?

) // have to look up Ascii table, hard to change
) if (*(int*)p == 0x54657374)...
) Please have a better solution, thanks.

The 'look up in ascii table' can be turned into a macro, you know.
There are alignment issues and integer-size issues, but those can be
overcome with suitable types and macros.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
G

glen herrmannsfeldt

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

In the early days of C, as I understand, compilers allowed multiple
character character constants, such that it would be:

if(*(int*)p=='Test')

As far as I remember, that was for 16 bit machines, so only two
characters, but it could extend to four.

-- glen
 
G

glen herrmannsfeldt

(snip)
) // have to look up Ascii table, hard to change
) if (*(int*)p == 0x54657374)...
) Please have a better solution, thanks.
The 'look up in ascii table' can be turned into a macro, you know.
There are alignment issues and integer-size issues, but those can be
overcome with suitable types and macros.

Some years ago, I tried to use string constants and macros for
switch/case, which failed as they are not constant expressions.

It should work with character constants, though.

-- glen
 
E

Eric Sosman

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

In the early days of C, as I understand, compilers allowed multiple
character character constants, such that it would be:

if(*(int*)p=='Test')

As far as I remember, that was for 16 bit machines, so only two
characters, but it could extend to four.

Multi-character constants are still permitted, but they're
as useless today as they've always been. 6.4.4.4p10:

"The value of an integer character constant
containing more than one character (e.g., 'ab'),
[...] is implementation-defined."

That is, *if* you just happen to know what the compiler of the
moment does with such a construct and *if* it just happens to
do what you want, then 'ab' or 'Test' or 'Stupidity alert!' may
may be just the thing you should write.

... and the *next* thing you'll write will be your signature
on the Involuntary Separation Agreement with your former employer.
 
B

Ben Bacarisse

glen herrmannsfeldt said:
qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

In the early days of C, as I understand, compilers allowed multiple
character character constants, such that it would be:

if(*(int*)p=='Test')

As far as I remember, that was for 16 bit machines, so only two
characters, but it could extend to four.

It's in the standard, but the result is, as you'd expect, implementation
defined.

I think it's quite common to support some reasonable interpretation of
multi-character constants up to the width of int (the type of a '...'
constant), but in every case that I can remember (only a handful, mind
you) the resulting constant is "big-endian", i.e. '1234' has the value
('1' << 24) + ('2' << 16) + ('3' << 8) + '4'. That probably won't do
for the OP, quite apart from the other issues.

I didn't make enough if it my post, by memcmp is the way to go --
nothing else comes close in terms of correctness, portability and
clarity. Other tricks might need to be used in special circumstances,
but that will be the result of timing tests.
 
L

Les Cargill

glen said:
(snip)


Some years ago, I tried to use string constants and macros for
switch/case, which failed as they are not constant expressions.

It should work with character constants, though.

-- glen


I've practically stopped using "switch" in favor of "arrays of code":
array set codearray {
one { puts ein }
two { puts zwei }
}

set tag one

eval $codearray($tag)
 
L

Les Cargill

Les said:
I've practically stopped using "switch" in favor of "arrays of code":
array set codearray {
one { puts ein }
two { puts zwei }
}

set tag one

eval $codearray($tag)


Thought this was the Tcl group - you can't use strings as switch indices
in 'C'....
 
T

Tim Rentsch

qak said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

A wonderful little example. Of the proposed solutions, none
exactly match the semantics of "slow and long", and all have
undefined behavior or potential undefined behavior of one
kind or another:

1. As others have pointed out, converting a (char *) pointer
to (int *) may have undefined behavior because of alignment
problems.

2. Assuming there are no alignment problems, accessing parts
of character arrays as (non-character) integer types is still
undefined behavior, because such accesses do not meet the
conditions required by effective type rules.

3. The safest solutions proposed so far are those using
memcmp(). However, these too have undefined behavior if
the area pointed to by 'p' doesn't have at least four
characters remaining. (This problem also applies to
all cases where integer comparison is done.)

Two alternate approaches:

Solution A:
Assuming p points to a string (or the beginning of a string)
that might be less than four characters long, the easiest
safe test is with strncmp(), namely,

strncmp( p, "Test", 4 ) == 0

When p is known to point to a string, strncmp() is safer
than memcmp() as it won't read past the null terminating
byte of the string (and memcmp() may). (In the examples
I looked at the generated code for this looked okay,
meaning mainly that there was no call to the library
function - the compiler was smart enough to do things
inline, which wasn't always true of memcmp().)

Solution B:
If you can count on p pointing to an area of at least four
characters, and want to use the integer comparison trick,
this can be done safely with memcpy() and a union type,
illustrated here with a stand-alone function:

int
is_Test( char p[4] ){
union {
char s[ sizeof(uint32_t) ];
uint32_t u;
} it, test = { "Test" };
memcpy( it.s, p, 4 );
return it.u == test.u;
}

I should add that under these conditions the 'memcmp()'
comparison could be used instead. The code shown here may be
faster than memcmp(), but that depends on a lot of different
things, and should be measured if it's important to be sure.
(The examples I looked at seemed to have better generated
code for this approach than memcmp() did, but I didn't take
any measurements.)

Note the declaration of 'p' in the function parameter list
documents 'p' as being at least four (char) elements long.

If neither of the pre-conditions under (A) or (B) is met,
there is no safe way to match "slow and long" that is better
than what you have written already under "slow and long".
 
K

Keith Thompson

James Kuyper said:
By using _Alignas(int32_t), which is new in C2011. This will require
using the string literal to initialize an array, and having control of
the same kind over the object pointed at by p.

And that's going to be faster than memcmp() or strncmp()?
 
K

Keith Thompson

Eric Sosman said:
Equivalent to assemply:
cmp [esi], 'Test'
In C:
// slow and long
if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...
// have to look up Ascii table, hard to change
if (*(int*)p == 0x54657374)...
Please have a better solution, thanks.

On many systems, int requires alignment (frequently a multiple of 4).
String literals and character arrays do not have a similar
requirement. Any attempt to convert a char* to an int* when the value
of the char* does not meet the alignment for an int* will invoke
undefined behavior.

Let's also note that even if the unaligned comparison
actually compares, it almost certainly does *not* do what
the O.P. wanted.

(Hint: Is the O.P.'s machine likely to be Big-Endian?)

I'd expect the byte ordering of a string literal to match the byte
ordering of whatever 'Test' means in the assembly instruction:

cmp [esi], 'Test'

I'm not sure which CPU the OP is using, but if it's x86 then alignment
likely isn't going to be a problem (though of course the code will be
non-portable).

qak: If performance is really that critical, you might consider using
inline assembly if your compiler supports it.

If you *think* performance is that critical, how do you know? Have you
measured it? Have you verified that the time you'll spend figuring this
out will pay off in faster execution time? Perhaps it will, but many
claims about performance requirements turn out to be overstated.
 
J

James Kuyper

And that's going to be faster than memcmp() or strncmp()?

If you can arrange for the memory to be correctly aligned from the
beginning, it's probably no slower, and may be faster. If you have to
copy the pointed-at object into a correctly aligned buffer, it's
definitely slower.
 
T

Tim Rentsch

Willem said:
qak wrote:
) Equivalent to assemply:
) cmp [esi], 'Test'
) In C:
) // slow and long
) if(p[0] == 'T' && p[1] == 'e' && p[2] == 's' && p[3] == 't')...

Have you checked if the above code doesn't get optimized into a single
compare?

The use of && operators makes this unlikely, because of
short-circuiting semantics.

Using & rather than && would make it somewhat more likely,
but neither resulted in a single compare instruction with
the two compilers I tested (both versions of gcc).
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,766
Messages
2,569,569
Members
45,044
Latest member
RonaldNen

Latest Threads

Top