More efficient strcmp

P

pembed2012

Dear friends

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}
 
K

Kleuske

Dear friends

I have created a more efficient strcmp function. However it only works
on big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

What if the strings are longer than an unsigned long and differ only in the
last bit?

Just asking...
 
J

James Kuyper

Dear friends

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

char strings[0] = "a\0a\0b";

Compare the results of strcmp(strings, strings+2) and
strcmp_fast(strings, strings+2).

Fast is nice; defined behavior is better; correct behavior is even better.
 
S

Stefan Ram

pembed2012 said:
I have created a more efficient strcmp function.

What did you compare its efficiency with?
How did you measure that it is more efficient?
 
S

Stephen Sprunk

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

You can make lots of things faster if you aren't worried about the code
crashing or giving incorrect answers when called with valid inputs.

However, that is a rather dubious definition of "more efficient".

Note that any reasonable strcmp() implementation _does_ use the
multi-byte comparison trick; however, unlike your code, they also ensure
that the pointers are aligned, strings are not compared past their ends,
etc. so that the results are _correct_, and that does slow things down a
bit. Ditto for memcmp().

S
 
J

James Kuyper

What if the strings are longer than an unsigned long and differ only in the
last bit?

Then it ends up calling strcmp(s,t), and ends up actually being slower.
That's not one of the several serious problems with this code:

1. Either s or t might not have an alignment suitable for conversion to
unsigned long*. Behavior: undefined.
2. *uls or *ult each access sizeof(unsigned long) bytes; if the strings
pointed at are shorter than that, those expressions might attempt to
access memory that the program doesn't have permission to read.
Behavior: undefined.
3. The comparison compares the entire sizeof(unsigned long) bytes, even
if that includes bytes past the end of the strings being compared.
Behavior: incorrect
4. In the unlikely case that unsigned long has padding bits, they won't
participate in the comparison. Behavior: incorrect.
 
B

Ben Bacarisse

pembed2012 said:
I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Without having done tests, aren't you jumping the gun by saying it's more
efficient?
int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

The trouble is it doesn't work, regardless of the machine's byte
ordering. There are two big issues. First, a pointer to a string won't
always be correctly aligned for a larger access, and, second:

char test[5] = {0, 0, 1};
strcmp_fast(test, test+1);

By the way, the Campaign to Remove All Spaces from Source would no
approve of your half-hearted effort -- four whole spaces have survived
the delete key!
 
J

jacob navia

Le 13/03/12 23:19, James Kuyper a écrit :
char strings[0] = "a\0a\0b";

?????

An array of zero chars?
And you assign a char * to it?

Can't parse that, sorry.

Maybe you wanted to write:

char *strings = "a\0a\0b";
 
K

Kaz Kylheku

Dear friends

I have created a more efficient strcmp function.

Gee, thanks a lot.

Now, you wouldn't happen to have a more efficient paper towel for wiping coffee
spit from a monitor? :)
 
K

Kaz Kylheku

What if the strings are longer than an unsigned long and differ only in the
last bit?

More importantly, what if they are shorter?

What if the compiler doesn't like the aliasing?

Also, this fails on x86 and elsewhere, because strings, if treated as numbers this way, are
big-endian.
 
B

BartC

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

That won't work well for various reasons already explained.

Try inventing a fast strcmp_eq() function instead; usually I only care about
equality, not whether one string is more or less than other.

That also means that endianness doesn't come into it so much. And if the
caller happens to have the string lengths already available, have a version
that makes use of that (which can be really fast when the lengths are
different).

However you'll have a job beating a good built-in strcmp(), unless you do
already have the lengths, to give you an edge.
 
B

Bl0ckeduser

pembed2012 said:
Dear friends

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

This slightly modified version also works on a little-endian machine,
but I'm not sure if it's quite as fast as yours.

int strcmp_recursive(char* s,char* t){
if(!*s && *s==*t) return(0);
if(*s>*t) return(1);
if(*s<*t) return(-1);
return(strcmp_recursive(s+1,t+1));
}
 
J

James Kuyper

char strings[0] = "a\0a\0b";

Typo:

char strings[] = "a\0a\0b"; // mutable
const char *strings = "a\0a\0b"; // immutable

Aiee! Where'd that first 0 come from? I don't remember typing it.
I meant my example to be mutable, since, for some reason, strcmp_fast()
takes char* arguments.
 
A

Andre

Dear friends

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}

A string (or at least a substring) may start at an odd address, and some
hardware architectures will fail if you try to cast this to a long...
 
F

Fritz Wuehler

Kaz Kylheku said:
Gee, thanks a lot.

Now, you wouldn't happen to have a more efficient paper towel for wiping coffee
spit from a monitor? :)

if your monitor is spitting coffee may i suggest perhaps your monitor is
actually a faulty coffee maker? in case it's not then microfiber rags work a
treat for that, ask me how i know ;-)

watching a dope get his arse handed to him by the pros here is one of the
best things about this newsgroup. high in content and amusement all at once.
 
T

Tim Rentsch

Bl0ckeduser said:
This slightly modified version also works on a little-endian machine,
but I'm not sure if it's quite as fast as yours.

int strcmp_recursive(char* s,char* t){
if(!*s && *s==*t) return(0);
if(*s>*t) return(1);
if(*s<*t) return(-1);
return(strcmp_recursive(s+1,t+1));
}

You have made an important semantic change (not counting the
changes that allow termination and provide for returning 0
if the strings are equal). Do you see what it is? Hint:
this version of strcmp does not match the specification for
strcmp() in the standard library (even ignoring the 'const'
modifiers in strcmp()'s prototype).

By the way, 'return' statements don't need parentheses
around the return expression. The statements

return 0;
return strcmp_recursive( s+1, t+1 );

will work just fine.
 
B

Bl0ckeduser

Tim said:
You have made an important semantic change (not counting the
changes that allow termination and provide for returning 0
if the strings are equal). Do you see what it is? Hint:
this version of strcmp does not match the specification for
strcmp() in the standard library (even ignoring the 'const'
modifiers in strcmp()'s prototype).

Hmm, I have to do the comparisons using the unsigned char type (or
perhaps another unsigned type, like the OP), right ?
By the way, 'return' statements don't need parentheses
around the return expression. The statements

return 0;
return strcmp_recursive( s+1, t+1 );

will work just fine.

By the way, thanks for reviewing my code :). As a hobbyist, it's always
a thrill to talk with experts.
 
V

vaysagekv

Dear friends

I have created a more efficient strcmp function. However it only works on
big endian architectures. I would like someone with same to run some
timing tests to see how much is the speed up.

Thank you

int strcmp_fast(char* s,char* t){
uls=(unsigned long*)s;
ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}
Do you mean
int strcmp_fast(char* s,char* t){
unsigned long* uls=(unsigned long*)s;
unsigned long* ult=(unsigned long*)t;
if(*uls>*ult) return(1);
if(*uls<*ult) return(-1);
else return(strcmp(s,t));
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,756
Messages
2,569,540
Members
45,025
Latest member
KetoRushACVFitness

Latest Threads

Top