Strange strcmp() action?

Eric

Consider this...

char abc[ 10 ] = "qwertyu"; // note abc doesn't get filled

int result = strcmp( &abc[1], "wer" );

"result" should evaluate to zero, right? (... comparing "wer" to the
contents of the string abc starting at position 1...)

It doesn't... it evaluates to 1.

For that matter...

int result = strcmp( abc, "qwe" );

.... also evaluates to 1.

If I say...

char *abc = "qwertyu";

.... strcmp still evaluates to 1 in both examples.

On the other hand...

int result = strncmp( &abc[1], "wer", 3);

.... evaluates to zero just like it should.

I think my computer is haunted... :)
 
JC

Consider this...

char abc[ 10 ] = "qwertyu"; // note abc doesn't get filled

int result = strcmp( &abc[1], "wer" );

"result" should evaluate to zero, right?  (... comparing "wer" to the
contents of the string abc starting at position 1...)

No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The strcmp()
function returns 0 only if the strings are equal, and "wertyu" and "wer"
are not equal. It returns 1 (any positive value would be valid) because
"wertyu" comes after "wer" in character order.

It doesn't... it evaluates to 1.

As it should, since the input strings are not identical.

For that matter...

int result = strcmp( abc, "qwe" );

... also evaluates to 1.

This is because "qwertyu" comes after "qwe". The strings are not
identical.

If I say...

char *abc = "qwertyu";

... strcmp still evaluates to 1 in both examples.

On the other hand...

int result = strncmp( &abc[1], "wer", 3);

... evaluates to zero just like it should.


This is because strncmp() compares only the number of characters you
specify, while strcmp() compares the entire strings.
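
As a minimal sketch of the difference, the comparisons from the original post can be written as self-checking assertions. The helper names sign() and check_thread_examples() are made up for this illustration; only the sign of a nonzero result is tested, since the exact value (1 in the original post) depends on the implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce a comparison result to -1, 0, or +1.
   Only the sign of a nonzero result is guaranteed. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/* Hypothetical helper: re-runs the comparisons from the original post. */
static void check_thread_examples(void)
{
    char abc[10] = "qwertyu";

    /* "wertyu" vs. "wer": 't' is compared with '\0', so the result is positive. */
    assert(sign(strcmp(&abc[1], "wer")) > 0);

    /* "qwertyu" vs. "qwe": 'r' is compared with '\0', positive again. */
    assert(sign(strcmp(abc, "qwe")) > 0);

    /* Only the first 3 characters are examined: "wer" vs. "wer". */
    assert(strncmp(&abc[1], "wer", 3) == 0);

    /* strcmp() returns 0 only when the whole strings match. */
    assert(strcmp(&abc[1], "wertyu") == 0);
}
```

Calling check_thread_examples() from main() should complete without tripping any assertion.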

I think my computer is haunted... :)


While that may still be true, strcmp() and strncmp() appear to be
working correctly.


HTH,
Jason
 
Eric

No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The function strcmp
() returns 0 if the strings are equal, "wertyu" and "wer" are not
equal. It returns 1 because "wertyu" comes after "wer" alphabetically.

Hmmm... OK.

http://www.cplusplus.com/reference/clibrary/cstring/strcmp.html says:

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string. If
they are equal to each other, it continues with the following pairs
until the characters differ or until a terminating null-character is
reached."

To me, that seems to say that if a null terminator is found, as it
would be at the end of "wer" in str2, then if the strings are equal up
to that point, they are equal.

It doesn't say anything about comparing the terminating null character
in str2 with the corresponding character in str1 and finding that they
are not equal (because the corresponding character in str1 is "t").

I know it doesn't say this explicitly, and I can't find anything else,
including "man strcmp" for both Linux and FreeBSD, that is explicit
about it.

Anyway, that's the documentation I (mis)read.

Thanks... :)
 
Keith Thompson

Eric said:
No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The function strcmp
() returns 0 if the strings are equal, "wertyu" and "wer" are not
equal. It returns 1 because "wertyu" comes after "wer" alphabetically.

Hmmm... OK.

http://www.cplusplus.com/reference/clibrary/cstring/strcmp.html says:

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string. If
they are equal to each other, it continues with the following pairs
until the characters differ or until a terminating null-character is
reached."

To me, that seems to say that if a null terminator is found, as it
would be at the end of "wer" in str2, then if the strings are equal up
to that point, they are equal.

It doesn't say anything about comparing the terminating null character
in str2 with the corresponding character in str1 and finding that they
are not equal (because the corresponding character in str1 is "t").

I know it doesn't say this explicitly, and I can't find anything else,
including "man strcmp" for both Linux and FreeBSD, that is explicit
about it.

Here's what the C99 standard says.

7.21.4:

The sign of a nonzero value returned by the comparison functions
memcmp, strcmp, and strncmp is determined by the sign of the
difference between the values of the first pair of characters
(both interpreted as unsigned char) that differ in the objects
being compared.

7.21.4.2:

The strcmp function returns an integer greater than, equal to, or
less than zero, accordingly as the string pointed to by s1 is
greater than, equal to, or less than the string pointed to by s2.

A "string" is, by the definition in 7.1.1p1:

a contiguous sequence of characters terminated by and including
the first null character.

So the terminating null character is part of the string, and is
compared if a mismatch isn't found sooner. In comparing the strings
"abc" and "abcd", the characters compared are:

'a' vs. 'a' (equal, keep looking)
'b' vs. 'b' (equal, keep looking)
'c' vs. 'c' (equal, keep looking)
'\0' vs. 'd' (less, return a negative result)

In my opinion it would be nice if this were made a bit more explicit.
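
To make the walk-through above concrete, here is a rough sketch of how a strcmp-like function could be written. The name my_strcmp is hypothetical and this is not any particular library's implementation; it only illustrates that the terminating null takes part in the comparison and that characters are compared as unsigned char.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch of a strcmp-like function; not a real library's code. */
static int my_strcmp(const char *s1, const char *s2)
{
    const unsigned char *p1 = (const unsigned char *)s1;
    const unsigned char *p2 = (const unsigned char *)s2;

    assert(s1 != NULL && s2 != NULL);

    /* Advance while the characters match and the end has not been reached. */
    while (*p1 == *p2 && *p1 != '\0') {
        p1++;
        p2++;
    }

    /* Here *p1 and *p2 are either the first differing pair (one of which
       may be '\0') or both '\0'. Return -1, 0, or +1 accordingly. */
    return (*p1 > *p2) - (*p1 < *p2);
}
```

With "abc" and "abcd" the loop stops at '\0' vs. 'd', so the result is negative, matching the table above.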
 
JC

No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The function
strcmp() returns 0 if the strings are equal, "wertyu" and "wer" are not
equal. It returns 1 because "wertyu" comes after "wer" alphabetically.

Hmmm... OK.

http://www.cplusplus.com/reference/clibrary/cstring/strcmp.html says:

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string. If
they are equal to each other, it continues with the following pairs
until the characters differ or until a terminating null-character is
reached."

I can see how that could be misread; it is easily misleading.

OTOH, that paragraph states that it stops when a terminating null is
encountered, but does not say what its return value is upon stopping.
If strcmp() returned 0 for *every* input (which, of course, is
entirely incorrect), that paragraph would still hold true, so it's not
enough information to infer that strcmp() would return 0 if equal
up to a null.

But yes, it's unfortunately vague.

To me, that seems to say that if a null terminator is found, as it
would be at the end of "wer" in str2, then if the strings are equal up
to that point, they are equal.

It doesn't say anything about comparing the terminating null character
in str2 with the corresponding character in str1 and finding that they
are not equal (because the corresponding character in str1 is "t").

It also does not say that 0 is returned when a null is hit. Again,
just picking nits. I definitely see your point.

FYI, if you do need to compare strings in the way you thought strcmp()
worked, you could use strncmp like so (untested):

#include <string.h>

int prefixcmp (const char *s1, const char *s2) {
  size_t n1 = strlen(s1);
  size_t n2 = strlen(s2);
  /* compare only as many characters as the shorter string has */
  return strncmp(s1, s2, (n1 < n2) ? n1 : n2);
}


I know it doesn't say this explicitly, and I can't find anything else,
including "man strcmp" for both Linux and FreeBSD, that is explicit
about it.

Anyway, that's the documentation I (mis)read.

Thanks... :)


Jason
 
Phil Carmody

Eric said:
Consider this...

char abc[ 10 ] = "qwertyu"; // note abc doesn't get filled

int result = strcmp( &abc[1], "wer" );

"result" should evaluate to zero, right? (... comparing "wer" to the
contents of the string abc starting at position 1...)

Do you really think that "wertyu" is equal to "wer"?

It doesn't... it evaluates to 1.

For that matter...

int result = strcmp( abc, "qwe" );

... also evaluates to 1.

Do you really think that "qwertyu" is equal to "qwe"?

If I say...

char *abc = "qwertyu";

... strcmp still evaluates to 1 in both examples.

On the other hand...

int result = strncmp( &abc[1], "wer", 3);

... evaluates to zero just like it should.

And like strcmp shouldn't.

I think my computer is haunted... :)

Exorcise the ghost of the manpages for strcmp please.

Phil
 
JC

No. You must have misread some documentation somewhere. The strcmp()
function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The function
strcmp() returns 0 if the strings are equal, "wertyu" and "wer" are not
equal. It returns 1 because "wertyu" comes after "wer" alphabetically.

Hmmm... OK.

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string. If
they are equal to each other, it continues with the following pairs
until the characters differ or until a terminating null-character is
reached."

I can see how that could be misread; it is easily misleading.

OTOH, that paragraph states that it stops when a terminating null is
encountered, but does not say what its return value is upon stopping.
If strcmp() returned 0 for *every* input (which, of course, is
entirely incorrect), that paragraph would still hold true, so it's not
enough information to infer that strcmp() would return 0 if equal
up to a null.

But yes, it's unfortunately vague.


Adding to this, what you *should* have referred to was the "return
value" documentation on that same page, which does clearly state the
overall behavior:

"A zero value indicates that both strings are equal.
A value greater than zero indicates that the first character that does
not match has a greater value in str1 than in str2; And a value less
than zero indicates the opposite."

When it says both strings are "equal", it means it in the expected
sense of the word; it is not designed to confuse. The strings
"qwertyu" and "qwe" are, of course, not equal.

It also does not say that 0 is returned when a null is hit. Again,
just picking nits. I definitely see your point.

FYI, if you do need to compare strings in the way you thought strcmp()
worked, you could use strncmp like so (untested):

#include <string.h>

int prefixcmp (const char *s1, const char *s2) {
  size_t n1 = strlen(s1);
  size_t n2 = strlen(s2);
  /* compare only as many characters as the shorter string has */
  return strncmp(s1, s2, (n1 < n2) ? n1 : n2);
}
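
Note that prefixcmp() is symmetric: it returns 0 whenever either string is a prefix of the other. If what you want is the one-directional question "does s begin with prefix?", a possible variant (the name starts_with is made up for this sketch) limits the comparison to the length of the prefix instead:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if s begins with prefix.
   An empty prefix matches every string. */
static bool starts_with(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);

    /* If s is shorter than prefix, strncmp() reaches the '\0' in s,
       finds a mismatch there, and returns nonzero. */
    return strncmp(s, prefix, strlen(prefix)) == 0;
}
```

With this, starts_with("wertyu", "wer") is true, while starts_with("we", "wer") is false.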


Jason
 
CBFalconer

Eric said:
Consider this...

char abc[ 10 ] = "qwertyu"; // note abc doesn't get filled

int result = strcmp( &abc[1], "wer" );

"result" should evaluate to zero, right? (... comparing "wer" to
the contents of the string abc starting at position 1...)

It doesn't... it evaluates to 1.

No. The string abc starts at position zero, with the char 'q'. It
continues up to the char 'u'.
 
CBFalconer

Eric said:
JC said:
No. You must have misread some documentation somewhere. The
strcmp() function compares the entire string.

The string &abc[1] is "wertyu", which is not "wer". The function
strcmp () returns 0 if the strings are equal, "wertyu" and "wer"
are not equal. It returns 1 because "wertyu" comes after "wer"
alphabetically.

Hmmm... OK.

http://www.cplusplus.com/reference/clibrary/cstring/strcmp.html says:

"Compares the C string str1 to the C string str2.
This function starts comparing the first character of each string.
If they are equal to each other, it continues with the following
pairs until the characters differ or until a terminating
null-character is reached."

To me, that seems to say that if a null terminator is found, as it
would be at the end of "wer" in str2, then if the strings are equal
up to that point, they are equal.

In general I advise using the actual C standard to define the
standard functions. Those are the things marked 'C99' below.
n869_txt is bz2 compressed, and useful with text utilities.

Some useful references about C:
<http://www.ungerhu.com/jxh/clc.welcome.txt>
<http://c-faq.com/> (C-faq)
<http://benpfaff.org/writings/clc/off-topic.html>
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf> (C99)
<http://cbfalconer.home.att.net/download/n869_txt.bz2> (pre-C99)
<http://www.dinkumware.com/c99.aspx> (C-library)
<http://gcc.gnu.org/onlinedocs/> (GNU docs)
<http://clc-wiki.net/wiki/C_community:comp.lang.c:Introduction>
 
CBFalconer

pete said:
CBFalconer said:
Eric said:
Consider this...

char abc[ 10 ] = "qwertyu"; // note abc doesn't get filled

int result = strcmp( &abc[1], "wer" );

"result" should evaluate to zero, right? (... comparing "wer"
to the contents of the string abc starting at position 1...)

It doesn't... it evaluates to 1.

No. The string abc starts at position zero, with the char 'q'.
It continues up to the char 'u'.

All strings continue up to the char '\0'.

But they don't contain the char '\0'. That is purely an end mark.
 
MisterE

I think my computer is haunted... :)

You seem to be trying to use strcmp to do something that you should be doing
with strstr.

strcmp compares whole strings. If the strings have different lengths, it
will of course return nonzero because they are not the same. strstr finds
one string inside another.
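
For illustration, a small wrapper around strstr might look like the following. The name find_offset and the choice of returning -1 for "not found" are assumptions of this sketch, not part of the standard library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: offset of the first occurrence of needle in
   haystack, or -1 if needle does not occur. */
static ptrdiff_t find_offset(const char *haystack, const char *needle)
{
    const char *hit;

    assert(haystack != NULL && needle != NULL);

    /* strstr() returns a pointer to the match, or NULL if there is none. */
    hit = strstr(haystack, needle);
    return hit ? hit - haystack : (ptrdiff_t)-1;
}
```

For the string in the original post, find_offset("qwertyu", "wer") gives 1, which is the "position 1" the question was about.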
 
Keith Thompson

CBFalconer said:
pete wrote: [...]
All strings continue up to the char '\0'.

But they don't contain the char '\0'. That is purely an end mark.

See the definition of "string", C99 7.1.1p1.

Some confusion is caused by the fact that the "length" of a string
doesn't include the terminating '\0', but the string itself does.
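
A short sketch of that distinction, using the array from the original post (the function name check_length_vs_size is made up for this example):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: contrasts the length of a string with its storage. */
static void check_length_vs_size(void)
{
    char abc[10] = "qwertyu";

    /* The length counts the characters before the terminating null. */
    assert(strlen(abc) == 7);

    /* The terminating null is still there, right after the 'u'. */
    assert(abc[7] == '\0');

    /* A string literal occupies its characters plus the '\0'. */
    assert(sizeof "qwertyu" == 8);

    /* The array size is independent of the string stored in it. */
    assert(sizeof abc == 10);
}
```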
 
