Question on strncmp / strnicmp use

Richard · Jan 16, 2009

Barry Schwarz said:
Why do you think that? I don't see any undefined behavior if the
strings are well formed.

I neglected to add "same length and same strings", because I wrongly
thought it was obvious. The comparison loop would go past
the nul.

Fred · Jan 16, 2009

Barry said:

(e-mail address removed) writes:
[...]
int mycomparison(const char *s1, const char *s2) {
for(; *s1 == *s2; s1++, s2++);
if(*s1 != *s2 && (*s1 == 0 || *s2 == 0)) return 0;
if((unsigned char)*s1 < *s2) return -1; else return (unsigned char)*s1 > *s2;
}
In C there are a few basic tests one does on each and every function
involving strings.
And strings the same length would be one such.
Your function would probably crash.

Why do you think that? I don't see any undefined behavior if the
strings are well formed.

Look at the `for' loop and think about mycomparison("Foo","Foo").

You also have to worry about either of the input strings being NULL.
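
For reference, here is a minimal sketch of how the loop could be bounded so that mycomparison("Foo","Foo") stops at the terminator. The name prefix_compare is hypothetical, not from the thread, and the sketch assumes neither argument is NULL (a caller who cannot guarantee that would need to check first). It also casts both operands to unsigned char before the ordering test, which the quoted code did for only one side.

```c
/* Hypothetical corrected form of the quoted mycomparison.
   Returns 0 when the strings are equal or the shorter one is a prefix
   of the longer one; otherwise -1 or 1 according to the first
   differing character. Assumes s1 and s2 are non-NULL strings. */
int prefix_compare(const char *s1, const char *s2)
{
    /* Stop at the end of s1 as well as at the first mismatch, so
       equal strings no longer run the loop past their terminators. */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }

    /* One string (or both) ended: the shorter is a prefix. */
    if (*s1 == '\0' || *s2 == '\0')
        return 0;

    /* Genuine mismatch: order by unsigned character value. */
    return (unsigned char)*s1 < (unsigned char)*s2 ? -1 : 1;
}
```

With this version, prefix_compare("Foo", "Foo") and prefix_compare("Foo", "Foobar") both return 0, while prefix_compare("Foo", "Fop") returns -1.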

Ben Bacarisse · Jan 16, 2009

Malcolm McLean said:
The loop charges past the nuls if the two strings are equal.

Yes, I pointed that out hours ago. I commented because it seemed odd,
after the bug had been identified, to suggest that testing with equal
length strings would have shown it up.

CBFalconer · Jan 17, 2009

C. J. Clegg said:
I have two null-terminated character strings, str1 and str2.
I don't know which one is longer, but I want to know if the
shortest one (whichever one that might be) matches the first
part of the longer one.

I can say:

if ( strncmp( str1, str2, strlen( str2 ) ) == 0 ||
strncmp( str2, str1, strlen( str1 ) ) == 0 )

but that seems a bit kludgy and inelegant.

Is there a better way?

Untested code follows:

/* return 1 for match over length of shorter string */
/* neither s1 nor s2 may be NULL */
int qmatch(const char *s1, const char *s2) {
while ((*s1 == *s2) && *s1) s1++, s2++;
if (!*s1 || !s2) return 1;
return 0;
}

Antoninus Twink · Jan 17, 2009

int qmatch(const char *s1, const char *s2) {
while ((*s1 == *s2) && *s1) s1++, s2++;
if (!*s1 || !s2) return 1;
return 0;
}

You missed a trick there - you could make it even harder to read by
replacing the last two lines by
return !*s1||!*s2;

(By the way, the !s2 in your original code can't be right: it's
impossible for s2 to be null at that point. Maybe you'd have spotted
that if your main concern hadn't been to compress the code as
much as possible.)
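
To make the point concrete: with `!s2` in the test, a call such as qmatch("abcdef", "abc") leaves the loop with *s1 == 'd' and *s2 == '\0', neither condition holds, and the function returns 0 even though "abc" is a prefix of "abcdef". A sketch of the function with the missing dereference restored follows; the name qmatch_fixed is only for illustration, and it still assumes both arguments are non-NULL.

```c
/* Sketch of qmatch with the pointer test replaced by a character test.
   Returns 1 when the shorter string matches the start of the longer
   one (equal strings and empty strings count as a match), else 0.
   Assumes s1 and s2 are non-NULL strings. */
int qmatch_fixed(const char *s1, const char *s2)
{
    /* Advance while the characters agree and s1 has not ended. */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }

    /* A match means at least one string ran out before a mismatch. */
    return *s1 == '\0' || *s2 == '\0';
}
```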

Bartc · Jan 17, 2009

CBFalconer said:
Untested code follows:

.....

Why is everyone keen on offering untested bits of code when all the OP wants
is a better way of using strncmp()?

Richard · Jan 17, 2009

Malcolm McLean said:
I checked through that code and spotted the missing *, but not the
logic error.

It goes to show, use array notation rather than travelling pointers,
where possible. It's much less easy to go wrong.

I often wonder if any of the c.l.c regs bother running their code
through a debugger. Just by watching the local change you SEE these
errors. But no. Because Kernighan once said (wrongly) how it's so much
easier to do it right the first time than hunt errors later they spurn
debuggers! Hell, Dicky discovered a bug in 50000 lines of code just by
reading the source! gdb needs discontinuing!

Richard · Jan 17, 2009

Bartc said:
....

Why is everyone keen on offering untested bits of code when all the OP
wants is a better way of using strncmp()?

And both "regs" posting untested incorrect code. I find it worrying.

Richard · Jan 17, 2009

Malcolm McLean said:
I don't use debuggers much. The problem is that I write code for a
wide variety of different platforms, some of which, being parallel,
are inherently hard to provide debugging tools for. If you use a
debugger routinely then that becomes the way of working. It is then
hard to switch to something else - a bit like moving between automatic
and manual cars. So I prefer to debug with diagnostic printfs.

Diagnostic printfs are amateurish and a cause of bugs in my opinion. It
might seem harsh but the very nature of writing them, compiling them in,
reading the logs and removing them is a cause of error and is a waste of
time and effort. If there is NO debugger for your system then fine. But
gdb works with most if not all (well, obviously not all....).

But it's not just me, of course. Richard Stallman wrote gdb for a
reason.

Here:

Why Not Use Printf?
===================
http://dirac.org/linux/gdb/01-Introduction.php#whynotuseprintf()

I know I go on about it, but I have rarely, if ever, seen anyone debug a
system with printf quicker than learning how to use a proper debugger
properly. It is simply ludicrous NOT to use one in this day and age on
large code bases where HW breakpoints and complicated watch points make
it trivial to catch complex state which triggers bugs.

When you catch a bug you can move up and down the stack to see the
variables which contributed to that bug. No wads of printf logs to wade
through. You can often rewind the code. You can change locals to trigger
the bug etc etc etc. Only people who do NOT know what a debugger is or
know how to use one properly argue against these benefits.

But we've discussed this a million times before. If someone wants to
blinker themselves and NOT learn how to use something as potentially
powerful as GDB then that's their decision. A silly one.

Ben Bacarisse · Jan 17, 2009

Bartc said:
....

Why is everyone keen on offering untested bits of code when all the OP
wants is a better way of using strncmp()?

Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.
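
Wrapped up as a complete function, the two lines above might look like the sketch below. The function name shorter_is_prefix is hypothetical and the sketch assumes both arguments are non-NULL, null-terminated strings. Note that it still walks both strings in full via strlen before comparing anything.

```c
#include <string.h>

/* Sketch: returns 1 when the shorter of the two strings matches the
   first part of the longer one, 0 otherwise.
   Assumes str1 and str2 are non-NULL, null-terminated strings. */
int shorter_is_prefix(const char *str1, const char *str2)
{
    size_t l1 = strlen(str1), l2 = strlen(str2);

    /* Compare only as many bytes as the shorter string holds.
       A length of 0 makes memcmp return 0, so an empty string
       counts as a prefix of anything. */
    return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;
}
```

The original test would then shrink to something like `if (shorter_is_prefix(str1, str2))`.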

Ben Bacarisse · Jan 17, 2009

Richard said:
And both "regs" posting untested incorrect code. I find it worrying.

So Malcolm and I are not "regs" then. Oh well... I find it worrying
that you rarely post helpful code. No one could estimate your error
rate because the sample is so small.

Richard · Jan 17, 2009

Ben Bacarisse said:
So Malcolm and I are not "regs" then. Oh well... I find it worrying
that you rarely post helpful code. No one could estimate your error
rate because the sample is so small.

Both regs meant the two who posted faulty code. You are not a problem in
c.l.c. IMO.

I have posted code many times. Often with errors when I too was too lazy
to test it. Generally not.

I would only post code if I thought it made a difference to solutions
already offered. e.g the function pointer code where I did write it from
scratch purely for Bill AND tested it. Fully compilable code is not
always the solution - snippets or suggestions to make the poster think
are often better.

I do not plead innocence or godlike powers of perfection.

Richard · Jan 17, 2009

Ben Bacarisse said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

And your views on two strlen operations on potentially massive buffers?

Ben Bacarisse · Jan 17, 2009

Malcolm McLean said:
Yours invades the reserved namespace.

Oh drat. I am sure you are right (I can't find my post at the moment)
but I intended to call it is_prefix. Sorry.

There is also the more philosophical question of whether two empty
strings are prefixes to each other.

.... or indeed one empty string to any other string. The mathematician
in me says yes to both.

Ben Bacarisse · Jan 17, 2009

I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

And your views on two strlen operations on potentially massive
buffers?

That can't be a serious question after I wrote about the pros and
cons of using two strlens vs. a hand-rolled version.

I am not sure there is a way to write it using only library functions so
that you never look further than the shorter of the two strings. I am
in a rush at the moment, but nothing occurs to me.

Keith Thompson · Jan 17, 2009

Ben Bacarisse said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

There may be an opportunity for a performance improvement here.
Calling strlen() on both strings means scanning both strings for their
entire lengths. If one string is much longer than the other, this is
wasteful; scanning both strings in parallel and quitting when reaching
the '\0' terminator for either one would be more efficient, letting
you ignore most of the longer string.

On the other hand, this won't help much if having one string much
longer than the other is a rare case; also, strlen() might take
advantage of low-level optimizations that aren't available in user
code.

Ben Bacarisse · Jan 17, 2009

Keith Thompson said:
There may be an opportunity for a performance improvement here.
Calling strlen() on both strings means scanning both strings for their
entire lengths. If one string is much longer than the other, this is
wasteful; scanning both strings in parallel and quitting when reaching
the '\0' terminator for either one would be more efficient, letting
you ignore most of the longer string.

On the other hand, this won't help much if having one string much
longer than the other is a rare case; also, strlen() might take
advantage of low-level optimizations that aren't available in user
code.

Agreed. So much so that I said much the same when replying to vipstar
although I am sure you've said it more clearly.

Keith Thompson · Jan 17, 2009

Ben Bacarisse said:
Agreed. So much so that I said much the same when replying to vipstar
although I am sure you've said it more clearly.

Ah, I was wondering if someone else had made the same point.
It's hard to keep track of everything that's posted here.

Bartc · Jan 17, 2009

Ben said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

I was exaggerating a little with 'everyone'.

return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

And yes memcmp is another way to go, probably better as just an equal/not
equal is needed.

Ben Bacarisse · Jan 18, 2009

Bartc said:
Ben Bacarisse wrote:

And yes memcmp is another way to go, probably better as just an
equal/not equal is needed.

memcmp gives you the same three-way result as strncmp. The reason I
suggest using it is that there is no need to stop at the first null
(something strncmp must check for) since we pass the minimum of the
two lengths. That suggests it may be a shade faster.
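
A small sketch of the difference in behaviour described above, using two made-up 4-byte buffers that differ only after an embedded null. The buffer and function names are invented for the illustration. It says nothing about speed; it only shows that strncmp stops at the first null while memcmp compares every byte it is given.

```c
#include <string.h>

/* Illustrative data: identical up to and including the first '\0',
   different in the byte that follows it. */
static const char buf_a[4] = { 'a', 'b', '\0', 'x' };
static const char buf_b[4] = { 'a', 'b', '\0', 'y' };

/* strncmp does not compare characters that follow a null, so it
   never reaches the differing byte and reports the buffers equal. */
int equal_by_strncmp(void)
{
    return strncmp(buf_a, buf_b, sizeof buf_a) == 0;
}

/* memcmp compares all 4 bytes and therefore sees the difference. */
int equal_by_memcmp(void)
{
    return memcmp(buf_a, buf_b, sizeof buf_a) == 0;
}
```

When the length passed is the minimum of the two strlen results, no null lies inside the compared range, so the two functions agree on the answer.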
