Question on strncmp / strnicmp use

Richard

Barry Schwarz said:
Why do you think that? I don't see any undefined behavior if the
strings are well formed.

I neglected to add "same length and same strings". I thought it was
obvious, and I was wrong. The comparison loop would go past the nul.
 
Fred

Barry said:
(e-mail address removed) writes:
 [...]
int mycomparison(const char *s1, const char *s2) {
    for(; *s1 == *s2; s1++, s2++);
    if(*s1 != *s2 && (*s1 == 0 || *s2 == 0)) return 0;
    if((unsigned char)*s1 < *s2) return -1; else return (unsigned char)
*s1 > *s2;
}
In C there are a few basic tests one does on each and every function
involving strings.
And strings the same length would be one such.
Your function would probably crash.
Why do you think that?  I don't see any undefined behavior if the
strings are well formed.

     Look at the `for' loop and think about mycomparison("Foo","Foo").


You also have to worry about either of the input strings being NULL.
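For anyone following along, here is a minimal sketch of how the loop could test for the terminator so that mycomparison("Foo","Foo") stops at the nul. The name prefix_compare is hypothetical, and the return convention (0 when either string is a prefix of the other, otherwise -1 or 1) is only my reading of the quoted code. It assumes both pointers are non-NULL.

```c
/* Hypothetical rewrite of the quoted mycomparison.
   Returns 0 if either string is a prefix of the other, otherwise
   -1 or 1 according to the first differing character.
   Assumes s1 and s2 are both non-NULL. */
int prefix_compare(const char *s1, const char *s2)
{
    /* Stop at the terminator as well as at the first mismatch. */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* One of the strings ended, so the shorter one is a prefix. */
    if (*s1 == '\0' || *s2 == '\0')
        return 0;
    /* Compare as unsigned char on both sides. */
    return (unsigned char)*s1 < (unsigned char)*s2 ? -1 : 1;
}
```

With this loop, two equal strings leave both pointers on their terminators and the function returns 0 without reading past either one.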
 
Ben Bacarisse

Malcolm McLean said:
The loop charges past the nuls if the two strings are equal.

Yes, I pointed that out hours ago. I commented because it seemed odd,
after the bug had been identified, to suggest that testing with equal
length strings would have shown it up.
 
CBFalconer

C. J. Clegg said:
I have two null-terminated character strings, str1 and str2.
I don't know which one is longer, but I want to know if the
shortest one (whichever one that might be) matches the first
part of the longer one.

I can say:

if ( strncmp( str1, str2, strlen( str2 ) ) == 0 ||
strncmp( str2, str1, strlen( str1 ) ) == 0 )

but that seems a bit kludgy and inelegant.

Is there a better way?

Untested code follows:

/* return 1 for match over length of shorter string */
/* neither s1 nor s2 may be NULL */
int qmatch(const char *s1, const char *s2) {
while ((*s1 == *s2) && *s1) s1++, s2++;
if (!*s1 || !s2) return 1;
return 0;
}
 
Antoninus Twink

int qmatch(const char *s1, const char *s2) {
while ((*s1 == *s2) && *s1) s1++, s2++;
if (!*s1 || !s2) return 1;
return 0;
}

You missed a trick there - you could make it even harder to read by
replacing the last two lines by
return !*s1||!*s2;

(By the way, the !s2 in your original code can't be right: it's
impossible for s2 to be null at that point. Maybe you'd have spotted
that if your main concern hadn't been to compress the code as
much as possible.)
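For reference, a sketch of qmatch with the missing * restored and the loop spread over a few more lines. The behaviour is intended to match the comment in the original post (return 1 for a match over the length of the shorter string). The layout is just one way to write it.

```c
/* Return 1 if the shorter string matches the start of the longer one,
   0 otherwise. Neither s1 nor s2 may be NULL. */
int qmatch(const char *s1, const char *s2)
{
    /* Advance while the characters agree and s1 has not ended.
       If s2 ends first, the comparison fails and the loop stops. */
    while (*s1 != '\0' && *s1 == *s2) {
        s1++;
        s2++;
    }
    /* Either string ended: everything before that point matched. */
    return *s1 == '\0' || *s2 == '\0';
}
```

With the original `!s2` test, a call such as qmatch("abcdef", "abc") returns 0, because s2 itself is never a null pointer at that point. Testing `*s2` gives 1 for that call.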
 
Richard

Malcolm McLean said:
I checked through that code and spotted the missing *, but not the
logic error.

It goes to show, use array notation rather than travelling pointers,
where possible. It's much less easy to go wrong.

I often wonder if any of the c.l.c regs bother running their code
through a debugger. Just by watching the local change you SEE these
errors. But no. Because Kernighan once said (wrongly) how it's so much
easier to do it right the first time than hunt errors later they spurn
debuggers! Hell, Dicky discovered a bug in 50000 lines of code just by
reading the source! gdb needs discontinuing!
 
Richard

Bartc said:
....

Why is everyone keen on offering untested bits of code when all the OP
wants is a better way of using strncmp()?

And both "regs" posting untested incorrect code. I find it worrying.
 
Richard

Malcolm McLean said:
I don't use debuggers much. The problem is that I write code for a
wide variety of different platforms, some of which, being parallel,
are inherently hard to provide debugging tools for. If you use a
debugger routinely then that becomes the way of working. It is then
hard to switch to something else - a bit like moving between automatic
and manual cars. So I prefer to debug with diagnostic printfs.

Diagnostic printfs are amateurish and a cause of bugs in my opinion. It
might seem harsh but the very nature of writing them, compiling them in,
reading the logs and removing them is a cause of error and is a waste of
time and effort. If there is NO debugger for your system then fine. But
gdb works with most if not all (well, obviously not all....).

But it's not just me, of course. Richard Stallman wrote gdb for a
reason.

Here:

Why Not Use Printf?
===================
http://dirac.org/linux/gdb/01-Introduction.php#whynotuseprintf()

I know I go on about it, but I have rarely, if ever, seen anyone debug a
system with printf quicker than learning how to use a proper debugger
properly. It is simply ludicrous NOT to use one in this day and age on
large code bases where HW breakpoints and complicated watch points make
it trivial to catch complex state which triggers bugs.

When you catch a bug you can move up and down the stack to see the
variables which contributed to that bug. No wads of printf logs to wade
through. You can often rewind the code. You can change locals to trigger
the bug etc etc etc. Only people who do NOT know what a debugger is or
know how to use one properly argue against these benefits.

But we've discussed this a million times before. If someone wants to
blinker themselves and NOT learn how to use something as potentially
powerful as GDB then that's their decision. A silly one.
 
Ben Bacarisse

Bartc said:
....

Why is everyone keen on offering untested bits of code when all the OP
wants is a better way of using strncmp()?

Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.
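Spelled out as a complete function, the two lines above might look like the sketch below. The name either_is_prefix and the int return type are assumptions, since the surrounding function is elided in the post.

```c
#include <string.h>

/* Hypothetical wrapper around the two quoted lines.
   Returns 1 if the shorter of str1 and str2 matches the start of the
   longer one, 0 otherwise. Both arguments must be non-NULL strings. */
int either_is_prefix(const char *str1, const char *str2)
{
    size_t l1 = strlen(str1), l2 = strlen(str2);
    /* Compare only as many bytes as the shorter string holds. */
    return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;
}
```

An empty string gives a length of 0, and memcmp over zero bytes returns 0, so an empty string counts as a prefix of anything.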
 
Ben Bacarisse

Richard said:
And both "regs" posting untested incorrect code. I find it worrying.

So Malcolm and I are not "regs" then. Oh well... I find it worrying
that you rarely post helpful code. No one could estimate your error
rate because the sample is so small.
 
Richard

Ben Bacarisse said:
So Malcolm and I are not "regs" then. Oh well... I find it worrying
that you rarely post helpful code. No one could estimate your error
rate because the sample is so small.

Both regs meant the two who posted faulty code. You are not a problem in
c.l.c. IMO.

I have posted code many times. Often with errors when I too was too lazy
to test it. Generally not.

I would only post code if I thought it made a difference to solutions
already offered, e.g. the function pointer code, which I did write from
scratch purely for Bill AND tested. Fully compilable code is not
always the solution - snippets or suggestions to make the poster think
are often better.

I do not plead innocence or godlike powers of perfection.
 
Richard

Ben Bacarisse said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

And your views on two strlen operations on potentially massive buffers?
 
Ben Bacarisse

Malcolm McLean said:
Yours invades the reserved namespace.

Oh drat. I am sure you are right (I can't find my post at the moment)
but I intended to call it is_prefix. Sorry.
There is also the more philosophical question of whether two empty
strings are prefixes to each other.

.... or indeed one empty string to any other string. The mathematician
in me says yes to both.
 
Ben Bacarisse

I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

And your views on two strlen operations on potentially massive
buffers?

That can't be a serious question after I wrote about the pros and
cons of using two strlens vs. a hand-rolled version.

I am not sure there is a way to write it using only library functions so
that you never look further than the shorter of the two strings. I am
in a rush at the moment, but nothing occurs to me.
 
Keith Thompson

Ben Bacarisse said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

...
size_t l1 = strlen(str1), l2 = strlen(str2);
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

Yes, it does not use strncmp but you could use it in place of memcmp
with very similar performance. memcmp might be a shade faster.

There may be an opportunity for a performance improvement here.
Calling strlen() on both strings means scanning both strings for their
entire lengths. If one string is much longer than the other, this is
wasteful; scanning both strings in parallel and quitting when reaching
the '\0' terminator for either one would be more efficient, letting
you ignore most of the longer string.

On the other hand, this won't help much if having one string much
longer than the other is a rare case; also, strlen() might take
advantage of low-level optimizations that aren't available in user
code.
 
Ben Bacarisse

Keith Thompson said:
There may be an opportunity for a performance improvement here.
Calling strlen() on both strings means scanning both strings for their
entire lengths. If one string is much longer than the other, this is
wasteful; scanning both strings in parallel and quitting when reaching
the '\0' terminator for either one would be more efficient, letting
you ignore most of the longer string.

On the other hand, this won't help much if having one string much
longer than the other is a rare case; also, strlen() might take
advantage of low-level optimizations that aren't available in user
code.

Agreed. So much so that I said much the same when replying to vipstar
although I am sure you've said it more clearly.
 
Keith Thompson

Ben Bacarisse said:
Agreed. So much so that I said much the same when replying to vipstar
although I am sure you've said it more clearly.

Ah, I was wondering if someone else had made the same point.
It's hard to keep track of everything that's posted here.
 
Bartc

Ben said:
Not everyone. My code was tested. I thought the "better way of using
strncmp" was obvious from my reply to your message but if not I think
the "better strncmp" method is:

I was exaggerating a little with 'everyone'.
return memcmp(str1, str2, l1 > l2 ? l2 : l1) == 0;

And yes memcmp is another way to go, probably better as just an equal/not
equal is needed.
 
Ben Bacarisse

Bartc said:
Ben Bacarisse wrote:

And yes memcmp is another way to go, probably better as just an
equal/not equal is needed.

memcmp gives you the same three-way result as strncmp. The reason I
suggest using it is that there is no need to stop at the first null
(something strncmp must check for) since we pass the minimum of the
two lengths. That suggests it may be a shade faster.
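To make the difference concrete, here is a small sketch with two buffers that agree up to and including an embedded '\0' and differ only after it. The buffer contents and the two helper names are made up for illustration. When the length passed is the minimum of the two strlen results, no '\0' lies inside the compared range, so the two functions agree there.

```c
#include <string.h>

/* Two 4-byte buffers that agree up to and including an embedded '\0'
   and differ only in the final byte. Illustrative data only. */
static const char a[4] = { 'a', 'b', '\0', 'x' };
static const char b[4] = { 'a', 'b', '\0', 'y' };

/* strncmp stops at the '\0', so the final bytes are never compared. */
int strncmp_result(void)
{
    return strncmp(a, b, sizeof a);
}

/* memcmp compares all four bytes and sees 'x' before 'y'. */
int memcmp_result(void)
{
    return memcmp(a, b, sizeof a);
}
```

Here strncmp_result() is 0 while memcmp_result() is negative, which shows the extra terminator check that strncmp has to make on every character.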
 
