Mystery: static variables & performance

CBFalconer

nrk said:
.... snip ...

PS: Your book, section 4.2: The WAR style example of strcmp is
atrocious IMO. If you must insist on a single return, here's a
clearer version of strcmp:

int strcmp(const char *s1, const char *s2) {
    while ( *s1 == *s2 && *s1 )
        ++s1, ++s2;

    return *s1 - *s2;
}

If sizeof(int) == 1, that code has an incipient bug.
 
R. Rajesh Jeba Anbiah

nrk said:
Rajesh, it is unfortunate that in spite of spending around 4 years in
CLC, you haven't realized why the regulars are sticklers for topicality and
for questions related to the ANSI C Standard. It is not because C is a toy
language. It is not because all (or even most) regulars here are old
people who don't want to dig any more. It simply has to do with the fact
that C is a very versatile and pervasive language. If one were to start
answering any and all questions merely on the basis that they involved C in
some way, the group would become well-nigh useless. As they say, you can
either be a rather poor jack of all trades or be a good master of one.
This group is a good master at answering questions related to the C
language as it is described in the ANSI/ISO standards. By being so picky
about topicality, CLC actually does a favor to off-topic posters. By
forcing them to go to a domain that's richer with experts in their specific
issues it helps them get better quality help than they would get here.


The answers are right in front of your eyes. Don't wonder why you're not
able to see them as long as you have your eyes shut.

Respected Mr. Ram Kumar,

Thanks for your reply. But it seems that you didn't understand
what I meant, or vice versa. As you know, we have already received
these explanations before, but this is not what I meant as the
valid/rational explanation. Perhaps I should rephrase my sluggish
English with a better one---which I don't know now. Anyway, thanks a lot
for your concerns.


nrk said:
PS: Your book, section 4.2: The WAR style example of strcmp is atrocious
IMO. If you must insist on a single return, here's a clearer version of
strcmp:

int strcmp(const char *s1, const char *s2) {
    while ( *s1 == *s2 && *s1 )
        ++s1, ++s2;

    return *s1 - *s2;
}

Thanks a lot for your interest in the quality of the book. As you
see, it has its bug reporting corner; please don't take c.l.c as the
one. Thanks for your help; thanks for your understanding.

With lots & lots of wishes.

--
"I don't believe in the God who doesn't give me food, but shows me
heaven!" -- Swami Vivekanandha
If you live in USA, please support John Edwards.
http://guideme.itgo.com/atozofc/ - "A to Z of C" Project
Email: rrjanbiah-at-Y!com
 
Richard Heathfield

R. Rajesh Jeba Anbiah said:
Thanks a lot for your interest in the quality of the book. As you
see, it has its bug reporting corner; please don't take c.l.c as the
one. Thanks for your help; thanks for your understanding.

Since he /did/ post his "clearer version" to comp.lang.c, you should at
least get some feedback as to what is wrong with his correction. Do you see
the flaw? If not, then how can you do quality control on the bug reports
you receive?

comp.lang.c is good at this sort of thing.

(Hint: the problem I can see has nothing to do with the loop.)
 
pete

Richard said:
Since he /did/ post his "clearer version" to comp.lang.c, you should at
least get some feedback as to what is wrong with his correction. Do you see
the flaw? If not, then how can you do quality control on the bug reports
you receive?

comp.lang.c is good at this sort of thing.

(Hint: the problem I can see has nothing to do with the loop.)

Undefined behavior from integer overflow?
 
Peter Nilsson

pete said:
Undefined behavior from integer overflow?

That's one. But there is also the fact that it does not make comparisons
based on unsigned char values. [A requirement for the real strcmp().]
 
nrk

R. Rajesh Jeba Anbiah wrote:

Respected Mr. Ram Kumar,

Please feel free to address me as just ram or nrk.

Thanks for your reply. But it seems that you didn't understand
what I meant, or vice versa. As you know, we have already received
these explanations before, but this is not what I meant as the
valid/rational explanation. Perhaps I should rephrase my sluggish
English with a better one---which I don't know now. Anyway, thanks a lot
for your concerns.

Very well. I guess we have to just agree to disagree then.

Thanks a lot for your interest in the quality of the book. As you
see, it has its bug reporting corner; please don't take c.l.c as the
one. Thanks for your help; thanks for your understanding.

Apologies. I will send future reports through your web form. FWIW, this
seems to be a fine effort, although lacking a bit in polish (nothing that
can't be fixed by having enough eyes go through it).

-nrk.
 
nrk

Richard said:
Since he /did/ post his "clearer version" to comp.lang.c, you should at
least get some feedback as to what is wrong with his correction. Do you
see the flaw? If not, then how can you do quality control on the bug
reports you receive?

comp.lang.c is good at this sort of thing.

(Hint: the problem I can see has nothing to do with the loop.)

Ok, here's my take on this:

a) sizeof(int) > 1 for hosted implementations.
http://www.google.com/[email protected]
So, integer overflow not an issue, yes?

b) Peter's concern still remains. So, does changing the last line to:

return *(unsigned char *)s1 - *(unsigned char *)s2;

make it alright?

-nrk.
 
CBFalconer

Richard said:
Since he /did/ post his "clearer version" to comp.lang.c, you
should at least get some feedback as to what is wrong with his
correction. Do you see the flaw? If not, then how can you do
quality control on the bug reports you receive?

comp.lang.c is good at this sort of thing.

(Hint: the problem I can see has nothing to do with the loop.)

Nobody has picked up on the real problem when sizeof char == 1,
which is in the final comparison and can overflow. It should be:

return (*s1 > *s2) - (*s1 < *s2);

or the equivalent.
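
A minimal sketch of the comparison idiom above, isolated into a helper so the overflow argument is easier to follow. The name three_way is hypothetical and does not come from the thread. Each relational expression yields 0 or 1, so the result is always -1, 0, or 1, and the operands themselves are never subtracted:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: orders a against b without subtracting them,
   so no intermediate result can overflow. */
int three_way(int a, int b)
{
    int result = (a > b) - (a < b);   /* each comparison yields 0 or 1 */

    assert(result >= -1 && result <= 1);
    return result;
}
```

For example, three_way(INT_MIN, INT_MAX) evaluates to -1, whereas INT_MIN - INT_MAX would overflow. Note that this sketch only addresses overflow; it says nothing about whether the characters should be compared as plain char or as unsigned char.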
 
Richard Heathfield

nrk said:
Ok, here's my take on this:

a) sizeof(int) > 1 for hosted implementations.
http://www.google.com/[email protected]
So, integer overflow not an issue, yes?

b) Peter's concern still remains. So, does changing the last line to:

return *(unsigned char *)s1 - *(unsigned char *)s2;

make it alright?

Chuck's solution is a good one (although I'm not sure why he insists that
it's only relevant when sizeof(char) is 1, since sizeof(char) is /always/
1!).

But my point was more general than that; that is, comp.lang.c is very good
at chewing over proposed "corrections", making sure that they do actually
improve the code. Corrections you receive in email will not have undergone
that process of review, so how do you know whether you've spotted all the
problems therein?
 
CBFalconer

Richard said:
.... snip ...

Chuck's solution is a good one (although I'm not sure why he
insists that it's only relevant when sizeof(char) is 1, since
sizeof(char) is /always/ 1!).

Oh very well, said Tom sheepishly. I meant int. Bah humbug.
 
Peter Nilsson

nrk said:
Ok, here's my take on this:

a) sizeof(int) > 1 for hosted implementations.
http://www.google.com/[email protected]

The mere fact that Dan Pop says so does not make it so! ;)

There are actual members of the C Committee who disagree on this.

So, integer overflow not an issue, yes?

No. Even if sizeof(int) == 2, you can still have INT_MAX < UCHAR_MAX. [It's
the limits which are important, not the byte size.]

b) Peter's concern still remains. So, does changing the last line to:

return *(unsigned char *)s1 - *(unsigned char *)s2;

make it alright?

No. Reading chars via an unsigned char lvalue can produce a different value
to the original.

Since character constants and I/O are based on unsigned char -> int -> char
_conversions_ when storing plain char strings, the correct answer (assuming
no integer overflow) is to use a _conversion_ of the plain char value to
unsigned char...

return (unsigned char) *s1 - (unsigned char) *s2;

Note that on most implementations (8-bit chars, two's complement, no padding
bits) there is no need to go to this extreme, although the result is the same.

The most robust answer would seem to be...

return (unsigned char) *s1 > (unsigned char) *s2
- (unsigned char) *s1 < (unsigned char) *s2;

or...

return (unsigned char) *s1 < (unsigned char) *s2 ? -1
: (unsigned char) *s1 > (unsigned char) *s2;
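
A small sketch of the distinction drawn above between reading a char through an unsigned char lvalue and converting the plain char value to unsigned char. The function names via_lvalue and via_conversion are hypothetical. On two's complement implementations without padding bits the two are expected to agree; the difference would only show on more exotic representations:

```c
#include <assert.h>
#include <stddef.h>

/* Reads the stored byte of *p through an unsigned char lvalue
   (a reinterpretation of the object representation). */
unsigned char via_lvalue(const char *p)
{
    assert(p != NULL);
    return *(const unsigned char *)p;
}

/* Reads *p as a plain char value, then converts that value to
   unsigned char (reduction modulo UCHAR_MAX + 1 if it is negative). */
unsigned char via_conversion(const char *p)
{
    assert(p != NULL);
    return (unsigned char)*p;
}
```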
 
CBFalconer

Peter said:
.... snip ...

The most robust answer would seem to be...

return (unsigned char) *s1 > (unsigned char) *s2
- (unsigned char) *s1 < (unsigned char) *s2;

or...

return (unsigned char) *s1 < (unsigned char) *s2 ? -1
: (unsigned char) *s1 > (unsigned char) *s2;

Why cast to unsigned char? If native chars are signed, I would
want this routine to respect that. Casts are usually a sign of
evil doings.
 
Peter Nilsson

CBFalconer said:
Peter Nilsson wrote: ....

Oops...

return ((unsigned char) *s1 > (unsigned char) *s2)
     - ((unsigned char) *s1 < (unsigned char) *s2);

Why cast to unsigned char?

Because that is the specification for strcmp().

If native chars are signed, I would
want this routine to respect that.

If we weren't talking about strcmp, you would be free to do that. But you
may get some unexpected surprises, e.g. "a" > "aé".

Casts are usually a sign of evil doings.

You can do the same thing without casts, if you like.
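
One way the cast-free variant might look, as a sketch only. The name my_strcmp is hypothetical (the identifier strcmp itself is reserved for the library), and the use of two unsigned char locals is an assumption about what "without casts" could mean here: assignment performs the same conversion that the casts did.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for strcmp: compares the first differing pair
   of characters as unsigned char and never subtracts them. */
int my_strcmp(const char *s1, const char *s2)
{
    unsigned char c1, c2;

    assert(s1 != NULL && s2 != NULL);

    /* Advance while the characters match and the end is not reached. */
    while (*s1 == *s2 && *s1 != '\0') {
        ++s1;
        ++s2;
    }

    /* Assignment converts the plain char values to unsigned char. */
    c1 = *s1;
    c2 = *s2;

    return (c1 > c2) - (c1 < c2);
}
```

With this sketch, my_strcmp("a", "aé") is negative whether plain char is signed or unsigned, because the terminating null (0) is compared against the unsigned value of 'é'.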
 
R. Rajesh Jeba Anbiah

<snip>

Ram, sorry for my late follow-up; I was not feeling well for the past
2 days. Now I see the thread has some more useful info. Thanks.
 
CBFalconer

Peter said:
Because that is the specification for strcmp().


If we weren't talking about strcmp, you would be free to do that.
But you may get some unexpected surprises, e.g. "a" > "aé".

I still see no justification for the cast. I know of nothing that
specifies that strings consist of unsigned chars. However that is
a good argument for having the system specify char to be
unsigned. From N869:

7.21.4.2 The strcmp function

Synopsis
[#1]
#include <string.h>
int strcmp(const char *s1, const char *s2);

Description

[#2] The strcmp function compares the string pointed to by
s1 to the string pointed to by s2.

Returns

[#3] The strcmp function returns an integer greater than,
equal to, or less than zero, accordingly as the string
pointed to by s1 is greater than, equal to, or less than the
string pointed to by s2.
 
Arthur J. O'Dwyer

I still see no justification for the cast.

Look about two subsections earlier in N869, where it discusses the
semantics of comparison functions:

7.21.4 Comparison functions

[#1] The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by the
sign of the difference between the values of the first pair
of characters (both interpreted as unsigned char) that
differ in the objects being compared.

IMHO this is a silly requirement; I would *expect* memcmp to do
unsigned comparisons and 'strcmp' to do plain char comparisons,
but for whatever reason the C committee decided otherwise.

HTH,
-Arthur
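
A short sketch contrasting the two readings discussed in this exchange: ordering by plain char value versus ordering by the value "interpreted as unsigned char", as the 7.21.4 text quoted above puts it. The names sign_plain, sign_unsigned, and check_basic_characters are hypothetical. The two can only disagree when plain char is signed and one of the characters has a negative value:

```c
#include <assert.h>
#include <limits.h>

/* Sign of a comparison on plain char values (not what 7.21.4 describes). */
int sign_plain(char a, char b)
{
    return (a > b) - (a < b);
}

/* Sign of a comparison after converting both values to unsigned char. */
int sign_unsigned(char a, char b)
{
    unsigned char ua = (unsigned char)a;
    unsigned char ub = (unsigned char)b;

    return (ua > ub) - (ua < ub);
}

/* Members of the basic execution character set are nonnegative,
   so both orderings agree for them. */
void check_basic_characters(void)
{
    assert(sign_plain('a', 'b') == sign_unsigned('a', 'b'));
}
```

Whether sign_plain('\0', '\xe9') is positive or negative depends on whether CHAR_MIN is below zero on the implementation at hand; sign_unsigned gives -1 either way.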
 
Joe Wright

CBFalconer said:
Peter said:
Because that is the specification for strcmp().


If we weren't talking about strcmp, you would be free to do that.
But you may get some unexpected surprises, e.g. "a" > "aé".

I still see no justification for the cast. I know of nothing that
specifies that strings consist of unsigned chars. However that is
a good argument for having the system specify char to be
unsigned. From N869:

7.21.4.2 The strcmp function

Synopsis
[#1]
#include <string.h>
int strcmp(const char *s1, const char *s2);

Description

[#2] The strcmp function compares the string pointed to by
s1 to the string pointed to by s2.

Returns

[#3] The strcmp function returns an integer greater than,
equal to, or less than zero, accordingly as the string
pointed to by s1 is greater than, equal to, or less than the
string pointed to by s2.

Don't we have a guarantee that the characters in our set are positive? With
signed char, aren't they in the range 0..127 (ASCII)? I've read that EBCDIC
implementations, where characters can be > 127, make plain char unsigned
just so that characters remain positive.

If this is the case, subtracting one positive integer from another
cannot overflow.
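
The no-overflow argument can be sketched as follows, assuming both operands have already been reduced to unsigned char. The helper name uchar_difference is hypothetical. The subtraction is only safe when every unsigned char value fits in int, which ties back to the earlier point that the limits matter more than the byte size, so the sketch falls back to comparisons otherwise:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns a value whose sign orders a against b. */
int uchar_difference(unsigned char a, unsigned char b)
{
    int result;

#if UCHAR_MAX <= INT_MAX
    /* Both operands promote to int; the difference lies in
       [-UCHAR_MAX, UCHAR_MAX], which int can represent. */
    result = a - b;
#else
    /* Operands would promote to unsigned int, so compare instead. */
    result = (a > b) - (a < b);
#endif

    assert((result < 0) == (a < b));
    return result;
}
```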
 
CBFalconer

Arthur J. O'Dwyer said:
I still see no justification for the cast.

Look about two subsections earlier in N869, where it discusses the
semantics of comparison functions:

7.21.4 Comparison functions

[#1] The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by the
sign of the difference between the values of the first pair
of characters (both interpreted as unsigned char) that
differ in the objects being compared.

IMHO this is a silly requirement; I would *expect* memcmp to do
unsigned comparisons and 'strcmp' to do plain char comparisons,
but for whatever reason the C committee decided otherwise.

Aha - that justifies Peter Nilsson's attitude, and shoots down
mine. It does ensure that the shorter substring compares as less
than the longer.

It would be nice if such encompassing clauses were referenced in
the individual descriptions, i.e. "See also 7.21.4".
 
Ben Pfaff

pete said:
Peter said:
it does not make comparisons based on unsigned char values.
[A requirement for the real strcmp().]

I don't see that in the standard.

7.21.4 Comparison functions

1 The sign of a nonzero value returned by the comparison
functions memcmp, strcmp, and strncmp is determined by the
sign of the difference between the values of the first pair
of characters (both interpreted as unsigned char) that
differ in the objects being compared.
 
