Problem with gcc

J

jacob navia

When compiling with -Wall I always get the warning
../dictionary.c:156: warning: pointer targets in passing argument 1 to strcmp differ in signedness

I always use unsigned chars for my text data, since there are no negative characters, just character
codes. The problem is that strcmp expects chars, and gcc thinks that chars are signed by default, which is
all correct of course, but annoying.

Is there any way to convince it not to emit this warning?
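
For reference, the kind of code that draws the warning looks roughly like this (the
function name here is made up, not the real dictionary.c):

#include <string.h>

int lookup(const unsigned char *key)
{
    /* gcc -Wall warns here (wording varies by version): pointer targets in
       passing argument 1 of 'strcmp' differ in signedness */
    return strcmp(key, "hello") == 0;
}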


Thanks
 
B

Ben Pfaff

jacob navia said:
./dictionary.c:156: warning: pointer targets in passing argument 1 to strcmp differ in signedness [...]
Is there any way to convince it not to emit this warning?

-Wno-pointer-sign
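
For example, a typical invocation (adjust to your own build):

gcc -Wall -Wno-pointer-sign -c dictionary.c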
 
J

jacob navia

Ben Pfaff a écrit :
jacob navia said:
./dictionary.c:156: warning: pointer targets in passing argument 1 to strcmp differ in signedness [...]
Is there any way to convince it not to emit this warning?

-Wno-pointer-sign

OK, that worked, thanks
 
I

Ian Collins

jacob said:
When compiling with -Wall I always get the warning
../dictionary.c:156: warning: pointer targets in passing argument 1 to
strcmp differ in signedness

I always use unsigned chars for my text data, since there are no
negative characters, just character codes. The problem is that strcmp expects
chars, and gcc thinks that chars are signed by default, which is
all correct of course, but annoying.

How do you cope with string literals?
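
(The point of the question, as a sketch: a string literal has type char[], so an
unsigned char pointer cannot point at one without a cast, or the same diagnostic
appears:)

const unsigned char *greeting = (const unsigned char *) "hello";   /* cast needed */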
 
E

Eric Sosman

jacob said:
When compiling with -Wall I always get the warning
./dictionary.c:156: warning: pointer targets in passing argument 1 to
strcmp differ in signedness

I always use unsigned chars for my text data, since there are no
negative characters, just character codes. The problem is that strcmp expects
chars, and gcc thinks that chars are signed by default, which is
all correct of course, but annoying.

Is there any way to convince it not to emit this warning?

Use char.

"If you lie to the compiler, it will get its revenge."
-- Henry Spencer
 
J

jacob navia

Eric Sosman a écrit :
Use char.

"If you lie to the compiler, it will get its revenge."
-- Henry Spencer

I do not want to have negative characters!
They are NOT integers, that is why they are UNSIGNED.

If you limit yourself to ASCII, it could be OK, but I do not want just
ASCII.
 
I

Ian Collins

jacob said:
Eric Sosman a écrit :

I do not want to have negative characters!
They are NOT integers, that is why they are UNSIGNED.

If you limit yourself to ASCII, it could be OK, but I do not want just
ASCII.

So how do you handle string literals?

How do you manage warnings from other compilers? You are producing
non-portable code.
 
J

jacob navia

Richard Heathfield a écrit :
Um, signed or unsigned, characters are integers whether you like it or
not. Presumably you mean that you are thinking of them as text glyphs
rather than as integers (which is fine, by the way - lots of us do
that some or all of the time). I don't see why you think making them
unsigned stops them from being integers.

<snip>

They are integers, from zero to 2^CHAR_BIT - 1.

I.e. they are non-negative integer codes.

Since I need to avoid sign extension when manipulating bits, I used unsigned
integers throughout the bit string package. Converting signed chars into
unsigned integers can produce all kinds of nonsense.

I standardized on unsigned char throughout the container library. There
is NO system that I know of where pointers to signed and unsigned chars
differ in size or representation!
 
A

Alan Curry

jacob navia said:
I do not want to have negative characters!

Can you give an example of something that doesn't work with plain char
specifically because some characters are negative? I think that can only
happen if you are making bad assumptions.
 
J

jacob navia

Alan Curry a écrit :
Can you give an example of something that doesn't work with plain char
specifically because some characters are negative? I think that can only
happen if you are making bad assumptions.

I assume characters are codes from one to 255. Maybe that is a bad assumption,
in some kind of weird logic that assigns a sign to a character code.

There is a well established confusion in C between characters (that are
encoded as integers) and integer VALUES.

One of the reasons is that we have "signed" and "unsigned" characters.

I prefer not to use any sign in the characters, and treat 152 as character
code 152 and not as -104. Stupid me, I know.

Besides, when I convert it into a bigger type, I would like to get
152, and not 4294967192.

Of course, when YOU see 4294967192 you think immediately:

Ahhh of course, that is character code 152 that got converted into an int, then cast
to unsigned and got that weird value...

Since size_t is unsigned, converting to unsigned is a fairly common operation.

Or when comparing, I get

warning: "comparison between signed and unsigned".

And MANY other bugs and stuff I do not want to get involved with. Writing software
is difficult enough without having to bother with the sign of characters or the
sex of angels, or the number of demons you can fit in a pin's head.
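
A small sketch of the conversions I mean (the output shown assumes the usual setup:
signed 8-bit plain char and 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    char c = (char) 152;     /* implementation-defined when char is signed; typically -104 */
    unsigned char uc = 152;

    printf("%u\n", (unsigned) c);    /* typically prints 4294967192 */
    printf("%u\n", (unsigned) uc);   /* prints 152 */
    return 0;
}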
 
B

Ben Bacarisse

Alan Curry said:
Can you give an example of something that doesn't work with plain char
specifically because some characters are negative? I think that can only
happen if you are making bad assumptions.

The most annoying one is using the character class tests isxxxx().
Technically, a cast is needed to be portable:

char *cp = ...;
...
if (isdigit((unsigned char)*cp)) ...

So far, I have not found any implementation that does not handle this
correctly. So much code exists without the cast that C libraries
running on machines with signed char make sure that nothing bad
happens. Still, I think it is an example that matches what you asked
for, though not a strong one since the solution is simple.

I don't like using unsigned char for plain strings since it suggests
other uses. As a result, I end up putting the cast in where it's
needed.
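
A more complete sketch of the cast in use (count_digits is just an illustrative
name, not from any library):

#include <ctype.h>
#include <stdio.h>

/* Count decimal digits in a string, casting each char to unsigned char
   before the isdigit() test so a negative value is never passed. */
static size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char) *s)) {
            n++;
        }
    }
    return n;
}

int main(void)
{
    printf("%zu\n", count_digits("abc123"));   /* prints 3 */
    return 0;
}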
 
A

Alan Curry

jacob navia said:

I assume characters are codes from one to 255. Maybe that is a bad assumption,
in some kind of weird logic that assigns a sign to a character code.

So don't do that. If the values are relevant at all, you should be using
unsigned char explicitly, not plain char.
There is a well established confusion in C between characters (that are
encoded as integers) and integer VALUES.

Indeed, you can get confused if you rely too much on the fact that char is an
integer type.
Besides, when I convert it into a bigger type, I would like to get
152, and not 4294967192.

There's an easy answer for that: never convert plain char to a bigger type.

My rule on plain chars is that they should only be used for real characters,
which are things that are read from and/or written to a text stream. If your
char variable is not really a character (i.e. it didn't come from a text
stream and it will never be printed to a text stream) it should be declared
explicitly as signed or unsigned.

The standard library does add some confusion with the ctype.h functions that
work on characters as characters but require them to be unsigned. Don't look
in ctype.h for examples of good design.
Since size_t is unsigned, converting to unsigned is a fairly common operation.

How does a character value (which is charset-dependent anyway) become a size?
I can't see how that makes sense.
warning: "comparison between signed and unsigned".

I see a lot of those when compiling other people's code, and sometimes my own
too, and usually I fix it by changing whichever thing was signed to unsigned,
and this is usually an improvement.

I've done that so many times, it makes me think that perhaps C got the
default integer signedness wrong. If plain int, short, and long had all been
unsigned, with the "signed" keyword being required to declare signed
variables, there might be fewer problems.
 
S

Seebs

jacob navia said:
When compiling with -Wall I always get the warning
./dictionary.c:156: warning: pointer targets in passing argument 1 to strcmp differ in signedness

Yes, they do.
I always use unsigned chars for my text data, since there are no negative characters, just character
codes. The problem is that strcmp expects chars, and gcc thinks that chars are signed by default, which is
all correct of course, but annoying.
Yup.

Is there any way to convince it not to emit this warning?

Cast arguments to the type strcmp expects. Or use 'char' for text data,
since it is the native type for text data, and use 'unsigned char' when
you want to manipulate raw bits.
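
One way to keep the cast in a single place is a small wrapper (ustrcmp is a
made-up name, not a library function):

#include <string.h>

/* Compare two unsigned-char strings with strcmp(), casting only here so the
   rest of the code can keep using unsigned char. */
static int ustrcmp(const unsigned char *a, const unsigned char *b)
{
    return strcmp((const char *) a, (const char *) b);
}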

-s
 
S

Seebs

jacob navia said:
I do not want to have negative characters!

You won't on a machine where character values are never negative.
They are NOT integers, that is why they are UNSIGNED.

Apparently, they are not.
If you limit yourself to ASCII, it could be OK, but I do not want just
ASCII.

That's fine, plain char should be able to hold those values fine on any
system.

To be picky, BTW, even on a system where plain char is unsigned, unsigned
char and plain char are two different types.
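
A quick way to see that, sketched: the assignment below still requires a
diagnostic because the pointer types are incompatible, even if plain char is
made unsigned (e.g. with gcc's -funsigned-char):

void f(void)
{
    char *p = 0;
    unsigned char *q = p;   /* diagnostic: incompatible pointer types
                               (or "differ in signedness", wording varies) */
    (void) q;
}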

-s
 
K

Keith Thompson

Seebs said:
You won't on a machine where character values are never negative.


Apparently, they are not.


That's fine, plain char should be able to hold those values fine on any
system.

Not necessarily. I've used plenty of systems that support, among
other character sets, Latin-1 (ISO 8859-1), which uses the full 8-bit
range from 0 to 255, but on which plain char is 8 bits and signed.
On such a system, with the right locale settings, this program:

#include <stdio.h>
int main(void)
{
    const char *s = "This is a Yen sign: '\xa5'";
    puts(s);
    return 0;
}

will produce the expected output, even though puts takes an argument
of type "const char*", not "const unsigned char*".

The thing is, we tend to depend on this kind of thing to Just Work,
but I'd have to go through several sections of the standard to figure
out just what's guaranteed (I'll do that later).
To be picky, BTW, even on a system where plain char is unsigned, unsigned
char and plain char are two different types.

Yup.
 
N

Nick

jacob navia said:
Richard Heathfield a écrit :

They are integers, from zero to 2^CHAR_BIT - 1.

I.e. they are non-negative integer codes.

If you are viewing them as a collection of bytes, rather than as
strings, you really should be using memcmp instead of strcmp. Of
course, you need to know the length.

But C doesn't have a native "collection of non-negative integers from zero
to 2^CHAR_BIT - 1 terminated by a zero" function, so won't have native
functions to operate on them either.
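
A sketch of the memcmp approach (same_key and its parameters are invented for
illustration):

#include <string.h>

/* memcmp() takes void * arguments, so there is no signedness warning, but the
   length n must be tracked separately; memcmp() does not stop at '\0'. */
static int same_key(const unsigned char *a, const unsigned char *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}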
 
N

Nick

Nick said:
If you are viewing them as a collection of bytes, rather than as
strings, you really should be using memcmp instead of strcmp. Of
course, you need to know the length.

But C doesn't have a native "collection of non-negative integers from zero
to 2^CHAR_BIT - 1 terminated by a zero" function, so won't have native
functions to operate on them either.

That first "function" there is wrong, and really the paragraph would be
better written as:

But C doesn't have the concept of a "collection of non-negative integers from
zero to 2^CHAR_BIT - 1 terminated by a zero", so won't have functions to
operate on them either.
 
B

bartc

jacob navia said:
Alan Curry a écrit :

I assume characters are codes from one to 255. Maybe that is a bad assumption,
in some kind of weird logic that assigns a sign to a character code.

There is a well established confusion in C between characters (that are
encoded as integers) and integer VALUES.

One of the reasons is that we have "signed" and "unsigned" characters.

Yes, there should have been signed and unsigned byte. And a separate char
type equivalent to (or a synonym for) unsigned byte.

It really is exasperating when most people in this group insist that signed
character codes are perfectly normal and sensible!

Apparently chars are signed because on the PDP-11 or some such machine,
sign-extending byte values was faster than zero-extending them. A bit
shortsighted. (If it had been the other way around, they would of course
have been singing the praises of unsigned char codes; except they would have
been justified this time...)
I prefer not to use any sign in the characters, and treat 152 as character
code 152 and not as -104. Stupid me, I know.

As I understand it, you can easily choose to use unsigned char type for such
codes. The problem being when passing these to library functions where char
is signed and this triggers a warning?
Besides, when I convert it into a bigger type, I would like to get
152, and not 4294967192.

Why doesn't widening a signed value into an unsigned one itself trigger a
warning?
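
(As an aside, gcc does have a separate option for that: -Wsign-conversion, which
is not part of -Wall, flags implicit conversions that may change the sign of a
value. A tiny sketch:)

void g(void)
{
    char c = (char) 152;
    unsigned int u = c;   /* gcc -Wsign-conversion warns: the conversion may
                             change the sign of the result */
    (void) u;
}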
 
J

John Kelly

Ben Bacarisse said:
The most annoying one is using the character class tests isxxxx().
Technically, a cast is needed to be portable:

char *cp = ...;
...
if (isdigit((unsigned char)*cp)) ...


And if testing in a loop, you may want to cast separately from the test.
Like in this trim function:



#include <ctype.h>

static void
trim (char **ts)
{
    unsigned char *exam;
    unsigned char *keep;

    /* Work through unsigned char pointers so isspace() never sees
       a negative value. */
    exam = (unsigned char *) *ts;

    /* Skip leading whitespace by advancing the caller's pointer. */
    while (*exam && isspace (*exam)) {
        ++exam;
    }
    *ts = (char *) exam;
    if (!*exam) {
        return;
    }

    /* Find the last non-whitespace character ... */
    keep = exam;
    while (*++exam) {
        if (!isspace (*exam)) {
            keep = exam;
        }
    }
    /* ... and cut the string just after it. */
    if (*++keep) {
        *keep = '\0';
    }
}
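
A small usage sketch (hypothetical): trim() moves the caller's pointer past
leading whitespace and cuts trailing whitespace in place, so the buffer must be
writable:

#include <stdio.h>

int main(void)
{
    char buf[] = "  hello world  ";
    char *s = buf;

    trim(&s);
    puts(s);   /* prints "hello world" */
    return 0;
}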
 
