A question about pointer of array.

Flash Gordon

James Kuyper wrote, On 21/12/08 18:49:
Anthony said:
Keith said:
"Anthony Fremont" <[email protected]> writes:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)

Why would "plain" char make more sense? Leaving it unspecified leaves
it up to the implementation and means that I would need to cast it to
unsigned anyway. I don't get it.

Why would you need to convert to unsigned?

Because the values get passed to tolower which expects either EOF or a
positive value. In this case all the characters are in the basic
character set and so guaranteed to be positive whether char is the same
as signed or unsigned char, but one assumes that those are not
necessarily the strings to be used in the "real" program.
 
Keith Thompson

James Kuyper said:
Anthony said:
Keith said:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified
leaves it up to the implementation and means that I would need to
cast it to unsigned anyway. I don't get it.

Why would you need to convert to unsigned?

I think he meant to say that he would need to cast it to unsigned
char, which is advisable when passing a char value to tolower().
 
Keith Thompson

Flash Gordon said:
James Kuyper wrote, On 21/12/08 18:49:
Anthony said:
Keith Thompson wrote:

unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)

Why would "plain" char make more sense? Leaving it unspecified leaves
it up to the implementation and means that I would need to cast it to
unsigned anyway. I don't get it.

Why would you need to convert to unsigned?

Because the values get passed to tolower which expects either EOF or a
positive value. Although in this case all the characters are in the
basic character set and so guaranteed to be positive whether char is the
same as signed or unsigned char, but one assumes that those are not
necessarily the strings to be used in the "real" program.

If you have a plain char value of -1, converting it to unsigned yields
UINT_MAX. Assuming UINT_MAX > UCHAR_MAX, passing this value to
tolower() invokes undefined behavior.

I suspect half the people in this discussion have quietly assumed that
"convert to unsigned" was verbal shorthand for "convert to unsigned
char", and the other half assumed it meant "convert to unsigned int".
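
The difference between the two readings can be sketched in a few lines. This is only an illustration; the helper names as_unsigned_int, as_unsigned_char and fits_tolower_range are made up for this post and are not from the thread or the standard library.

```c
#include <limits.h>

/* Hypothetical helpers, named only for illustration. */

/* Converting a negative value to unsigned int wraps modulo UINT_MAX + 1,
   so -1 becomes UINT_MAX. */
unsigned int as_unsigned_int(signed char c)
{
    return (unsigned int)c;
}

/* Converting to unsigned char wraps modulo UCHAR_MAX + 1, so -1 becomes
   UCHAR_MAX, which is then promoted to int without changing value
   (assuming UCHAR_MAX <= INT_MAX, as on common platforms). */
int as_unsigned_char(signed char c)
{
    return (unsigned char)c;
}

/* Nonzero if v lies in the 0..UCHAR_MAX range that tolower() accepts
   (EOF aside). */
int fits_tolower_range(unsigned int v)
{
    return v <= UCHAR_MAX;
}
```

With these, as_unsigned_int(-1) falls outside the range whenever UINT_MAX > UCHAR_MAX, while as_unsigned_char(-1) always falls inside it.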
 
Keith Thompson

Anthony Fremont said:
What I find to be most baffling about this is that tolower(),
isupper() and the whole lot of them expect an int argument and
return an int. How casting the argument to unsigned char (but not
actually using unsigned char for the storage) will help, I guess I'm
too dumb to understand.

In general, the way to pass an argument, or obtain a result, that can
be either a character or the value EOF is to use a value of type int,
containing *either* a value within the range of unsigned char *or* the
value EOF (which is always negative, and typically -1). Since plain
char can be signed, -1 could be a valid character, so passing a plain
char value wouldn't let tolower() distinguish between that value and
EOF. The same scheme is used for the return value of fgetc(), getc(),
and getchar().

It's ugly but it works.
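
A minimal sketch of the usual idiom, assuming the strings are stored as plain char: convert each element to unsigned char at the point of the call, so tolower() only ever sees a value in the range 0..UCHAR_MAX. The function name lower_in_place is hypothetical and used only for this example.

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a NUL-terminated string in place. */
void lower_in_place(char *s)
{
    for (; *s != '\0'; ++s) {
        /* Convert to unsigned char first so that a negative plain char
           never reaches tolower() as an out-of-range int. */
        *s = (char)tolower((unsigned char)*s);
    }
}
```

The storage stays plain char, so the same array can still be handed to the <string.h> functions without any conversion.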
 
Ian Collins

Keith said:
James Kuyper said:
Anthony said:
Keith Thompson wrote:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified
leaves it up to the implementation and means that I would need to
cast it to unsigned anyway. I don't get it.
Why would you need to convert to unsigned?

I think he meant to say that he would need to cast it to unsigned
char, which is advisable when passing a char value to tolower().
Why? The parameter type is int, so casting char to unsigned char is
pointless.
 
Ian Collins

Ian said:
Keith said:
James Kuyper said:
Anthony Fremont wrote:
Keith Thompson wrote:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified
leaves it up to the implementation and means that I would need to
cast it to unsigned anyway. I don't get it.
Why would you need to convert to unsigned?
I think he meant to say that he would need to cast it to unsigned
char, which is advisable when passing a char value to tolower().
Why? The parameter type is int, so casting char to unsigned char is
pointless.
Never mind, I forgot sign extension.
 
James Kuyper

Flash said:
James Kuyper wrote, On 21/12/08 18:49:
Anthony said:
Keith Thompson wrote:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified leaves
it up to the implementation and means that I would need to cast it to
unsigned anyway. I don't get it.
Why would you need to convert to unsigned?

I sent out a 'cancel' notice shortly after sending that message.
Unfortunately, many news servers ignore 'cancel' notices.
 
James Kuyper

Anthony said:
K&R doesn't seem to be so unsure.

The C standard is quite certain about its agnosticism: "If the program
attempts to modify such an array, the behavior is undefined." The
attempt to modify the array might succeed, in which case it's
non-constant. It might fail, in which case it's constant. It might cause
your program to abort, rendering the question moot. All of these options
(and many more) are allowed by the phrase "the behavior is undefined".

Since this is the same thing the standard says about attempting to
modify an array of char defined to be 'const', the practical effect is
the same, as far as the C standard is concerned.

....
One thing is for certain. The index of my K&R2 book has this to say about
it:
"string literal see string constant"
Then when you go to page 194 and read A2.6 String Literals (odd huh), you
find the first sentence says:
"A string literal, also called a string constant......"

Without you guys to set me straight, I'd assume that "constant" meant
exactly that.

Unfortunately, it does not - that's an example of how standardese can
lead you astray. There are real implementations where string "constants"
have been writable, and there's a moderately large amount of code
written that actually depends upon that fact. It's stupid code, in my
opinion, but my opinion doesn't make that code any less real. Such
implementations can be fully conforming. Such code can also be
conforming, though it is certainly not strictly conforming.
 
Ian Collins

Anthony said:
Ian Collins said:
Ian said:
Keith Thompson wrote:
Anthony Fremont wrote:
Keith Thompson wrote:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified
leaves it up to the implementation and means that I would need to
cast it to unsigned anyway. I don't get it.
Why would you need to convert to unsigned?
I think he meant to say that he would need to cast it to unsigned
char, which is advisable when passing a char value to tolower().

Why? The parameter type is int, so casting char to unsigned char is
pointless.
Never mind, I forgot sign extension.

Which is the whole reason I tend to just specify unsigned char for most
things I do. I don't want it occurring most of the time.
Which causes problems with the other set of library functions that use
(const) char* for their parameters!
 
Ben Bacarisse

Anthony Fremont said:
Ian Collins said:
Ian said:
Keith Thompson wrote:
Anthony Fremont wrote:
Keith Thompson wrote:
unsigned char Ptr[SIZE][MAXSTRLEN] =
{"HeArd","DiaMoNd","SprAde","AbC"}; [snip]
The disadvantage is that it allocates the maximum size for each
string. (And plain char would make more sense here than unsigned
char.)
Why would "plain" char make more sense? Leaving it unspecified
leaves it up to the implementation and means that I would need to
cast it to unsigned anyway. I don't get it.
Why would you need to convert to unsigned?
I think he meant to say that he would need to cast it to unsigned
char, which is advisable when passing a char value to tolower().

Why? The parameter type is int, so casting char to unsigned char is
pointless.
Never mind, I forgot sign extension.

Which is the whole reason I tend to just specify unsigned char for most
things I do. I don't want it occurring most of the time.

In case people are confused by this I think it is worth pointing out
that you are taking considerable licence using the term "sign
extension" like this (and depending on what he meant by having
forgotten about it, Ian might be as well).

You can't stop it happening in C by using (or casting to) unsigned
char. The term applies equally to extending a zero sign bit as it
does to a non-zero sign bit. By choosing to use an unsigned char you
control the sign extension (by controlling the sign) but you don't
stop it happening. At least that is how I look at it.

Of course, all this is a little woolly in many cases, because values
like -1 don't have a sign bit to extend. You only get a sign bit when
you put a value into an object and interpret that object as a signed
integer, but in the cases where the expression makes sense it applies
to both signs.
 
Ian Collins

Anthony said:
Please elaborate on some actual problems.

Compile this and count the warnings!

#include <stdio.h>
#include <strings.h>

int
main(void)
{
const size_t bufferSize = 64;

unsigned char buffer[bufferSize];

unsigned char one[] = "hello ";
unsigned char two[] = "world";

strcpy( buffer, one );

unsigned char* three = strcat( buffer, two );
}
 
Ian Collins

Richard said:
Ian Collins said:

Compile this and count the warnings!

#include <stdio.h>
#include <strings.h>

int
main(void)
{
const size_t bufferSize = 64;

unsigned char buffer[bufferSize];

unsigned char one[] = "hello ";
unsigned char two[] = "world";

strcpy( buffer, one );

unsigned char* three = strcat( buffer, two );
}

Just five (two of which can be ignored because they are legal C99):

foo.c:9: warning: ANSI C forbids variable-size array `buffer'
foo.c:14: warning: implicit declaration of function `strcpy'
foo.c:16: parse error before `unsigned'
foo.c:12: warning: unused variable `two'
foo.c:17: warning: control reaches end of non-void function

I think your program would have carried more weight if you'd got the
header right. :)
Ah well, it makes no difference on my platform...

"/tmp/x.c", line 14: warning: argument #1 is incompatible with prototype:
prototype: restrict pointer to char : "/usr/include/iso/string_iso.h",
line 83
argument : pointer to unsigned char
"/tmp/x.c", line 14: warning: argument #2 is incompatible with prototype:
prototype: restrict pointer to const char :
"/usr/include/iso/string_iso.h", line 83
argument : pointer to unsigned char
"/tmp/x.c", line 16: warning: argument #1 is incompatible with prototype:
prototype: restrict pointer to char : "/usr/include/iso/string_iso.h",
line 81
argument : pointer to unsigned char
"/tmp/x.c", line 16: warning: argument #2 is incompatible with prototype:
prototype: restrict pointer to const char :
"/usr/include/iso/string_iso.h", line 81
argument : pointer to unsigned char
"/tmp/x.c", line 16: warning: assignment type mismatch:
pointer to unsigned char "=" pointer to char
 
Richard Tobin

You can't stop it happening in C by using (or casting to) unsigned
char. The term applies equally to extending a zero sign bit as it
does to a non-zero sign bit.

That last sentence is certainly true...
By choosing to use an unsigned char you
control the sign extension (by controlling the sign) but you don't
stop it happening.

But this is an odd way of looking at it. An unsigned value has no
sign - as the term suggests - so no sign extension happens. The high
order bit of a twos-complement number represents the sign, and that
bit is "extended" into all the new high-order bits when the value is
converted to a longer format. The high-order bit of an unsigned
number does not represent the sign, and the new high-order bits
are set to zero - no bit is "extended".

-- Richard
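
Whichever terminology one prefers, the observable effect in C is that conversion to int preserves the value. A small sketch, with helper names (widen_signed, widen_unsigned) invented for this illustration:

```c
/* Hypothetical helpers showing the two widening conversions. */

/* A signed char holding -1 converts to the int -1; on two's-complement
   hardware this is typically implemented by copying the sign bit into
   the new high-order bits. */
int widen_signed(signed char c)
{
    return c;
}

/* An unsigned char holding 255 converts to the int 255; the new
   high-order bits are zero. */
int widen_unsigned(unsigned char c)
{
    return c;
}
```

So the same bit pattern (all ones in an 8-bit byte) arrives at tolower() as -1 or as 255 depending on which type it was read through, and only the second is a valid argument.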
 
Keith Thompson

Anthony Fremont said:
K&R doesn't seem to be so unsure.


One thing is for certain. The index of my K&R2 book has this to say about
it:
"string literal see string constant"
Then when you go to page 194 and read A2.6 String Literals (odd huh), you
find the first sentence says:
"A string literal, also called a string constant......"

Without you guys to set me straight, I'd assume that "constant" meant
exactly that.

The term "constant" in C is really a syntactic term. For example,
42 is an integer constant, 'x' is a character constant, 1.2 is a
floating-point constant, and (2 + 2) is a constant expression.
(Note that "constant" and "const" mean two quite different things.)

It's reasonable to think of a string literal as belonging to the same
syntactic category as integer, character, and floating-point
constants, namely "constants" (however the standard, unlike K&R,
doesn't refer to string literals as string constants).

What Jack Klein meant by "Perhaps they are constant, perhaps they are
not." was something different. An implementation may or may not
store the character array specified by a string literal in a way that
prevents it from being modified. Attempting to modify such an array
is undefined behavior, not because the array is "const" (it isn't, for
historical reasons), but because the standard says so -- and the intent of
the standard saying so is to allow implementations to store such
arrays in read-only memory. (In an embedded system, it might be
stored in actual ROM; in a non-embedded system, it might be stored in
a virtual memory page that the OS prevents you from modifying.)

A concrete example:

int main(void)
{
char *s = "hello";
s[0] = 'J';
return 0;
}

The assignment to s[0] might attempt to modify read-only memory, with
the likely result of crashing the program (I got a segmentation
fault). Or it might silently fail to do anything, leaving s pointing
to the string "hello". Or it might actually modify the array, leaving
s pointing to the string "Jello". Or it might make demons fly out of
your nose (that's not really possible, but if it happened it wouldn't
make the implementation non-conforming).
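
For contrast, a sketch of the well-defined way to get a modifiable string: initialize an array from the literal rather than pointing at it. The array is an ordinary object holding a copy of the characters, so writing to it is fine. The function name make_jello is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: writes "Jello" into out, which must have room
   for at least 6 bytes. */
void make_jello(char out[6])
{
    char s[] = "hello";   /* array initialized from the literal: a writable copy */

    s[0] = 'J';           /* modifies the array, not the literal */
    memcpy(out, s, sizeof s);
}
```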
 
Flash Gordon

Anthony Fremont wrote, On 22/12/08 13:14:
Others have pointed out (I think) the above should be string.h
int
main(void)
{
const size_t bufferSize = 64;

unsigned char buffer[bufferSize];

unsigned char one[] = "hello ";
unsigned char two[] = "world";

strcpy( buffer, one );

unsigned char* three = strcat( buffer, two );
}

So by "problems" you actually mean warnings.

One compiler's warning is another compiler's error. The C standard only
refers to "diagnostic messages" and leaves it up to authors of
implementations whether they produce fatal errors or not. The above code
*requires* at least one diagnostic (if you replace strings.h with
string.h), which is also all that is required if you feed this post to a
C compiler.
The code I posted had no
warnings. Don't think that I always use unsigned char and ignore warnings,
because that is not how I do things. I try to get rid of all warnings
without using casts. No casts and no warnings, yet my code is wrong.....

So how are you going to get rid of the required diagnostics in the above
code without casts whilst still using unsigned char instead of char and
without making your compiler non-conforming?

Or without changing the type or using a cast get rid of the required
diagnostic for this far simpler program...

int main(void)
{
unsigned char * var = "Fred";
return 0;
}
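
For comparison, a sketch of the earlier strcpy/strcat example with the type changed to plain char, as suggested earlier in the thread, and with <string.h> as the header. With that change the pointer types match the prototypes and no diagnostic is required. The function name join_words is made up for this illustration.

```c
#include <string.h>

/* Hypothetical helper: builds "hello world" in out, which must have
   room for at least 12 bytes. Returns out. */
char *join_words(char *out)
{
    char one[] = "hello ";
    char two[] = "world";

    strcpy(out, one);          /* char * matches the prototype exactly */
    return strcat(out, two);   /* returns its first argument */
}
```

Any conversion to unsigned char can then be confined to the calls into <ctype.h>.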
 
CBFalconer

pete said:
That was my suspicion.

This is to correct the misinformation from Richard (who is PLONKED
here). The following is the C99 description of tolower. Note that
it does nothing if c is not upper on entry.

7.4.2.1 The tolower function

Synopsis
[#1]
#include <ctype.h>
int tolower(int c);

Description

[#2] The tolower function converts an uppercase letter to a
corresponding lowercase letter.

Returns

[#3] If the argument is a character for which isupper is
true and there are one or more corresponding characters, as
specified by the current locale, for which islower is true,
the tolower function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.
 
CBFalconer

Flash said:
James Kuyper wrote, On 21/12/08 18:49:

Because the values get passed to tolower which expects either EOF
or a positive value. Although in this case all the characters are
in the basic character set and so guaranteed to be positive whether
char is the same as signed or unsigned char, but one assumes that
those are not necessarily the strings to be used in the "real"
program.

From C99:

7.4 Character handling <ctype.h>

[#1] The header <ctype.h> declares several functions useful
for testing and mapping characters.155) In all cases the
argument is an int, the value of which shall be
representable as an unsigned char or shall equal the value
of the macro EOF. If the argument has any other value, the
behavior is undefined.
 
CBFalconer

James said:
.... snip ...

I sent out a 'cancel' notice shortly after sending that message.
Unfortunately, many news servers ignore 'cancel' notices.

Thus preventing idiotic script kiddies from destroying Usenet.
 
Harald van Dijk

The following is the C99 description of tolower. Note that it does
nothing if c is not upper on entry.

Yes, it does. tolower returns c if c is not an uppercase letter. I think
that may have been Richard's point.
 
jameskuyper

CBFalconer said:
This is to correct the misinformation from Richard (who is PLONKED
here). The following is the C99 description of tolower. Note that
it does nothing if c is not upper on entry.

According to the text that you yourself cite:
[#3] If the argument is a character for which isupper is
true and there are one or more corresponding characters, as
specified by the current locale, for which islower is true,
the tolower function returns one of the corresponding
characters (always the same one for any given locale);
otherwise, the argument is returned unchanged.

So if isupper(c) is false, tolower(c) doesn't "do nothing"; it
returns the value of c.
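
A small sketch of that behaviour, assuming the default "C" locale. The wrapper name lower_char is hypothetical; it only adds the unsigned char conversion discussed earlier in the thread.

```c
#include <ctype.h>

/* Hypothetical wrapper: applies the unsigned char conversion, then
   returns whatever tolower() returns. */
int lower_char(char c)
{
    return tolower((unsigned char)c);
}
```

In the "C" locale, lower_char('A') returns 'a', while lower_char('a') and lower_char('7') return their arguments unchanged.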
 
