Writing through an unsigned char pointer

Discussion in 'C Programming' started by Noob, Apr 11, 2013.

  1. Noob

    Noob Guest

    Hello,

    Is the following code valid:

    static void foo(unsigned char *buf)
    {
    int i;
    for (i = 0; i < 16; ++i) buf[i] = i;
    }

    void bar(void)
    {
    unsigned long arr[4];
    foo(arr);
    }

    The compiler points out that (unsigned long *) is not
    compatible with (unsigned char *).

    So I cast to the expected type:

    void bar(void)
    {
    unsigned long arr[4];
    foo((unsigned char *)arr);
    }

    I think it is allowed to cast to (unsigned char *)
    but I don't remember if it's allowed only to inspect
    (read) the values, or also to set them. Also the fact
    the "real" type is unsigned means there are no trap
    representations, right?

    Regards.
    Noob, Apr 11, 2013
    #1

  2. Noob

    Nobody Guest

    On Thu, 11 Apr 2013 15:52:22 +0200, Noob wrote:

    > I think it is allowed to cast to (unsigned char *)
    > but I don't remember if it's allowed only to inspect
    > (read) the values, or also to set them.


    It's allowed, but it isn't specified what the result will be.

    An unsigned long can be any number of bytes (so long as it's at least 32
    bits, which isn't necessarily the same thing as 4 bytes); on the most
    common platforms, it will be either 4 bytes or 8 bytes. The byte order can
    be big-endian, little-endian, VAX-endian, or something else.
    Nobody, Apr 11, 2013
    #2

  3. Noob

    James Kuyper Guest

    On 04/11/2013 09:52 AM, Noob wrote:
    > Hello,
    >
    > Is the following code valid:
    >
    > static void foo(unsigned char *buf)
    > {
    > int i;
    > for (i = 0; i < 16; ++i) buf[i] = i;
    > }
    >
    > void bar(void)
    > {
    > unsigned long arr[4];
    > foo(arr);
    > }


    This code assumes that sizeof(arr) == 16, or equivalently,
    sizeof(long)==4. You should either make the behavior of foo() depend
    upon sizeof(long), or at least put in assert(sizeof(long)==4).

    > The compiler points out that (unsigned long *) is not
    > compatible with (unsigned char *).
    >
    > So I cast to the expected type:
    >
    > void bar(void)
    > {
    > unsigned long arr[4];
    > foo((unsigned char *)arr);
    > }
    >
    > I think it is allowed to cast to (unsigned char *)
    > but I don't remember if it's allowed only to inspect
    > (read) the values, or also to set them. ...


    It's allowed, for both purposes.

    > ... Also the fact
    > the "real" type is unsigned means there are no trap
    > representations, right?


    Footnote 53 of n1570.pdf says, with respect to unsigned integer types,
    that "Some combinations of padding bits might generate trap
    representations." Unsigned char isn't allowed to have any padding bits,
    but unsigned long certainly can.
    --
    James Kuyper
    James Kuyper, Apr 11, 2013
    #3
  4. Noob

    Jorgen Grahn Guest

    On Thu, 2013-04-11, Noob wrote:
    > Hello,
    >
    > Is the following code valid:
    >
    > static void foo(unsigned char *buf)
    > {
    > int i;
    > for (i = 0; i < 16; ++i) buf[i] = i;
    > }
    >
    > void bar(void)
    > {
    > unsigned long arr[4];
    > foo(arr);
    > }
    >
    > The compiler points out that (unsigned long *) is not
    > compatible with (unsigned char *).
    >
    > So I cast to the expected type:
    >
    > void bar(void)
    > {
    > unsigned long arr[4];
    > foo((unsigned char *)arr);
    > }
    >
    > I think it is allowed to cast to (unsigned char *)
    > but I don't remember if it's allowed only to inspect
    > (read) the values, or also to set them. Also the fact
    > the "real" type is unsigned means there are no trap
    > representations, right?


    Don't know what the language guarantees.

    Even if I knew about trap representations and stuff, and knew a long
    is four chars on my target, it would worry me that I have no idea what
    the 16 chars look like when viewed as 4 longs. I would have
    introduced endianness issues into the program, and that's never a good
    thing -- they tend to spread.

    If I were you, at this point I'd sidestep the problem by rewriting the
    code without unusual casts. I don't think I've ever seen a problem
    which could be solved by things like the casting above, but not by
    everyday code without casts. (Ok, except for badly written third-party
    APIs, perhaps.)

    You don't show the problem you're trying to solve, so I cannot suggest
    an alternative (except for the obvious and trivial change to bar()).

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Apr 11, 2013
    #4
  5. Noob

    Tim Rentsch Guest

    Nobody <> writes:

    > On Thu, 11 Apr 2013 15:52:22 +0200, Noob wrote:
    >
    >> I think it is allowed to cast to (unsigned char *)
    >> but I don't remember if it's allowed only to inspect
    >> (read) the values, or also to set them.

    >
    > It's allowed, but it isn't specified what the result will be.
    > [snip]


    Not quite right. The Standard does specify the behavior in
    such cases, as implementation-defined. So even though the
    results won't be portable, you can find out what they will
    be.
    Tim Rentsch, Apr 12, 2013
    #5
  6. Noob

    Tim Rentsch Guest

    Jorgen Grahn <> writes:

    > On Thu, 2013-04-11, Noob wrote:
    >> Hello,
    >>
    >> Is the following code valid:
    >>
    >> static void foo(unsigned char *buf)
    >> {
    >> int i;
    >> for (i = 0; i < 16; ++i) buf[i] = i;
    >> }
    >>
    >> void bar(void)
    >> {
    >> unsigned long arr[4];
    >> foo(arr);
    >> }
    >>
    >> The compiler points out that (unsigned long *) is not
    >> compatible with (unsigned char *).
    >>
    >> So I cast to the expected type:
    >>
    >> void bar(void)
    >> {
    >> unsigned long arr[4];
    >> foo((unsigned char *)arr);
    >> }
    >>
    >> I think it is allowed to cast to (unsigned char *)
    >> but I don't remember if it's allowed only to inspect
    >> (read) the values, or also to set them. Also the fact
    >> the "real" type is unsigned means there are no trap
    >> representations, right?

    >
    > Don't know what the language guarantees.
    >
    > Even if I knew about trap representations and stuff, and knew a long
    > is four chars on my target, it would worry me that I have no idea what
    > the 16 chars look like when viewed as 4 longs. I would have
    > introduced endianness issues into the program, and that's never a good
    > thing -- they tend to spread. [snip]


    If CHAR_BIT == 8 and sizeof (long) == 4 (both of which are pretty
    likely under the circumstances, and can easily be tested statically),
    then unsigned long cannot have a trap representation, and it is easy
    to (write code that will) discover just what the representation of
    unsigned long is (and also signed long, although signed long might
    have one trap representation, which is identifiable by checking the
    value of LONG_MIN).

    It's probably true that 99 times out of 100 it's better to avoid
    using character-type access of other types. Even so, it's better to
    know what the Standard actually does require, and to convey that
    understanding to other people. Promoting a style of making decisions
    out of uncertainty, where there is no need for that uncertainty, is a
    bad habit to instill in people.
    Tim Rentsch, Apr 12, 2013
    #6
  7. Noob

    Jorgen Grahn Guest

    On Thu, 2013-04-11, Tim Rentsch wrote:
    > Jorgen Grahn <> writes:
    >
    >> On Thu, 2013-04-11, Noob wrote:
    >>> Hello,
    >>>
    >>> Is the following code valid:
    >>>
    >>> static void foo(unsigned char *buf)
    >>> {
    >>> int i;
    >>> for (i = 0; i < 16; ++i) buf[i] = i;
    >>> }
    >>>
    >>> void bar(void)
    >>> {
    >>> unsigned long arr[4];
    >>> foo(arr);
    >>> }
    >>>
    >>> The compiler points out that (unsigned long *) is not
    >>> compatible with (unsigned char *).
    >>>
    >>> So I cast to the expected type:
    >>>
    >>> void bar(void)
    >>> {
    >>> unsigned long arr[4];
    >>> foo((unsigned char *)arr);
    >>> }
    >>>
    >>> I think it is allowed to cast to (unsigned char *)
    >>> but I don't remember if it's allowed only to inspect
    >>> (read) the values, or also to set them. Also the fact
    >>> the "real" type is unsigned means there are no trap
    >>> representations, right?

    >>
    >> Don't know what the language guarantees.
    >>
    >> Even if I knew about trap representations and stuff, and knew a long
    >> is four chars on my target, it would worry me that I have no idea what
    >> the 16 chars look like when viewed as 4 longs. I would have
    >> introduced endianness issues into the program, and that's never a good
    >> thing -- they tend to spread. [snip]

    >

    ....
    > It's probably true that 99 times out of 100 it's better to avoid
    > using character-type access of other types. Even so, it's better to
    > know what the Standard actually does require, and to convey that
    > understanding to other people. Promoting a style of making decisions
    > out of uncertainty, where there is no need for that uncertainty, is a
    > bad habit to instill in people.


    Are you saying I promote such a style?

    Uncertainty is not the reason I stay away from weird casts.
    But yes, a side effect is that I don't have to waste energy trying
    to find out what they mean, in relation to the language and in
    relation to my compiler/environment.

    I do not want to forbid anyone from discussing casts, trap
    representations and UB in this thread. But someone /also/ needed to
    point out the obvious: that there are easy, portable and readable
    alternatives.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Apr 12, 2013
    #7
  8. On Thursday, April 11, 2013 2:52:22 PM UTC+1, Noob wrote:
    >
    > Is the following code valid:
    >
    > static void foo(unsigned char *buf)
    > {
    > int i;
    >
    > for (i = 0; i < 16; ++i) buf[i] = i;
    > }
    >
    > void bar(void)
    > {
    > unsigned long arr[4];
    >
    > foo(arr);
    >
    > }
    >

    It's allowed to cast any block of memory to unsigned char *, and to
    treat it as a sequence of raw bytes or bits. The chars cannot trap, and
    the cast silences the compiler warning. (Passing through void * is a
    slightly more fiddly way of achieving the same thing.)

    However going the other way, from a sequence of bytes to a higher-level
    structure like a long, is a bit problematic. On any platform you are likely
    to run on, longs will be two's complement. But they might be big-endian
    or little-endian, and occasionally they might not be four bytes.
    Technically the sequence 0 1 2 3 could be a trap representation for a
    long, but that's so highly unlikely you can ignore the issue.

    So you have to know the binary representation of a long. The only portable
    way is to copy from one to another.
    --
    Basic Algorithms. A massive compendium of C routines.
    http://www.malcolmmclean.site11.com/www
    Malcolm McLean, Apr 12, 2013
    #8
  9. Noob

    Noob Guest

    Jorgen Grahn wrote:
    > On Thu, 2013-04-11, Noob wrote:
    >> Hello,
    >>
    >> Is the following code valid:
    >>
    >> static void foo(unsigned char *buf)
    >> {
    >> int i;
    >> for (i = 0; i < 16; ++i) buf[i] = i;
    >> }
    >>
    >> void bar(void)
    >> {
    >> unsigned long arr[4];
    >> foo(arr);
    >> }
    >>
    >> The compiler points out that (unsigned long *) is not
    >> compatible with (unsigned char *).
    >>
    >> So I cast to the expected type:
    >>
    >> void bar(void)
    >> {
    >> unsigned long arr[4];
    >> foo((unsigned char *)arr);
    >> }
    >>
    >> I think it is allowed to cast to (unsigned char *)
    >> but I don't remember if it's allowed only to inspect
    >> (read) the values, or also to set them. Also the fact
    >> the "real" type is unsigned means there are no trap
    >> representations, right?

    >
    > Don't know what the language guarantees.
    >
    > Even if I knew about trap representations and stuff, and knew a long
    > is four chars on my target, it would worry me that I have no idea what
    > the 16 chars look like when viewed as 4 longs. I would have
    > introduced endianness issues into the program, and that's never a good
    > thing -- they tend to spread.
    >
    > If I were you, at this point I'd sidestep the problem by rewriting the
    > code without unusual casts. I don't think I've ever seen a problem
    > which could be solved by things like the casting above, but not by
    > everyday code without casts. (Ok, except for badly written third-party
    > APIs, perhaps.)
    >
    > You don't show the problem you're trying to solve, so I cannot suggest
    > an alternative (except for the obvious and trivial change to bar()).


    I'll tell you the whole story, so you can cringe like I did!

    What I'm REALLY working with is a 128-bit AES key.

    My "sin" is using the knowledge that CHAR_BIT is 8 on the
    two platforms I work with, which is why I'm implicitly
    using an unsigned char buf[16].

    HOWEVER, on one of the two platforms, the geniuses who
    implemented the API thought it would be a good idea to
    pass the key in an array of 4 uint32_t o_O

    Thus what's missing from my stripped-down example is:

    extern void nasty_API_func(uint32_t *key);

    static void foo(unsigned char *buf)
    {
    int i;
    /* not the actual steps to populate buf */
    for (i = 0; i < 16; ++i) buf[i] = i;
    }

    void bar(void)
    {
    unsigned long key[4];
    foo((unsigned char *)key);
    nasty_API_func(key);
    }

    I don't think I have much choice than to cast (or use
    an implicit conversion to void *) given the constraints
    of the API, do I?

    Regards.
    Noob, Apr 12, 2013
    #9
  10. Noob

    James Kuyper Guest

    On 04/12/2013 07:30 AM, Noob wrote:
    ....
    > What I'm REALLY working with is a 128-bit AES key.
    >
    > My "sin" is using the knowledge that CHAR_BIT is 8 on the
    > two platforms I work with, which is why I'm implicitly
    > using an unsigned char buf[16].


    That's not too deadly a sin; it's an accurate assumption on a great many
    platforms, but other values do exist: 16 is a popular value on some
    DSPs. At least once, somewhere in any of your code that makes that
    assumption, you should do something like

    #if CHAR_BIT != 8
    #error This code requires CHAR_BIT == 8
    #endif

    > HOWEVER, on one of the two platforms, the geniuses who
    > implemented the API thought it would be a good idea to
    > pass the key in an array of 4 uint32_t o_O
    >
    > Thus what's missing from my stripped-down example is:
    >
    > extern void nasty_API_func(uint32_t *key);
    >
    > static void foo(unsigned char *buf)
    > {
    > int i;
    > /* not the actual steps to populate buf */
    > for (i = 0; i < 16; ++i) buf[i] = i;
    > }
    >
    > void bar(void)
    > {
    > unsigned long key[4];
    > foo((unsigned char *)key);
    > nasty_API_func(key);
    > }
    >
    > I don't think I have much choice than to cast (or use
    > an implicit conversion to void *) given the constraints
    > of the API, do I?


    Well, at the very least you should change "unsigned long" to uint32_t.

    Secondly, you might need to worry about the endianness of uint32_t.
    Your code might set key[0] to 0x1020304 or 0x4030201 (among other
    possibilities); do you know which of those two values the
    nasty_API_func() should be receiving? If the API is supported only on
    platforms with a single endianness, you can get away with building that
    knowledge into foo(). However, if the API is defined for platforms with
    different endiannesses, and requires that key[0] have the value
    0x1020304, regardless of which endianness uint32_t has, you'll have to
    fill in key[] using << and | rather than by accessing it as an array of
    char.
    --
    James Kuyper
    James Kuyper, Apr 12, 2013
    #10
  11. Noob

    James Kuyper Guest

    On 04/12/2013 08:26 AM, James Kuyper wrote:
    > On 04/12/2013 07:30 AM, Noob wrote:

    ....
    >> static void foo(unsigned char *buf)
    >> {
    >> int i;
    >> /* not the actual steps to populate buf */
    >> for (i = 0; i < 16; ++i) buf[i] = i;
    >> }
    >>
    >> void bar(void)
    >> {
    >> unsigned long key[4];
    >> foo((unsigned char *)key);
    >> nasty_API_func(key);
    >> }

    ....
    > Your code might set key[0] to 0x1020304 or 0x4030201 (among other


    Correction: 0x00010203 or 0x03020100.
    James Kuyper, Apr 12, 2013
    #11
  12. Noob

    James Kuyper Guest

    On 04/12/2013 09:36 AM, David Brown wrote:
    > On 12/04/13 14:26, James Kuyper wrote:
    >> On 04/12/2013 07:30 AM, Noob wrote:
    >> ...
    >>> What I'm REALLY working with is a 128-bit AES key.
    >>>
    >>> My "sin" is using the knowledge that CHAR_BIT is 8 on the
    >>> two platforms I work with, which is why I'm implicitly
    >>> using an unsigned char buf[16].

    >>
    >> That's not too deadly a sin, it's an accurate assumption on a great many
    >> platforms, but other values do exist: 16 is a popular value on some
    >> DSPs. At least once, somewhere with any of your code that makes that
    >> assumption, you should do something like
    >>
    >> #if CHAR_BIT != 8
    >> #error This code requires CHAR_BIT == 8
    >> #endif
    >>

    >
    > I would say that unless you are writing exceptionally cross-platform
    > code, just assume CHAR_BIT is 8. It is true that there are a few cpus
    > with 16-bit, 32-bit, or even 24-bit "characters", but if you are not
    > actually using them, then your life will be easier - and your code
    > clearer ...


    I can't see my suggestion as something that impairs the clarity of the
    code. Rather the opposite, IMO.

    > ... and neater, and therefore better - if you forget about them.
    > When the day comes that you have to write code for a TMS320 with 16-bit
    > "char", you will find that you have so many other problems trying to use
    > existing code that you will be better off using code written /only/ for
    > these devices. So you lose nothing by assuming CHAR_BIT is 8.
    >
    > Portability is always a useful thing, but it is not the most important
    > aspect of writing good code - taken to extremes it leads to messy and
    > unclear code (which is always a bad thing), and often inefficient code.
    >
    > In my work - which is programming on a wide variety of embedded systems
    > - code always implicitly assumes CHAR_BIT is 8, all integer types are
    > plain powers-of-two sizes with two's complement for signed types, there
    > are no "trap bits", etc. Since there are no realistic platforms where
    > this is not true, worrying about them is a waste of time. (There are
    > some platforms that support additional types such as 20-bit or 40-bit
    > types - but these are always handled explicitly if they are used.) ...


    I attach greater importance to portability than you do. I do worry
    mainly about "realistic" platforms, but I also write my code to cope, to
    the extent possible, with unrealistic but conforming possibilities.
    Every portability issue that you mention has been important in the past,
    or the standard would not have been written to accommodate such systems.
    It's a fair bet that, even if they aren't important right now, one or
    more of those issues will become important again in the future. I don't
    want my programs to be the ones that fail because of such changes.

    > ... I
    > can't assume endianness for general code, and I can't assume integer
    > sizes - so <stdint.h> sizes are essential (as you suggest below).


    I didn't suggest using uint32_t because it has a standardized size, but
    simply because the relevant third-party function declaration uses it:
    ....
    >>> extern void nasty_API_func(uint32_t *key);

    ....
    >>> void bar(void)
    >>> {
    >>> unsigned long key[4];
    >>> foo((unsigned char *)key);
    >>> nasty_API_func(key);
    >>> }


    My first rule for choosing the type of a variable is that it should be
    compatible with the API of the standard library or third-party functions
    it will be used with, if possible. Change "should" to "must" and drop ",
    if possible", if the variable is being passed to that function via a
    pointer. For my own functions, I declare the function to match the data,
    rather than vice-versa.
    James Kuyper, Apr 12, 2013
    #12
  13. Noob

    ImpalerCore Guest

    On Apr 12, 7:30 am, Noob <r...@127.0.0.1> wrote:
    > Jorgen Grahn wrote:
    > > On Thu, 2013-04-11, Noob wrote:
    > >> Hello,

    >
    > >> Is the following code valid:

    >
    > >> static void foo(unsigned char *buf)
    > >> {
    > >>   int i;
    > >>   for (i = 0; i < 16; ++i) buf[i] = i;
    > >> }

    >
    > >> void bar(void)
    > >> {
    > >>   unsigned long arr[4];
    > >>   foo(arr);
    > >> }

    >
    > >> The compiler points out that (unsigned long *) is not
    > >> compatible with (unsigned char *).

    >
    > >> So I cast to the expected type:

    >
    > >> void bar(void)
    > >> {
    > >>   unsigned long arr[4];
    > >>   foo((unsigned char *)arr);
    > >> }

    >
    > >> I think it is allowed to cast to (unsigned char *)
    > >> but I don't remember if it's allowed only to inspect
    > >> (read) the values, or also to set them. Also the fact
    > >> the "real" type is unsigned means there are no trap
    > >> representations, right?

    >
    > > Don't know what the language guarantees.

    >
    > > Even if I knew about trap representations and stuff, and knew a long
    > > is four chars on my target, it would worry me that I have no idea what
    > > the 16 chars look like when viewed as 4 longs.  I would have
    > > introduced endianness issues into the program, and that's never a good
    > > thing -- they tend to spread.

    >
    > > If I were you, at this point I'd sidestep the problem by rewriting the
    > > code without unusual casts.  I don't think I've ever seen a problem
    > > which could be solved by things like the casting above, but not by
    > > everyday code without casts. (Ok, except for badly written third-party
    > > APIs, perhaps.)

    >
    > > You don't show the problem you're trying to solve, so I cannot suggest
    > > an alternative (except for the obvious and trivial change to bar()).

    >
    > I'll tell you the whole story, so you can cringe like I did!
    >
    > What I'm REALLY working with is a 128-bit AES key.
    >
    > My "sin" is using the knowledge that CHAR_BIT is 8 on the
    > two platforms I work with, which is why I'm implicitly
    > using an unsigned char buf[16].


    If you really want 8-bit characters, why not use uint8_t? Any system
    that doesn't support that type should give you a heads-up with a
    compiler error.

    > HOWEVER, on one of the two platforms, the geniuses who
    > implemented the API thought it would be a good idea to
    > pass the key in an array of 4 uint32_t o_O
    >
    > Thus what's missing from my stripped-down example is:
    >
    > extern void nasty_API_func(uint32_t *key);
    >
    > static void foo(unsigned char *buf)
    > {
    >   int i;
    >   /* not the actual steps to populate buf */
    >   for (i = 0; i < 16; ++i) buf[i] = i;
    >
    > }
    >
    > void bar(void)
    > {
    >   unsigned long key[4];
    >   foo((unsigned char *)key);
    >   nasty_API_func(key);
    >
    > }
    >
    > I don't think I have much choice than to cast (or use
    > an implicit conversion to void *) given the constraints
    > of the API, do I?


    The most portable method is to simply memcpy the relevant portions
    from the uint8_t[16] array to uint32_t[4].

    \code
    void bar(void)
    {
        uint8_t buf[16];
        uint32_t key[4];
        foo(buf);

        memcpy(&key[0], &buf[0],  4);   /* bytes buf[0-3]   -> key[0] */
        memcpy(&key[1], &buf[4],  4);   /* bytes buf[4-7]   -> key[1] */
        memcpy(&key[2], &buf[8],  4);   /* bytes buf[8-11]  -> key[2] */
        memcpy(&key[3], &buf[12], 4);   /* bytes buf[12-15] -> key[3] */

        nasty_API_func(key);
    }
    \endcode

    I use the technique to read IEEE-754 'float' and 'double' as
    "integers" from a packet or in a file, use a 'ntohl' type function to
    deal with endian issues, and then memcpy the bytes from a 'uint32_t'
    or 'uint64_t' type into a 'float' or 'double' type.

    I know that it's not guaranteed to be portable (trap representations,
    non IEEE-754 floating point representations, etc.), but if there's a
    better methodology, I'd like to know as I need to read IEEE-754 4 and
    8 byte floating point values collected from live traffic or packet
    dumps (that I have no control over). I do verify that 'sizeof
    (uint32_t) == sizeof (float)' and similarly for 'double'.

    Best regards,
    John D.
    ImpalerCore, Apr 12, 2013
    #13
  14. Noob

    Jorgen Grahn Guest

    On Fri, 2013-04-12, Noob wrote:
    > Jorgen Grahn wrote:

    ....
    >> If I were you, at this point I'd sidestep the problem by rewriting the
    >> code without unusual casts. I don't think I've ever seen a problem
    >> which could be solved by things like the casting above, but not by
    >> everyday code without casts. (Ok, except for badly written third-party
    >> APIs, perhaps.)
    >>
    >> You don't show the problem you're trying to solve, so I cannot suggest
    >> an alternative (except for the obvious and trivial change to bar()).

    >
    > I'll tell you the whole story, so you can cringe like I did!
    >
    > What I'm REALLY working with is a 128-bit AES key.
    >
    > My "sin" is using the knowledge that CHAR_BIT is 8 on the
    > two platforms I work with, which is why I'm implicitly
    > using an unsigned char buf[16].
    >
    > HOWEVER, on one of the two platforms, the geniuses who
    > implemented the API thought it would be a good idea to
    > pass the key in an array of 4 uint32_t o_O


    Ok, that's one of the "badly written APIs" scenarios I was thinking
    of. Crypto, compression and checksumming code often seems to play
    fast and loose with the type system.

    If it's just a matter of the 128-bit key's representation, you could
    hide that part in a translation function.

    /Jorgen

    --
    // Jorgen Grahn <grahn@ Oo o. . .
    \X/ snipabacken.se> O o .
    Jorgen Grahn, Apr 13, 2013
    #14
  15. Jorgen Grahn <> wrote:
    > On Fri, 2013-04-12, Noob wrote:

    (snip)

    >> What I'm REALLY working with is a 128-bit AES key.


    (snip)
    >> HOWEVER, on one of the two platforms, the geniuses who
    >> implemented the API thought it would be a good idea to
    >> pass the key in an array of 4 uint32_t o_O


    > Ok, that's one of the "badly written APIs" scenarios I was thinking
    > of. Crypto, compression and checksumming code often seems to play
    > fast and loose with the type system.


    > If it's just a matter of the 128-bit key's representation, you could
    > hide that part in a translation function.


    Just a note: an AES key is not a 128-bit number, but
    a 128-bit string of bits. (That is, the bits don't have place
    value like integers have.)

    That doesn't mean that you don't have to worry about bit,
    byte, or word ordering, but that the question isn't the
    same as when working with integers.

    -- glen
    glen herrmannsfeldt, Apr 13, 2013
    #15
  16. Noob

    Tim Rentsch Guest

    Jorgen Grahn <> writes:

    > On Thu, 2013-04-11, Tim Rentsch wrote:
    >> Jorgen Grahn <> writes:
    >>
    >>> On Thu, 2013-04-11, Noob wrote:
    >>>> Hello,
    >>>>
    >>>> Is the following code valid:
    >>>>
    >>>> static void foo(unsigned char *buf)
    >>>> {
    >>>> int i;
    >>>> for (i = 0; i < 16; ++i) buf[i] = i;
    >>>> }
    >>>>
    >>>> void bar(void)
    >>>> {
    >>>> unsigned long arr[4];
    >>>> foo(arr);
    >>>> }
    >>>>
    >>>> The compiler points out that (unsigned long *) is not
    >>>> compatible with (unsigned char *).
    >>>>
    >>>> So I cast to the expected type:
    >>>>
    >>>> void bar(void)
    >>>> {
    >>>> unsigned long arr[4];
    >>>> foo((unsigned char *)arr);
    >>>> }
    >>>>
    >>>> I think it is allowed to cast to (unsigned char *)
    >>>> but I don't remember if it's allowed only to inspect
    >>>> (read) the values, or also to set them. Also the fact
    >>>> the "real" type is unsigned means there are no trap
    >>>> representations, right?
    >>>
    >>> Don't know what the language guarantees.
    >>>
    >>> Even if I knew about trap representations and stuff, and knew a long
    >>> is four chars on my target, it would worry me that I have no idea what
    >>> the 16 chars look like when viewed as 4 longs. I would have
    >>> introduced endianness issues into the program, and that's never a good
    >>> thing -- they tend to spread. [snip]

    >>

    > ...
    >> It's probably true that 99 times out of 100 it's better to avoid
    >> using character-type access of other types. Even so, it's better to
    >> know what the Standard actually does require, and to convey that
    >> understanding to other people. Promoting a style of making decisions
    >> out of uncertainty, where there is no need for that uncertainty, is a
    >> bad habit to instill in people.

    >
    > Are you saying I promote such a style?


    Of course I don't know what you meant to suggest, or even might
    have meant to suggest. But I do think someone reading the
    posting may come away with the impression that this advice was
    being offered, and take it to heart (even if perhaps not being
    conscious of doing so), whether you meant it that way or not.

    > Uncertainty is not the reason I stay away from weird casts.
    > But yes, a side effect is that I don't have to waste energy trying
    > to find out what they mean, in relation to the language and in
    > relation to my compiler/environment.


    As a general principle, I think saying most casts should be
    avoided is good advice to give. But that advice is good
    because in many or most cases a "suspect" cast is an indication
    that whoever wrote the code was thinking about the problem the
    wrong way -- not because casting is always dangerous, or poorly
    defined, or necessarily a poor design choice. I think it's
    important not to blur the distinction between those two lines
    of reasoning, and I think your comments are likely to be taken
    that way, even if that isn't how you meant them.

    > I do not want to forbid anyone from discussing casts, trap
    > representations and UB in this thread. But someone /also/
    > needed to point out the obvious: that there are easy,
    > portable and readable alternatives.


    Forgive me if this sounds harsh, but that seems like a pretty
    arrogant statement. You don't know what the questioner wants
    to do exactly, or why he wants to do it. Yet you presume to
    give advice assuming that you do know, even after admitting you
    don't know exactly what the rules of the language are for the
    question he's asking about. IMO the message that came across
    is very much the wrong message.

    In cases like this one, I think a better way to proceed is to
    first answer the question that was asked: "The Standard says
    that blah blah blah...". Then, second, to point out the kinds
    of problems that might come up, and how to guard against them:
    "It looks like what you're doing assumes CHAR_BIT == 8 and blah
    blah blah, which can be statically tested using blah and blah.
    Also different byte orderings might cause endianness problems,
    so you might want to check that with blah blah blah." Lastly,
    after the preceding two kinds of responses, then and only then
    give the general or generic kinds of advice: "Usually casting
    indicates some kind of deeper problem in the approach being
    used. As a general rule it's better to avoid casting, both
    because it relies on less common parts of the Standard, and
    because it tends to make programs more brittle in terms of
    depending on specific implementation choices. You might
    consider writing this instead as blah blah blah (assuming of
    course that your application is amenable to that), and see
    if the question might be avoided altogether."
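    [Editor's note: as a concrete sketch of the kind of checks described
    above -- the typedef trick and the probe function's name are invented
    here for illustration, not taken from the thread:]

    ```c
    #include <limits.h>
    #include <stdint.h>

    /* Compile-time check: the typedef'd array has a negative size,
       and so fails to compile, whenever CHAR_BIT != 8.
       (C11 code could use _Static_assert instead.) */
    typedef char check_char_bit_is_8[CHAR_BIT == 8 ? 1 : -1];

    /* Runtime byte-order probe: inspect the first byte of a known
       32-bit value through an unsigned char pointer, which is
       always permitted for reading an object's representation. */
    static int is_little_endian(void)
    {
        uint32_t probe = 1;
        return *(unsigned char *)&probe == 1;
    }
    ```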

    Again, I'm sorry if my response here comes across as too
    harsh, I don't mean it to be. I appreciate what you are
    trying to do -- it's just that how it comes across is (I
    think) not how you mean it. (And hopefully what I am trying
    to say is coming across as I mean it.)
    Tim Rentsch, Apr 14, 2013
    #16
  17. Noob

    Tim Rentsch Guest

    Noob <root@127.0.0.1> writes:

    > [asking about accessing an (unsigned long [4]) array
    > using (unsigned char *)]
    >
    > I'll tell you the whole story, so you can cringe like I did!
    >
    > What I'm REALLY working with is a 128-bit AES key.
    >
    > My "sin" is use the knowledge that CHAR_BIT is 8 on the
    > two platforms I work with, which is why I'm implicitly
    > using an unsigned char buf[16].
    >
    > HOWEVER, on one of the two platforms, the geniuses who
    > implemented the API thought it would be a good idea to
    > pass the key in an array of 4 uint32_t o_O
    >
    > Thus what's missing from my stripped-down example is:
    >
    > extern void nasty_API_func(uint32_t *key);
    >
    > static void foo(unsigned char *buf)
    > {
    > int i;
    > /* not the actual steps to populate buf */
    > for (i = 0; i < 16; ++i) buf[i] = i;
    > }
    >
    > void bar(void)
    > {
    > unsigned long key[4];
    > foo((unsigned char *)key);
    > nasty_API_func(key);
    > }
    >
    > I don't think I have much choice than to cast (or use
    > an implicit conversion to void *) given the constraints
    > of the API, do I?


    Actually you do:

    /* ... foo() as above ... */

    void
    bar( void ){
        union { uint32_t u32[4]; unsigned char uc[16]; } key;
        extern char checkit[ sizeof key.u32 == sizeof key.uc ? 1 : -1 ];
        foo( key.uc );
        nasty_API_func( key.u32 );
    }

    IMO writing bar() like this gives a better indication of what's
    going on, and why. (It also checks to make sure the different
    types used are appropriately simpatico in this implementation,
    but that's just my reflexive programming habit.)
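    [Editor's note: for comparison, a compilable sketch of the same union
    trick using C11's _Static_assert in place of the negative-array-size
    check; foo() follows the thread's loop, and nasty_API_func() is a
    do-nothing stand-in for the real API call:]

    ```c
    #include <stdint.h>

    /* The thread's buffer-filling routine, with the loop body
       writing each index value into the corresponding byte. */
    static void foo(unsigned char *buf)
    {
        int i;
        for (i = 0; i < 16; ++i) buf[i] = (unsigned char)i;
    }

    /* Stand-in for the API call from the thread. */
    static void nasty_API_func(uint32_t *key) { (void)key; }

    void bar(void)
    {
        union { uint32_t u32[4]; unsigned char uc[16]; } key;
        /* C11 spelling of the checkit[] trick above */
        _Static_assert(sizeof key.u32 == sizeof key.uc,
                       "uint32_t[4] and unsigned char[16] differ in size");
        foo(key.uc);
        nasty_API_func(key.u32);
    }
    ```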
    Tim Rentsch, Apr 15, 2013
    #17
  18. Noob

    Ike Naar Guest

    On 2013-04-12, Noob <root@127.0.0.1> wrote:
    > What I'm REALLY working with is a 128-bit AES key.
    >
    > My "sin" is use the knowledge that CHAR_BIT is 8 on the
    > two platforms I work with, which is why I'm implicitly
    > using an unsigned char buf[16].
    >
    > HOWEVER, on one of the two platforms, the geniuses who
    > implemented the API thought it would be a good idea to
    > pass the key in an array of 4 uint32_t o_O


    And the API lacks a function to initialize a key?
    Ike Naar, Apr 15, 2013
    #18
  19. Noob

    Tim Rentsch Guest

    David Brown <> writes:

    > On 12/04/13 14:26, James Kuyper wrote:
    >> On 04/12/2013 07:30 AM, Noob wrote:
    >> ...
    >>> What I'm REALLY working with is a 128-bit AES key.
    >>>
    >>> My "sin" is use the knowledge that CHAR_BIT is 8 on the
    >>> two platforms I work with, which is why I'm implicitly
    >>> using an unsigned char buf[16].

    >>
    >> That's not too deadly a sin; it's an accurate assumption on a great many
    >> platforms, but other values do exist: 16 is a popular value on some
    >> DSPs. At least once, somewhere with any of your code that makes that
    >> assumption, you should do something like
    >>
    >> #if CHAR_BIT != 8
    >> #error This code requires CHAR_BIT == 8
    >> #endif
    >>

    >


    > I would say that unless you are writing exceptionally cross-platform
    > code, just assume CHAR_BIT is 8. It is true that there are a few
    > cpus with 16-bit, 32-bit, or even 24-bit "characters", but if you
    > are not actually using them, then your life will be easier - and
    > your code clearer and neater, and therefore better - if you forget
    > about them. When the day comes that you have to write code for a
    > TMS320 with 16-bit "char", you will find that you have so many other
    > problems trying to use existing code that you will be better off
    > using code written /only/ for these devices. So you lose nothing by
    > assuming CHAR_BIT is 8.
    >
    > Portability is always a useful thing, but it is not the most important
    > aspect of writing good code - taken to extremes it leads to messy and
    > unclear code (which is always a bad thing), and often inefficient code.
    >
    > In my work - which is programming on a wide variety of embedded systems
    > - code always implicitly assumes CHAR_BIT is 8, all integer types are
    > plain powers-of-two sizes with two's complement for signed types, there
    > are no "trap bits", etc. Since there are no realistic platforms where
    > this is not true, worrying about them is a waste of time. (There are
    > some platforms that support additional types such as 20-bit or 40-bit
    > types - but these are always handled explicitly if they are used.) I
    > can't assume endianness for general code, and I can't assume integer
    > sizes - so <stdint.h> sizes are essential (as you suggest below).


    Assuming is okay. Implicitly assuming is not.

    There are two benefits to writing an explicit check:

    1. A clear indication of what condition (or assumption) is
    being violated, in the rare event that one of those unusual
    implementations is used.

    2. More importantly, a clear indication to a human reader that
    the assumption has been made, and isn't just an oversight
    or an unconscious presumption made out of ignorance.

    If you have a set of assumptions that you routinely rely on in
    code that you write, put all the checks together in a header
    file, and then just do

    #include "standard_assumptions.h"

    which provides all the benefits of making assumptions explicit,
    but at essentially zero cost after the header is first written.
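    [Editor's note: a minimal sketch of what such a header might contain;
    the macro and check names here are invented for illustration, not from
    any real project:]

    ```c
    /* standard_assumptions.h -- collect routinely relied-upon
       assumptions as compile-time checks, per Tim's suggestion. */
    #ifndef STANDARD_ASSUMPTIONS_H
    #define STANDARD_ASSUMPTIONS_H

    #include <limits.h>

    /* C90-compatible static check: the typedef'd array type has a
       negative size, a constraint violation, when cond is false. */
    #define STATIC_CHECK(cond, name) typedef char name[(cond) ? 1 : -1]

    STATIC_CHECK(CHAR_BIT == 8, assume_8_bit_char);
    STATIC_CHECK(sizeof(long) >= 4, assume_long_at_least_32_bits);
    /* -1 & 3 is 3 only in two's complement (1 in sign-magnitude,
       2 in ones' complement). */
    STATIC_CHECK((-1 & 3) == 3, assume_twos_complement);

    #endif /* STANDARD_ASSUMPTIONS_H */
    ```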
    Tim Rentsch, Apr 15, 2013
    #19
  20. Noob

    Tim Rentsch Guest

    ImpalerCore <> writes:

    > On Apr 12, 7:30 am, Noob <r...@127.0.0.1> wrote:
    >> Jorgen Grahn wrote:
    >> > On Thu, 2013-04-11, Noob wrote:
    >> >> Hello,

    >>
    >> >> Is the following code valid:

    >>
    >> >> static void foo(unsigned char *buf)
    >> >> {
    >> >> int i;
    >> >> for (i = 0; i < 16; ++i) buf[i] = i;
    >> >> }

    >>
    >> >> void bar(void)
    >> >> {
    >> >> unsigned long arr[4];
    >> >> foo(arr);
    >> >> }

    >>
    >> >> The compiler points out that (unsigned long *) is not
    >> >> compatible with (unsigned char *).

    >>
    >> >> So I cast to the expected type:

    >>
    >> >> void bar(void)
    >> >> {
    >> >> unsigned long arr[4];
    >> >> foo((unsigned char *)arr);
    >> >> }

    >>
    >> >> I think it is allowed to cast to (unsigned char *)
    >> >> but I don't remember if it's allowed only to inspect
    >> >> (read) the values, or also to set them. Also the fact
    >> >> the "real" type is unsigned means there are no trap
    >> >> representations, right?

    >>
    >> > Don't know what the language guarantees.

    >>
    >> > Even if I knew about trap representations and stuff, and knew a long
    >> > is four chars on my target, it would worry me that I have no idea what
    >> > the 16 chars look like when viewed as 4 longs. I would have
    >> > introduced endianness issues into the program, and that's never a good
    >> > thing -- they tend to spread.

    >>
    >> > If I were you, at this point I'd sidestep the problem by rewriting the
    >> > code without unusual casts. I don't think I've ever seen a problem
    >> > which could be solved by things like the casting above, but not by
    >> > everyday code without casts. (Ok, except for badly written third-party
    >> > APIs, perhaps.)

    >>
    >> > You don't show the problem you're trying to solve, so I cannot suggest
    >> > an alternative (except for the obvious and trivial change to bar()).

    >>
    >> I'll tell you the whole story, so you can cringe like I did!
    >>
    >> What I'm REALLY working with is a 128-bit AES key.
    >>
    >> My "sin" is use the knowledge that CHAR_BIT is 8 on the
    >> two platforms I work with, which is why I'm implicitly
    >> using an unsigned char buf[16].

    >
    > If you really want 8-bit characters, why not use uint8_t?
    > Any system that doesn't support that type should give you
    > a heads-up with a compiler error.


    Using unsigned char (perhaps via a typedef), and checking that
    CHAR_BIT == 8 is more portable, specifically for people who might
    be using C90 rather than C99/C11. But beyond that, there are
    some subtle reasons why uint8_t should not be used as a synonym
    for an 8-bit unsigned char:

    1. A uint8_t type isn't necessarily compatible with an 8-bit
    unsigned char. This may cause unexpected problems, e.g.,
    passing a (unsigned char *) to a function that expects a
    (uint8_t *).

    2. A uint8_t type is not necessarily a character type. This
    means accessing an area of memory using a (uint8_t *)
    could produce undefined behavior through violation of
    effective type rules, whereas using (unsigned char *)
    would not.

    These choices may seem rather far-fetched, but there is an
    incentive for implementors to adopt them: by making uint8_t
    be different from unsigned char, better aliasing information
    can be gleaned in some cases, which might allow better code to
    be generated. Existing compilers are already pushing pretty
    hard at the edges of the undefined-behavior envelope, in the
    quest for better and better performance; it isn't hard to
    imagine an implementation making uint8_t be a non-character
    integer type, if it results in better performance, or even if
    the implementors just think it _might_ result in better
    performance.
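    [Editor's note: a small sketch of why character-type access matters
    here -- reading an object's bytes through (unsigned char *) is always
    well defined, whereas the same loop written with a (uint8_t *) would
    be questionable if uint8_t were ever an extended, non-character type.
    The function name is invented for illustration:]

    ```c
    #include <stddef.h>
    #include <stdint.h>

    /* Sum an object's bytes through an unsigned char pointer.
       Character-type access to any object's representation is
       explicitly permitted by the aliasing rules (C99 6.5p7). */
    static unsigned sum_bytes(const void *obj, size_t n)
    {
        const unsigned char *p = obj;
        unsigned sum = 0;
        size_t i;
        for (i = 0; i < n; ++i)
            sum += p[i];
        return sum;
    }
    ```

    For a uint32_t holding 0x01020304 the result is 1+2+3+4 == 10 on
    either byte order, since uint32_t is required to have exactly 32
    bits and no padding.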
    Tim Rentsch, Apr 15, 2013
    #20
