when is typecasting (unsigned char*) to (char*) dangerous?

Eric Sosman

b83503104 said:
When are they not consistent?

(In the future, please make sure your entire question
appears in the body of your message. Use the Subject: header
as a synopsis of your question, but do not rely on it to
carry your message unaided.)

Putting the Subject: and the body together, we have
this question:
> when is typecasting (unsigned char*) to (char*) dangerous?
> When are they not consistent?

The conversion itself is not dangerous. However, it
could be dangerous to use the resulting `char*' to access
the pointed-to characters. In theory, at least, some bit
patterns that are valid as `unsigned char' might be invalid
when considered as `char' -- that is, there might be no
actual `char' value corresponding to the bit pattern.

This concern is mostly theoretical, and would apply to
"exotic" architectures that use signed magnitude or ones'
complement representation for negative numbers and choose to
treat `char' as a signed type. Such systems have a second
representation of zero, a "minus zero" (100...0 in binary notation
for S.M., or 111...1 for O.C.). When viewed as `unsigned char'
these bit patterns are easily distinguished from 000...0 --
but when viewed as `char', the "minus zero" is indistinguishable
from "positive zero." Even worse, the alternate forms could be
treated as "trap representations" and could cause your program
to misbehave.

Such concerns do not arise on the two's complement machines
that are prevalent nowadays. Nothing bad will happen when you
convert the `unsigned char*' to `char*', and nothing bad will
happen when you use the `char*' to inspect the bytes. (If an
expert disputes this assertion and mentions "padding bits" or
"trap representations," pay him no heed until and unless he
can exhibit a system whose `char' representation has such things.
If he says the word "DeathStar" or the abbreviation "DS," he's
just trying to scare you.)
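
If you want to know which kind of machine you are on, a rough check
is possible. The sketch below is only an illustration: the function
names are made up for this post, and the second function inspects
`int', on the assumption that `char' uses the same scheme for
negative values on the same implementation.

```c
#include <limits.h>

/* Nonzero when plain char is a signed type on this implementation. */
static int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* The low two bits of -1 differ by representation:
 *   two's complement   ...11  (3)
 *   ones' complement   ...10  (2)
 *   sign and magnitude ...01  (1)
 * This inspects int; that char behaves the same way is an assumption. */
static const char *int_representation(void)
{
    switch (-1 & 3) {
    case 3:  return "two's complement";
    case 2:  return "ones' complement";
    default: return "sign and magnitude";
    }
}
```

On the machines most people will ever meet, the second function
reports two's complement; only the answer from the first one varies.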

Nonetheless, you must still be vigilant: `char' is unsigned
on some two's complement machines and signed on others. If you
use the `char*' to fetch a `char' value and then index an array
with the fetched value, you may find yourself trying to access
`crc_table[-128]', and this is not likely to be good for your
program's prospects of forward progress. Fetch a negative `char'
value and start right-shifting it until all the 1-bits "fall off
the end," and you may find yourself in an infinite loop, because
right-shifting a negative value usually copies the sign bit back
in. Beware!
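
A minimal sketch of how to sidestep both traps, assuming the bytes
are meant to be treated as values in the range 0..UCHAR_MAX. The
names `table', `init_table', `lookup' and `count_one_bits' are
invented for illustration (`table' stands in for something like
`crc_table'), and the table contents are dummies.

```c
#include <limits.h>

/* Hypothetical lookup table standing in for crc_table; each entry
 * simply holds its own index so the effect is easy to see. */
static unsigned table[UCHAR_MAX + 1];

static void init_table(void)
{
    for (unsigned i = 0; i <= UCHAR_MAX; i++)
        table[i] = i;
}

/* Risky where char is signed: *p may be negative, so this could
 * end up reading table[-128].
 *
 *     unsigned lookup_risky(const char *p) { return table[*p]; }
 */

/* Safer: convert to unsigned char first, so the index is always
 * in the range 0..UCHAR_MAX. */
static unsigned lookup(const char *p)
{
    return table[(unsigned char)*p];
}

/* Count the 1-bits by shifting an unsigned copy. Zeros are shifted
 * in from the left, so the value must eventually reach zero. */
static int count_one_bits(char c)
{
    unsigned char u = (unsigned char)c;
    int n = 0;

    while (u != 0) {
        n += u & 1;
        u >>= 1;
    }
    return n;
}
```

The conversion to `unsigned char' costs nothing at run time on
typical machines, and it makes the code behave the same whether
plain `char' is signed or unsigned.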

There are some situations where type-punning is guaranteed
to be safe. Given a pointer to any data object, you can safely
convert that pointer to `unsigned char*' and then inspect the
individual bytes of the object. You can safely convert between
a struct pointer and a pointer to its first member, or between
a union pointer and a pointer to any of its members, or between
any data pointer at all and a `void*'. Sometimes, conversions
of this kind are essential -- but if you find yourself writing
"a lot" of them, it's probably a sign that your data structures
are not well-designed.
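
Two of the safe conversions mentioned above can be sketched as
follows. The function names and `struct packet' are hypothetical,
and the hex formatting assumes 8-bit bytes.

```c
#include <stddef.h>

/* Write the bytes of any object as lowercase hex digits into out,
 * which must have room for 2 * size + 1 characters.
 * Assumes CHAR_BIT == 8 (two hex digits per byte). */
static void hex_bytes(const void *obj, size_t size, char *out)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *p = obj;   /* void* converts implicitly */

    for (size_t i = 0; i < size; i++) {
        out[2 * i]     = digits[(p[i] >> 4) & 0x0F];
        out[2 * i + 1] = digits[p[i] & 0x0F];
    }
    out[2 * size] = '\0';
}

/* Hypothetical struct, used only to show the first-member rule. */
struct packet {
    int    id;
    double payload;
};

/* A pointer to a struct, suitably converted, points to its first
 * member, so this reads pk->id. */
static int first_member(struct packet *pk)
{
    return *(int *)pk;
}
```

For example, calling `hex_bytes' on an array holding the three bytes
0x00, 0x80, 0xff fills `out' with "0080ff", whatever the signedness
of plain `char'.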
 
