C Standard re: integral type bit representation

J

joshc

I've been programming in C for 5 years now, all through school, but I
only recently started having to worry about portability and writing C
code in conformance with the C standard. I did try to look through
various parts of the standard for answers, but I didn't find what I
was looking for.

First off, the reason for the following questions is that I want to
know whether what I'm about to describe conforms to the C99 standard.
I want to test whether two signed integers have the same 'sign', so I
am XORing them and checking whether the result is positive or zero. I
am not sure this conforms to the standard; my guess is that it does
not, because the standard doesn't define how integers are to be
stored, i.e. two's complement, etc.

I confused myself because, while the C standard is not supposed to
define the execution environment, a few things made me think. Please
read through my reasoning and help correct any flaws in my thinking.
The standard says that shifting an unsigned or positive integer left is
like a multiplication by 2. Does this mean that if a processor is not a
binary machine, an implementation for that processor will have to
somehow "fake" integers being stored as a binary number? This also got
me thinking about what bitwise operators would mean on processors that
might deal in base 10 for some odd reason or another.

I know this is long, but I know I'm missing some big picture idea here
or some fundamental concept.

Thanks.
 
I

infobahn

joshc said:
First off, the reason for the following questions is that I want to
know whether what I'm about to describe conforms to the C99 standard.
I want to test whether two signed integers have the same 'sign', so I
am XORing them and checking whether the result is positive or zero. I
am not sure this conforms to the standard; my guess is that it does
not, because the standard doesn't define how integers are to be
stored, i.e. two's complement, etc.

Actually, if the two integers are of the same type, that's fine, since
the sign bits will at least be in the same position!
I confused myself because while the C standard is not supposed to
define the execution environment, a few things made me think. Please
read through my reasoning and help correct any flaws in my thinking.
The standard says that shifting an unsigned or positive integer left is
like a multiplication by 2.

Provided you don't run off the end, that is. :)
Does this mean that if a processor is not a
binary machine, an implementation for that processor will have to
somehow "fake" integers being stored as a binary number?

Yes.
 
E

Eric Sosman

infobahn said:
Actually, if the two integers are of the same type, that's fine, since
the sign bits will at least be in the same position!

The two integers will necessarily be of the same type
when the ^ is evaluated; if they differed originally, the
"usual arithmetic conversions" (6.3.1.8) will occur.

However, it doesn't help. Consider a signed-magnitude
machine and the two values +N and -N. XOR them and you'll
get "minus zero," which is equal to zero (unless it's a trap
value, which would be even worse), so the test would decide
that additive inverses have the same sign.

It looks like the O.P. considers zero to be positive
(because that's the behavior his test would produce on a
two's-complement system, which he's probably using), so a
portable and straightforward way to write the test is

if ( (x >= 0) == (y >= 0) )
    puts ("same");
else
    puts ("different");

(If anybody cavils about the "inefficiency" of this
test I'll become angry, and you wouldn't like me when
I'm angry ...)
 
J

joshc

Eric said:
However, it doesn't help. Consider a signed-magnitude
machine and the two values +N and -N. XOR them and you'll
get "minus zero," which is equal to zero (unless it's a trap
value, which would be even worse), so the test would decide
that additive inverses have the same sign.

Great point!
It looks like the O.P. considers zero to be positive
(because that's the behavior his test would produce on a
two's-complement system, which he's probably using),

Actually, it's not that I consider it positive; basically I need to
test whether two signed integers have the same sign, and the zero case
doesn't matter to me.
so a
portable and straightforward way to write the test is

if ( (x >= 0) == (y >= 0) )
    puts ("same");
else
    puts ("different");

Perfect, that's what I tried to accomplish with comparison operators,
but my implementation was much uglier than the above.

A question that came up as a result of infobahn's answer regarding the
sign bit being in the same position: even if a certain processor uses
two's complement, there is no guarantee that an implementation for that
processor will use two's complement to store integers, right (even
though that would be pretty silly)?

Also, who dictates what the "binary representation" of an integer will
be, as it relates to how the standard references binary representations
when talking about the bitwise operators, etc.? I guess there is only
one way to store an unsigned integer, but for signed integers I guess
it's up to the implementation (compiler)?

Thanks again.
 
E

Eric Sosman

joshc said:
A question that came up as a result of infobahn's answer regarding the
sign bit being in the same position: even if a certain processor uses
two's complement, there is no guarantee that an implementation for that
processor will use two's complement to store integers, right (even
though that would be pretty silly)?

I'm not sure what you're asking here, so what follows
may be off-target ...

Two's complement is a scheme for representing values.
When you say that a processor "uses" two's complement, you
mean that the processor represents operand and result values
according to the two's complement scheme. There's no way to
escape the processor's choice of representation: if you want
the processor to compute `Z = X op Y' you must provide the X
and Y values in the representation the processor expects, and
you must receive the Z value the way the processor provides it.

Now, you need not maintain X,Y,Z in the processor's native
format at all times; you could convert back and forth between
other representations if desired. Of course, you're using
the processor to perform the conversion, so the question of
whether this does or doesn't use "the native format" becomes
a little slippery. Still, it would be conceivable to store
integers in ones' complement, say, even if the underlying CPU
performs arithmetic in two's complement.

The original "C89" Standard required that integers seem
to be stored in "pure binary notation," thus ruling out some
representations like Gray codes. C89 permitted the use of
any imaginable scheme to represent signed integers (provided
that non-negative values looked the same as their unsigned
counterparts), while the current "C99" standard limits the
choice to two's complement, ones' complement, or signed
magnitude.

But there are (at least) two gaping holes in even this
more restrictive requirement. First, the Standard does not
prescribe any particular arrangement of the bits of the "pure
binary notation:" if an `int' happens to occupy four bytes,
any particular bit you might mention -- units' bit, fours' bit,
sign bit -- might reside in any of those four bytes. Some of
the bits in some of the bytes might not even be part of the
value at all: An implementation might choose to store a 32-bit
integer in four nine-bit bytes with four "padding bits" not
participating in the value. So the task of verifying that an
implementation actually uses "pure binary representation" is
not exactly trivial.

The second gaping hole is that integers need only *seem*
to use one of the permitted representations. A purely decimal
machine could, in principle, host a C implementation -- it
would need to jump through a lot of hoops to do so, but as long
as no program could actually detect the difference, the
implementation would be conforming under something known as
the "as if rule." Illusions are perfectly legal, so long as
they're perfect.

Usually, it is a bad idea to worry about representations,
and the programmer is almost always better off thinking about
the represented values. There are exceptions, of course: if
you need to tickle the "Request To Send" bit in the control
register of some piece of hardware, you need to know what
numerical value corresponds to that single bit. You also need
to be aware of the quirks of the possible representations --
computing `-INT_MIN' might not produce a positive number, for
example, so you probably shouldn't try. But by and large, the
C programmer should attend to the values and let the computer
worry about how to represent them.
 
J

joshc

Eric said:
Now, you need not maintain X,Y,Z in the processor's native
format at all times; you could convert back and forth between
other representations if desired. Of course, you're using
the processor to perform the conversion, so the question of
whether this does or doesn't use "the native format" becomes
a little slippery. Still, it would be conceivable to store
integers in ones' complement, say, even if the underlying CPU
performs arithmetic in two's complement.

That's exactly what I was asking: if the target architecture had
instructions for two's complement addition or something, I was
wondering whether the implementation might conceivably convert from a
ones' complement representation to a two's complement representation so
as to be able to use the CPU's ALU ops. Thanks for the answer.
Usually, it is a bad idea to worry about representations,
and the programmer is almost always better off thinking about
the represented values. There are exceptions, of course: if
you need to tickle the "Request To Send" bit in the control
register of some piece of hardware, you need to know what
numerical value corresponds to that single bit. You also need
to be aware of the quirks of the possible representations --
computing `-INT_MIN' might not produce a positive number, for
example, so you probably shouldn't try. But by and large, the
C programmer should attend to the values and let the computer
worry about how to represent them.

Yeah, I was curious about this because I am an embedded programmer, so
I was trying to strike a balance between "standard," portable code and
efficient code. My computer architecture background also made me
curious.

Anyways, it seems that 6.2.6 answers all of my questions that you guys
answered about how integral types, etc. are to be represented. For some
reason my annotated '89 standard didn't have this section but my '99
standard has this section.

Thanks.
 
K

Keith Thompson

joshc said:
Anyways, it seems that 6.2.6 answers all of my questions that you guys
answered about how integral types, etc. are to be represented. For some
reason my annotated '89 standard didn't have this section but my '99
standard has this section.

Let me guess, you're using a copy of Schildt's "The Annotated ANSI C
Standard". It's a useful way to get a copy of (most of) the C90
standard, but the annotations are worse than useless.

See Clive Feather's review at <http://www.lysator.liu.se/c/schildt.html>.
(Every copy of Schildt's book should be accompanied by a printout of
this review.)
 
D

DHOLLINGSWORTH2

joshc said:
I've been programming in C for 5 years now, all through school, but I
only recently started having to worry about portability and writing C
code in conformance with the C standard. I did try to look through
various parts of the standard for answers, but I didn't find what I
was looking for.

First off, the reason for the following questions is that I want to
know whether what I'm about to describe conforms to the C99 standard.
I want to test whether two signed integers have the same 'sign', so I
am XORing them and checking whether the result is positive or zero. I
am not sure this conforms to the standard; my guess is that it does
not, because the standard doesn't define how integers are to be
stored, i.e. two's complement, etc.

I confused myself because, while the C standard is not supposed to
define the execution environment, a few things made me think. Please
read through my reasoning and help correct any flaws in my thinking.
The standard says that shifting an unsigned or positive integer left is
like a multiplication by 2. Does this mean that if a processor is not a
binary machine, an implementation for that processor will have to
somehow "fake" integers being stored as a binary number? This also got
me thinking about what bitwise operators would mean on processors that
might deal in base 10 for some odd reason or another.

I know this is long, but I know I'm missing some big picture idea here
or some fundamental concept.

Thanks.
I'm really not sure what you're asking, but an integer may be something
entirely different on one machine than it is on another. Simply put,
zero is the only consistent reference. A number has a negative sign
when it is less than zero, a positive sign when it is greater than
zero, but zero itself needs no sign. Think about your logic.

Take a few more math classes and then you can explain to us why we
don't have any base-10 computers.
 
