Unary minus and bitwise complement

Noob · Mar 19, 2013

Hello,

Considering a properly initialized unsigned int variable v,
are -v and ~(v-1) equivalent for any value of v, even on
the DS9k?

(let <-> mean "are equivalent for any value of v")

According to C89 3.3.3.3
~v <-> UINT_MAX - v

Thus
~(v-1) <-> UINT_MAX - (v-1)
~(v-1) <-> UINT_MAX - v + 1
~(v-1) <-> (UINT_MAX + 1) - v

Adding (UINT_MAX + 1) to an unsigned int variable is a NOP,
thus ~(v-1) <-> -v

I think this is true always, I don't think the "usual arithmetic
conversions" come in play in problematic ways?

Regards.

Eric Sosman · Mar 19, 2013

Hello,

Considering a properly initialized unsigned int variable v,
are -v and ~(v-1) equivalent for any value of v, even on
the DS9k?

(let <-> mean "are equivalent for any value of v")

According to C89 3.3.3.3
~v <-> UINT_MAX - v

Thus
~(v-1) <-> UINT_MAX - (v-1)
~(v-1) <-> UINT_MAX - v + 1
~(v-1) <-> (UINT_MAX + 1) - v

Adding (UINT_MAX + 1) to an unsigned int variable is a NOP,
thus ~(v-1) <-> -v

I think this is true always, I don't think the "usual arithmetic
conversions" come in play in problematic ways?

True always. (And still true in more recent Standards.)

The usual arithmetic conversions appear in the subexpression
`v-1', where they convert the `int' constant 1 to `unsigned int'
before the subtraction happens. So, no surprises there.

Neither the unary minus nor the complement operator use the
U.A.C., but only the "integral promotions." These "promote" an
`unsigned int' to, er, `unsigned int', so again: no surprises.

Bart van Ingen Schenau · Mar 19, 2013

Hello,

Considering a properly initialized unsigned int variable v, are -v and
~(v-1) equivalent for any value of v, even on the DS9k?

For integers with a conversion-rank equal to or higher than int: Yes.
For other integer types: Not guaranteed.

(let <-> mean "are equivalent for any value of v")

According to C89 3.3.3.3
~v <-> UINT_MAX - v

Thus
~(v-1) <-> UINT_MAX - (v-1)
~(v-1) <-> UINT_MAX - v + 1
~(v-1) <-> (UINT_MAX + 1) - v

Adding (UINT_MAX + 1) to an unsigned int variable is a NOP, thus ~(v-1)
<-> -v

I think this is true always, I don't think the "usual arithmetic
conversions" come in play in problematic ways?

If v has the type `unsigned char` or `unsigned short` AND if `signed int`
can represent all the values that can be stored in v, then the integer
promotions (part of the usual arithmetic conversions) do cause problems.
In that case, the unary minus and the bitwise complement operators both
operate on a signed type, so your equivalences no longer hold.

Regards.

Bart v Ingen Schenau

James Kuyper · Mar 19, 2013

Hello,

Considering a properly initialized unsigned int variable v,
are -v and ~(v-1) equivalent for any value of v, even on
the DS9k?

(let <-> mean "are equivalent for any value of v")

According to C89 3.3.3.3

That's 6.5.3.3 in the current standard.

~v <-> UINT_MAX - v

Thus
~(v-1) <-> UINT_MAX - (v-1)
~(v-1) <-> UINT_MAX - v + 1
~(v-1) <-> (UINT_MAX + 1) - v

Adding (UINT_MAX + 1) to an unsigned int variable is a NOP,
thus ~(v-1) <-> -v

I think this is true always, I don't think the "usual arithmetic
conversions" come in play in problematic ways?

Since v is unsigned int, and 1 has the type int, the usual arithmetic
conversions require that 1 be converted to unsigned int, and then
everything goes through fine.

The tricky case is an unsigned type where TYPE_MAX < INT_MAX, and v==0.
In that case, the unsigned value is converted to int, so the result is
~(-1), which has different results depending upon whether 2's
complement, 1's complement, or sign-magnitude representation is used.
The DS9k, of course, only uses 2's complement representation in it's
floating point implementation.

James Kuyper · Mar 19, 2013

True always. (And still true in more recent Standards.)

The usual arithmetic conversions appear in the subexpression
`v-1', where they convert the `int' constant 1 to `unsigned int'
before the subtraction happens. So, no surprises there.

Neither the unary minus nor the complement operator use the
U.A.C., but only the "integral promotions." These "promote" an
`unsigned int' to, er, `unsigned int', so again: no surprises.

The U.A.C come into play through the v-1 sub-expression.

Lew Pitcher · Mar 19, 2013

Hello,

Considering a properly initialized unsigned int variable v,
are -v and ~(v-1) equivalent for any value of v, even on
the DS9k?

(let <-> mean "are equivalent for any value of v")

According to C89 3.3.3.3
~v <-> UINT_MAX - v

Thus
~(v-1) <-> UINT_MAX - (v-1)
~(v-1) <-> UINT_MAX - v + 1
~(v-1) <-> (UINT_MAX + 1) - v

Adding (UINT_MAX + 1) to an unsigned int variable is a NOP,
thus ~(v-1) <-> -v

Something about this reasoning troubles me; two somethings, actually.

First off, ~(v-1) is the canonical definition of 2's complement negation of
a value v. Your analysis seems to preclude the C standard's requirement
for "pure binary" representation of unsigned integers, and sign&magnitude
and ones-complement representation for signed integers, by defining
negation as a two's-complement operation. I'll have to spend some time
thinking about this point.

Secondly, your final derivation
~(v-1) <-> -v
no longer fits within the given parameters of unsigned integers, as there is
no viable representation /as an unsigned integer/ of a negative value. If
you had stopped at
~(v-1) <-> (UINT_MAX + 1) - v
this would not be a problem, as neither side of this equation results in a
negative value.

glen herrmannsfeldt · Mar 19, 2013

James Kuyper said:
On 03/19/2013 10:49 AM, Noob wrote: (snip)
(snip)

Since v is unsigned int, and 1 has the type int, the usual arithmetic
conversions require that 1 be converted to unsigned int, and then
everything goes through fine.

The tricky case is an unsigned type where TYPE_MAX < INT_MAX, and v==0.
In that case, the unsigned value is converted to int, so the result is
~(-1), which has different results depending upon whether 2's
complement, 1's complement, or sign-magnitude representation is used.

Other than the DS9K, the sign-magnitude machines that I know about
are word addressed. They might not have any type smaller than int.

The last one I know about is the 7094, successor to the 7090 and 709.
With IBM OSs, they used six BCDIC (IBM called it BCD) characters
per 36 bit word. As I understand it, the card reader reads each
card punch row into two 36 bit words. Software then converts that
to six characters per word. Note that the last eight card colunmns
aren't read. (Possibly that was the 704, and not the 709. It is,
at least, the reason Fortran ignores columns 73-80 (in fixed-form).

Nine track tape drives didn't come along until later. Seven track
drives, six plus (even or odd) parity were used. So, what would
CHAR_BIT be on the 7094?

For ones complement, there are the CDC machines (10 six bit characters
in a 60 bit word) and Unisys machines, but popular long after the 7090.

The DS9k, of course, only uses 2's complement representation in it's
floating point implementation.

In PDP-10 style (2's complement the whole word) or DSP style
(significand only)?

The PDP-10 allows one to use integer compare instructions on floating
point values (and get the right result). Somewhat convenient for
sorting, such that you don't need to know which fields are integer
and which aren't.

Seems to me that C pretty much requires binary for integer types,
but, for example, IEEE 754-2008 style decimal float should work
(which is sign magnitude). But a nine's complement or ten's
complement decimal float form should also work.

-- glen

army1987 · Mar 21, 2013

Not with ones complement or signed magnitude.

Click to expand...

The OP is talking about *unsigned* variables.

Noob · Mar 21, 2013

China said:
Doesn't matter. With one's complement -v = ~v,
and signed magnitude -v = 0x800...00 ^ v.

The "signedness" of the variable does matter. The standard fully
specifies the behavior of the ~ operator for unsigned types:

(C89) 3.3.3.3 Unary arithmetic operators

Constraints

The operand of the unary + or - operator shall have arithmetic
type; of the ~ operator, integral type; of the ! operator, scalar
type.

Semantics

The result of the unary + operator is the value of its operand.
The integral promotion is performed on the operand, and the result has
the promoted type.

The result of the unary - operator is the negative of its operand.
The integral promotion is performed on the operand, and the result has
the promoted type.

The result of the ~ operator is the bitwise complement of its
operand (that is, each bit in the result is set if and only if the
corresponding bit in the converted operand is not set). The integral
promotion is performed on the operand, and the result has the promoted
type. The expression ~E is equivalent to (ULONG_MAX-E) if E is
promoted to type unsigned long , to (UINT_MAX-E) if E is promoted to
type unsigned int.

As for the behavior of unary minus on an unsigned variable, I believe
it is fully defined by the following paragraph:

"A computation involving unsigned operands can
never overflow, because a result that cannot be represented by the
resulting unsigned integer type is reduced modulo the number that is
one greater than the largest value that can be represented by the
resulting unsigned integer type."

Thus -v "reduces to" UINT_MAX+1-v == UINT_MAX-(v-1)

"one's complement" and "sign magnitude" are concepts for signed
variables, they have no bearing on unsigned variables. Or did I
misunderstand what you wrote?

Regards.

Noob · Mar 21, 2013

Lew said:
Something about this reasoning troubles me; two somethings, actually.

First off, ~(v-1) is the canonical definition of 2's complement negation of
a value v. Your analysis seems to preclude the C standard's requirement
for "pure binary" representation of unsigned integers, and sign&magnitude
and ones-complement representation for signed integers, by defining
negation as a two's-complement operation. I'll have to spend some time
thinking about this point.

First, do you dispute the following statement?

"Considering a properly initialized unsigned int variable v,
-v and ~(v-1u) equivalent, for any and all value of v"

In other words, considering foo:
void foo(unsigned v) { assert(-v == ~(v-1u)); }
foo will never fail on any conforming implementation known
(or unknown) to mankind.

Second, as I replied to CBC, "one's complement", "two's complement",
"sign magnitude" are all concepts involving /signed/ integral types.
These notions do not come in play for /unsigned/ types, AFAIU.
(Do you dispute this?)

Secondly, your final derivation
~(v-1) <-> -v
no longer fits within the given parameters of unsigned integers, as there is
no viable representation /as an unsigned integer/ of a negative value.

This statement is confusing. Are you saying that
unsigned v = 42; v = -v;
has undefined behavior?

If you had stopped at ~(v-1) <-> (UINT_MAX + 1) - v
this would not be a problem, as neither side of this equation results in a
negative value.

Since "unsigned int" arithmetic is performed modulo (UINT_MAX+1),
UINT_MAX+1+u and u are equivalent (as in "they have the same value,
as far as the abstract machine is concerned")

Regards.

James Kuyper · Mar 21, 2013

Deal with reality. A ones complement/signed magnitude machine is only going to
have ones complement/signed magnitude adders, not that and twos complement
adders. If C is intended to be implemented is twos complement only, that should
be mentionned somewhere.

Two's complement is just as irrelevant to this issue as one's
complement. Unsigned math is required to be implemented without a sign
bit, as might be inferred from the name "unsigned", and that is, indeed
"mentioned somewhere" - specifically, 6.2.6.2p1. The distinction between
one's complement, two's complement, and sign-magnitude only applies only
to signed types, as described in 6.2.6.2p2. It's a useful coincidence
that adding 2's complement integers and an unsigned integers of the same
size can be done using the same adder, and that's part of the reason for
the popularity of 2's complement, but it's irrelevant to the question at
hand.

That's because, even on a platform with no native capability for
carrying out unsigned arithmetic, a conforming implementation of C is
required to emulate it. Any platform that allows implementation of C's
bit-wise operators has an instruction set powerful enough to permit such
emulation, though doing so would make unsigned types far slower than
signed ones. If emulated correctly, unsigned arithmetic will produce the
result specified in the OP's message, regardless of what representation
is used for signed integers.

Noob · Mar 21, 2013

China said:
Deal with reality. A ones complement/signed magnitude machine is
only going to have ones complement/signed magnitude adders, not
that and twos complement adders.

All the architectures I am familiar with implement two's complement
arithmetic. Can you name a few (modern) architectures that implement
something different?

If C is intended to be implemented is twos complement only,
that should be mentioned somewhere.

As James mentioned, C99 "6.2.6.2 Integer types" seems relevant.

Regards.

James Kuyper · Mar 21, 2013

On 03/21/2013 08:39 AM, James Kuyper wrote:
....

... It's a useful coincidence
that adding 2's complement integers and an unsigned integers of the same
size can be done using the same adder, ...

It would be more accurate to say that 2's complement representation was
designed specifically to make that relationship work.

glen herrmannsfeldt · Mar 21, 2013

James Kuyper said:
On 03/21/2013 08:13 AM, China Blue Clay wrote: (snip)

Two's complement is just as irrelevant to this issue as one's
complement. Unsigned math is required to be implemented without a sign
bit, as might be inferred from the name "unsigned", and that is, indeed
"mentioned somewhere" - specifically, 6.2.6.2p1. The distinction between
one's complement, two's complement, and sign-magnitude only applies only
to signed types, as described in 6.2.6.2p2. It's a useful coincidence
that adding 2's complement integers and an unsigned integers of the same
size can be done using the same adder, and that's part of the reason for
the popularity of 2's complement, but it's irrelevant to the question at
hand.

Note, though, that as someone mentioned, in the case of integer
promotion of smaller types, signed is still relevant.

That's because, even on a platform with no native capability for
carrying out unsigned arithmetic, a conforming implementation of C is
required to emulate it. Any platform that allows implementation of C's
bit-wise operators has an instruction set powerful enough to permit such
emulation, though doing so would make unsigned types far slower than
signed ones.

Maybe we really need a C implementation (probably C89) for the 7090.
At least in Fortran, the 7090 does 16 bit sign magnitude arithmetic
in 36 bit registers. I presume a C implementation would do the same.

The full 36 bits are used for floating point operations.

If emulated correctly, unsigned arithmetic will produce the
result specified in the OP's message, regardless of what
representation is used for signed integers.

Again, except for promotion from smaller sizes.

-- glen

James Kuyper · Mar 21, 2013

Note, though, that as someone mentioned, in the case of integer
promotion of smaller types, signed is still relevant.

That was me, in the message I posted with the header
Date: Tue, 19 Mar 2013 12:10:49 -0400

....

Again, except for promotion from smaller sizes.

I pointed out that the result would be different (and representation
dependent) if an unsigned type with TYPE_MAX < INT_MAX were used, but
that wasn't a exception to OP's comments, since he was explicitly
limiting them to unsigned int, where that's not a possibility. I was
merely showing him the limits to the validity of his argument; it was a
valid argument, in the actual context where he used it.

glen herrmannsfeldt · Mar 21, 2013

(snip, I wrote)

I pointed out that the result would be different (and representation
dependent) if an unsigned type with TYPE_MAX < INT_MAX were used, but
that wasn't a exception to OP's comments, since he was explicitly
limiting them to unsigned int, where that's not a possibility. I was
merely showing him the limits to the validity of his argument; it was a
valid argument, in the actual context where he used it.

Still, someone could write it with int, someone comes along later
and changes it to short, then it fails.

OK, what are sizeof(short) and sizeof(int) on the Unisys machines?

The CDC ones complement machines are a little out of date now, but
it might be that someone still has one running. (There was a time where
someone would run some old machines one day a week to save on power
costs.)

As I noted, the 7090 likely would have a 16 bit C int type, and maybe
even a 16 bit C char. (As six isn't enough.) In that case, your comment
wouldn't apply.

-- glen

Phil Carmody · Mar 21, 2013

Lew Pitcher said:
Something about this reasoning troubles me; two somethings, actually.

First off, ~(v-1) is the canonical definition of 2's complement negation of
a value v. Your analysis seems to preclude the C standard's requirement
for "pure binary" representation of unsigned integers, and sign&magnitude
and ones-complement representation for signed integers, by defining
negation as a two's-complement operation. I'll have to spend some time
thinking about this point.

Secondly, your final derivation
~(v-1) <-> -v
no longer fits within the given parameters of unsigned integers, as there is
no viable representation /as an unsigned integer/ of a negative value. If
you had stopped at
~(v-1) <-> (UINT_MAX + 1) - v
this would not be a problem, as neither side of this equation results in a
negative value.

I'm not familiar with any architecture where for unsigned int v, -v results
in a negative value.

Phil

James Kuyper · Mar 21, 2013

On 03/21/2013 05:34 PM, glen herrmannsfeldt wrote:
....

As I noted, the 7090 likely would have a 16 bit C int type, and maybe
even a 16 bit C char. (As six isn't enough.) In that case, your comment
wouldn't apply.

You're right - in that case there couldn't be any unsigned type with
TYPE_MAX < INT_MAX, which I explicitly noted as a precondition for the
applicability of my comment.

Phil Carmody · Mar 21, 2013

James Kuyper said:
That was me, in the message I posted with the header
Date: Tue, 19 Mar 2013 12:10:49 -0400

Indeed, good catch for the generalisation of the question.
However, wouldn't a simple '1u' rather than '1' fix that case?

Phil

glen herrmannsfeldt · Mar 21, 2013

James Kuyper said:
On 03/21/2013 05:34 PM, glen herrmannsfeldt wrote:
...

You're right - in that case there couldn't be any unsigned type with
TYPE_MAX < INT_MAX, which I explicitly noted as a precondition for the
applicability of my comment.

As far as I know, there weren't any sign magnitude binary machines since
the 7094.

The S/360 fixed decimal form is sign magnitude, still supported in
current generation z/ machines. Not so useful for C, though.

There have been some machines using the same representation for integer
and floating point, the B5500 for example. With the appropriate
normalization and exponent representation it works.

Probably also no C compilers for them.

-- glen

Semantics of unary minus	13	Feb 12, 2006
Overloading unary minus for use in function calls	6	Dec 20, 2006
bitwise decimal operators	2	Apr 1, 2005
setjmp and undefined behaviour	2	Nov 30, 2013
clc selected threads (30-jan-2005 to 31-jan-2005) #1	3	Feb 6, 2005
Operator ~ with complex arguments	5	Jul 9, 2008
On alignment (final committee draft for C++0x and n1425 for C1X)	2	Aug 21, 2010
On alignment (final committee draft for C++0x and n1425 for C1X)	6	Aug 20, 2010

Unary minus and bitwise complement

Noob

Eric Sosman

Bart van Ingen Schenau

James Kuyper

James Kuyper

Lew Pitcher

glen herrmannsfeldt

army1987

Noob

Noob

James Kuyper

Noob

James Kuyper

glen herrmannsfeldt

James Kuyper

glen herrmannsfeldt

Phil Carmody

James Kuyper

Phil Carmody

glen herrmannsfeldt

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads