Direct computation of integer limits in K&R2?


Peter Nilsson

Ark Khasin said:
Peter said:
Quoting pete:

  #if !(1 & -1)
      printf("ones complement\n");
  #elif -1 & 2
      printf("twos complement\n");
  #else
      printf("sign magnitude\n");
  #endif

Pete asked if Gray code or other weird representations
could be used for negative integers, but it seems they
cannot be, despite the loose wording of C90.
[cf <http://groups.google.com/group/comp.std.c/msg/5f332b9aa22b92ec>]

Is there any assurance that the representation of integer
constants in the preprocessor is in any way related to the
representation of integer objects
- falls into one of the three models of representation of
integer objects
?

Notionally, yes.

C89 draft 3.8.1

The resulting tokens comprise the controlling constant
expression which is evaluated according to the rules of
$3.4 using arithmetic that has at least the ranges
specified in $2.2.4.2, except that int and unsigned int
act as if they have the same representation as,
respectively, long and unsigned long.

There's similar wording in C99, though intmax_t and uintmax_t
are used.

Unfortunately, many implementations are somewhat inconsistent.
Consider...

#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("ULONG_MAX = %lu\n", ULONG_MAX);

#if -1 == -1ul
    puts("-1 == -1ul [pre]");
#endif

    if (-1 == -1ul)
        puts("-1 == -1ul");

#if 4294967295 == -1ul
    puts("4294967295 == -1ul [pre]");
#endif

    if (4294967295 == -1ul)
        puts("4294967295 == -1ul");

    return 0;
}

The output for me using delorie gcc 4.2.1 with -ansi
-pedantic is...

ULONG_MAX = 4294967295
-1 == -1ul [pre]
-1 == -1ul
4294967295 == -1ul

As you can see, there is a discrepancy between the way
that preprocessor arithmetic is evaluated. Fact is,
gcc is not the only compiler to show problems.
[Of course the above can be repaired so as to not use the
preprocessor. Just asking...]

Indeed, using the expressions in a normal 'if' would be
the way to go. There's no reason why, say, int couldn't use
two's complement while long uses ones' complement.
 

Micah Cowan

6.2.6.2p2 says ("the first two" below are sign-and-magnitude and
two's complement):

"Which of these applies is implementation-defined, as is whether the
value with sign bit 1 and all value bits zero (for the first two),
or with sign bit and all value bits 1 (for ones' complement), is a
trap representation or a normal value."

Huh. I managed to forget that somehow. My bad, Flash.
 

Ark Khasin

Peter said:
Ark Khasin said:
Peter said:
.... snip ...

Indeed, using the expressions in a normal 'if' would be
the way to go. There's no reason why, say, int couldn't use
two's complement while long uses ones' complement.

Wow. I hadn't thought of this possibility. It would imply malicious
intent on the implementer's part for it to have an overhead in the
simplest conversions. And it won't be for all markets :)
But I don't see why such a horror implementation would be illegal.
Thank you for the example.
-- Ark
 

user923005

Yes. Unlike C99, unsigned to signed integer conversion
is implementation-defined without the possibility of
raising a signal. So...

INT_MIN isn't computed per se; rather, it's derived by
determining the representation for negative ints. [I
know pete posted some very simple constant expressions,
though it was some time ago.]

Would you say that this exercise is overly complex for that point in
K&R2?

I will be pretty amazed to see anyone write a portable solution that
does it all (floating point is also requested).
I guess that signed integer <TYPE>_MIN values will be hard to come up
with.

Will computation of DBL_MAX signal a floating point exception?

I guess that it is the hardest exercise in the whole book, by far.
 

Flash Gordon

Micah Cowan wrote, On 12/03/08 00:30:
It's only allowed to be a trap representation on _non_ two's
complement representations. sign bit = 1 and all value bits = 0 (and
padding bits at non-trap values) would necessarily be the minimum
representable value.

Wrong. The C standard explicitly allows for it to be a trap
representation on two's complement representations. Quoting from N1256...

| ... If the sign bit is one, the value shall be modified in one of
| the following ways:
| — the corresponding value with sign bit 0 is negated (sign and
| magnitude);
| — the sign bit has the value −(2^N) (two's complement);
| — the sign bit has the value −(2^N − 1) (ones' complement).
| Which of these applies is implementation-defined, as is whether the
| value with sign bit 1 and all value bits zero (for the first two), or
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| with sign bit and all value bits 1 (for ones' complement), is a trap
| representation or a normal value. In the case of sign and magnitude
| and ones' complement, if this representation is a normal value it is
| called a negative zero.

two's complement is one of the first two.

The above is from section 6.2.6.2 para 2.
 

Flash Gordon

Micah Cowan wrote, On 12/03/08 05:55:
Huh. I managed to forget that somehow. My bad, Flash.

It's easy to forget. I'm not actually aware of any implementations which
make use of this freedom.
 

Kaz Kylheku

Hello all,

In K&R2 one exercise asks the reader to compute and print the limits for
the basic integer types. This is trivial for unsigned types. But is it
possible for signed types without invoking undefined behaviour
triggered by overflow? Remember that the constants in limits.h cannot
be used.

You can use shifting to determine how many bits there are in the given
signed integral type. Start with 1 and keep shifting it left until it
drops off. With that information, you can construct the greatest
possible positive integer value: which is all 1's except for the sign
bit, which is zero. The greatest possible negative value is either the
additive inverse of that value, or, in the case of two's complement,
that value less one. You can detect whether two's complement is in
effect by applying a simple test to the value -1:

switch (-1 & 3) {
case 1: /* ...01: sign magnitude */
    break;
case 2: /* ...10: ones' complement */
    break;
case 3: /* ...11: two's complement */
    break;
}

That's the general approach I'd take to the exercise.
 

ymuntyan

You can use shifting to determine how many bits there are in the given
signed integral type. Start with 1 and keep shifting it left until it
drops off.

That's UB, no?
With that information, you can construct the greatest
possible positive integer value: which is all 1's except for the sign
bit, which is zero. The greatest possible negative value is either the
additive inverse of that value, or, in the case of two's complement,
that value less one.

And this may be a trap representation.

Yevgen
 

CBFalconer

Ark said:
Peter Nilsson wrote:
.... snip ...
Unfortunately, many implementations are somewhat inconsistent.
Consider...

#include <limits.h>
#include <stdio.h>

int main(void) {
    printf("ULONG_MAX = %lu\n", ULONG_MAX);

#if -1 == -1ul
    puts("-1 == -1ul [pre]");
#endif

    if (-1 == -1ul)
        puts("-1 == -1ul");

#if 4294967295 == -1ul
    puts("4294967295 == -1ul [pre]");
#endif

    if (4294967295 == -1ul)
        puts("4294967295 == -1ul");
    return 0;
}

The output for me using delorie gcc 4.2.1 with -ansi
-pedantic is...

ULONG_MAX = 4294967295
-1 == -1ul [pre]
-1 == -1ul
4294967295 == -1ul

As you can see, there is a discrepancy between the way
that preprocessor arithmetic is evaluated. Fact is,
gcc is not the only compiler to show problems.

What's wrong with that, remembering that (for gcc, on x86) a
long is defined to be identical to an int?
 

CBFalconer

Kaz said:
You can use shifting to determine how many bits there are in the
given signed integral type. Start with 1 and keep shifting it
left until it drops off. With that information, you can construct
....

No, because the moment it 'drops off' you have run into
implementation (or undefined) behaviour. You can't write portable
code to do this. You can possibly write code that executes on YOUR
machinery.
 

Peter Nilsson

Ark Khasin said:
Wow. I haven't thought of this possibility. It would imply
malicious intent of the implementer

Not necessarily. Non-normalised floating point is one way of
implementing 1c integers. I could imagine that large integers
might theoretically be implemented this way.

[I remember old Macs supporting 80 and 96-bit floating point
types. Such types could be used to implement 64-bit integers
on 16/32 bit machines.]
for it to have an overhead in the simplest conversions.

Nevertheless such design decisions are sometimes made.
And it won't be for all markets :) But I don't see why such
a horror implementation would be illegal.

Horror or not, it's just another virtual C machine to me. ;)
 

Malcolm McLean

santosh said:
Hello all,

In K&R2 one exercise asks the reader to compute and print the limits for
the basic integer types. This is trivial for unsigned types. But is it
possible for signed types without invoking undefined behaviour
triggered by overflow? Remember that the constants in limits.h cannot
be used.
I don't think there's a perfect answer.

However this is the closest I could get.

double x = 0;
int testme;

do {
    x++;
    testme = (int) x;
} while ((double) testme == x);

printf("Biggest integer %g\n", x - 1);

It will fail if not all ints are exactly representable by a double,
which is the case on a machine with 64-bit ints.
(Wail, gnash.)
 

Kaz Kylheku

That's UB, no?

Unfortunately it is. Shifting a bit into the sign bit is UB. Only a
positive value whose double is representable in the type may be
shifted left by one bit.

This means that the sign bit is quite impervious to bit manipulation.
 

Ben Bacarisse

Malcolm McLean said:
I don't think there's a perfect answer.

However this is the closest I could get.

double x = 0;
int testme;

do {
    x++;
    testme = (int) x;
} while ((double) testme == x);

printf("Biggest integer %g\n", x - 1);

I don't think you need to be so cautious -- ints must use binary, so
you could start at 1 and repeatedly double x and try to convert x-1.
Even so, you have not gained anything -- the conversion to int, when
the value is out of range, is still implementation-defined (and in
C99 may even raise a signal).
 

Ben Bacarisse

Kaz Kylheku said:
Unfortunately it is. Shifting a bit into the sign is UB. Only a
positive value whose double is representable may be shifted left by
one bit.

This means that the sign bit is quite impervious to bit
manipulation.

It must participate in other bit operations, though, like ~, &, | and
^. Even so, I can't see any way to avoid UB when trying to calculate
the range of int. Equally, I don't have a persuasive argument that it
*can't* be done, either.
 

user923005

It must participate in other bit operations, though, like ~, &, | and
^. Even so, I can't see any way to avoid UB when trying to calculate
the range of int.  Equally, I don't have a persuasive argument that it
*can't* be done, either.

To compound things, imagine a C implementation where all integral
types were 64 bits (including char).
Even the undefined behavior hacks I posted will fail on those.
In short, I think it is a really difficult problem to solve.
If someone can define a sensible solution, I would be very pleased to
see it.
It might be interesting to see what DMR has to say about it.
 

Ioannis Vranos

santosh said:
Hello all,

In K&R2 one exercise asks the reader to compute and print the limits for
the basic integer types. This is trivial for unsigned types. But is it
possible for signed types without invoking undefined behaviour
triggered by overflow? Remember that the constants in limits.h cannot
be used.


C95:

#include <stdio.h>

int main(void)
{
    unsigned x = -1;

    int INTMAX = x / 2;

    int INTMIN = -INTMAX - 1;

    printf("INTMIN: %d\t", INTMIN);
    printf("INTMAX: %d\n", INTMAX);

    return 0;
}
 

Richard Heathfield

Peter Nilsson said:
What if UINT_MAX == INT_MAX,

I don't think it can. "For each of the signed integer types, there is a
corresponding (but different) unsigned integer type (designated with the
keyword unsigned) that uses the same amount of storage (including sign
information) and has the same alignment requirements. The range of
nonnegative values of a signed integer type is a subrange of the
corresponding unsigned integer type, and the representation of the same
value in each type is the same." So for UINT_MAX to be == INT_MAX, ints
would need to squeeze twice as many values into the same number of bits as
unsigned ints.
or UINT_MAX = 4*INT_MAX+3?

See above.
What if INT_MIN == -INT_MAX?

That, however, is a valid objection. For example, INT_MIN might be -32767
rather than -32768.
 

Ioannis Vranos

Richard said:
Peter Nilsson said:
.... snip ...

That, however, is a valid objection. For example, INT_MIN might be -32767
rather than -32768.
C95:

Since sizeof(N) == sizeof(signed N) == sizeof(unsigned N),

where N can be char, short, int, or long,

and as you mentioned they use the same amount of storage, how can
INT_MIN be equal to -INT_MAX, since the range of values is the same?
 
