Detecting overflows while computing off_t

A

Andre Majorel

How do you compute an off_t with overflow detection ?

Ideally, the target language is C89/C90 and the target platform
is reasonably recent versions of the major Unixen. If there is
no practical way to do that without limiting the target platform
set to FreeBSD + Linux + NetBSD + OpenBSD or adding the
requirement of conformance to some combination of SUS v2, SUS v3
and C99, I'll settle for that.

Overflow-safe versions of + and * would do but not even a
relatively recent standard like SUS v2 provides OFF_MIN and
OFF_MAX and since off_t is a signed integer type, overflows lead
to undefined behaviour.

What are we supposed to do ? Compare sizeof (off_t) with sizeof
(int/long/long long[1]) and use INT_MAX/LONG_MAX/LLONG_MAX[1] ?
Carry the calculation with doubles, cast to off_t and cast back
to double for verification[2] ? Use bignum ?

[1] Neither of which is not in C89 or SUS v2, by the way.
[2] But is the result of casting a double outside of
INT_MIN..INT_MAX to int defined ?
 
K

Keith Thompson

Andre Majorel said:
How do you compute an off_t with overflow detection ?

There is no off_t type defined in the standard C library. But IMHO
your question is still topical.

More generally, given a type off_t (a typedef for some signed integer
type, defined in some implementation-specific header), how can you do
computations in that type with overflow detection?

That's a fairly tricky question, especially if you don't have OFF_MIN
and OFF_MAX macros specifying the lower and upper bounds.

[...]
Overflow-safe versions of + and * would do but not even a
relatively recent standard like SUS v2 provides OFF_MIN and
OFF_MAX and since off_t is a signed integer type, overflows lead
to undefined behaviour.

What are we supposed to do ? Compare sizeof (off_t) with sizeof
(int/long/long long[1]) and use INT_MAX/LONG_MAX/LLONG_MAX[1] ?
Carry the calculation with doubles, cast to off_t and cast back
to double for verification[2] ? Use bignum ?

If I wanted portability, I wouldn't use double. For example, if both
off_t and double are 64 bits, double won't be able to represent all
values of type off_t (it wastes bits on that silly exponent thingie);
long double might, but there's no guarantee that *any* floating-point
type can represent all values of a given integer type.

I don't think there's any 100% portable way to determine the bounds.
You can probably get away with comparing sizeof(off_t) to sizeof(int),
etc.; if you find a type whose size matches, you can *assume* that its
bounds are the same as those of that type. It's not impossible that
that assumption will break if there are padding bits. (I think that
Cray vector machines have padding bits for some integer types; I'll
check the details later, when I get a chance).

Or you might have a system-specific header that defines OFF_MIN and
OFF_MAX, with a requirement that the header be modified for each
target system.

Once you've done that, you can check the values of the operands before
performing the operation. For (x + y):
If the signs differ, or either operand is zero, you're ok.
If both operands are positive and x <= OFF_MAX - y, you're ok.
If both operands are negative and (...), you're ok.
Otherwise, the operation will overflow.

The "(...)" above is left as an exercise (I'm too lazy to work it out).

Multiplication is similar but more complicated. Subtraction is
addition using the negation of one of the operands. Division can
overflow only in the case of OFF_MIN / -1, and only if OFF_MIN <
-OFF_MAX.

This is going to slow things down quite a lot.

An implementation is free to define the behavior of integer overflow.
If your implementation does so, and you don't mind losing portability,
you can take advantage of that. On many implementations, overflow
quietly wraps around; you can check the result against the operands
rather than pre-checking the operands. For example, if both operands
are positive and the result does not exceed both operands, you had an
overflow. Multiplication is trickier.
 
R

Richard Tobin

Andre Majorel said:
How do you compute an off_t with overflow detection ?

A sometimes useful fact, if you know that overflow behaves as addition
mod 2^N (where N is the size in bits), is that a+b overflows if and
only if a+b < a (for positive a and b). So you can do the addition
and check for overflow by comparing the result to either of the
operands. For unsigned integer types in C, overflow must behave this
way. For signed types, it is still true for most implementations.

-- Richard
 
P

pete

Richard said:
A sometimes useful fact, if you know that overflow behaves as addition
mod 2^N (where N is the size in bits), is that a+b overflows if and
only if a+b < a (for positive a and b). So you can do the addition
and check for overflow by comparing the result to either of the
operands. For unsigned integer types in C, overflow must behave this
way. For signed types, it is still true for most implementations.

Overflow for signed types is undefined behavior
and allowing it to happen is not the way to write a correct program.

If A and B have opposites signs
or at least one of them is equal to zero, then (A+B) won't overflow.
If A and B are positive, and A_MAX - b > a, then (A+B) won't overflow.
If A and B are negative, and a > A_MIN - b, then (A+B) won't overflow.
 
B

Ben Pfaff

A sometimes useful fact, if you know that overflow behaves as addition
mod 2^N (where N is the size in bits), is that a+b overflows if and
only if a+b < a (for positive a and b). So you can do the addition
and check for overflow by comparing the result to either of the
operands. For unsigned integer types in C, overflow must behave this
way. For signed types, it is still true for most implementations.

Compilers can actually screw you over here, because they may make
optimizations based on the assumption that signed arithmetic does
not overflow. There's currently a big discussion of this in GCC
on the gnulib mailing list:
http://thread.gmane.org/gmane.comp.lib.gnulib.bugs/8152/focus=8152
 
A

Andre Majorel

Andre Majorel said:
Overflow-safe versions of + and * would do but not even a
relatively recent standard like SUS v2 provides OFF_MIN and
OFF_MAX and since off_t is a signed integer type, overflows lead
to undefined behaviour.

What are we supposed to do ? Compare sizeof (off_t) with sizeof
(int/long/long long[1]) and use INT_MAX/LONG_MAX/LLONG_MAX[1] ?
Carry the calculation with doubles, cast to off_t and cast back
to double for verification[2] ? Use bignum ?

If I wanted portability, I wouldn't use double. For example, if both
off_t and double are 64 bits, double won't be able to represent all
values of type off_t

Should have thought of that.
I don't think there's any 100% portable way to determine the bounds.
You can probably get away with comparing sizeof(off_t) to sizeof(int),
etc.; if you find a type whose size matches, you can *assume* that its
bounds are the same as those of that type. It's not impossible that
that assumption will break if there are padding bits. (I think that
Cray vector machines have padding bits for some integer types; I'll
check the details later, when I get a chance).

Thanks. It's pretty sad to have to resort to autoconf-style
hacks to be able to use standard features on a conforming
implementation (standard = SUS, not ISO 9899).

SUS 3 still doesn't appear to define OFF_MIN/OFF_MAX.

Merry Christmas to all.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top