Putting logical limitations on types

Ido Yehieli · Jan 25, 2008

Hi,
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range, so
don't i don't have to have a bound checking function like this one:

int in_range(int i, int min, int max){
----if(i<min)
--------return min;
----if(i>max)
--------return max;
}

Or a prime_number type, and so on.

Thanks,
Ido.

Eric Sosman · Jan 25, 2008

Ido said:
Hi,
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range, so
don't i don't have to have a bound checking function like this one:

int in_range(int i, int min, int max){
----if(i<min)
--------return min;
----if(i>max)
--------return max;
}

Or a prime_number type, and so on.

A variable of type T can store any value in T's range,
without restriction. There is no way you can limit the
values to a subrange or subset:

int hour; /* 0 through 23, please */
hour = 42; /* nothing prevents this */

int odd; /* odd numbers only, please */
odd = 42; /* nothing prevents this */

Ben Pfaff · Jan 25, 2008

Ido Yehieli said:
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

Not in C.

Walter Roberson · Jan 25, 2008

Ido Yehieli said:
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range, so
don't i don't have to have a bound checking function

No, not in C.

Note: C enumerations ("enum") are suggestive of what you are looking
for, but they do NOT prevent other values from being stored.
Effectively they just give names for integral constants, rather
than controlling the data type.

Ido Yehieli · Jan 25, 2008

int hour; /* 0 through 23, please */
hour = 42; /* nothing prevents this */

That's unfortunate. Thanks anyway.

Keith Thompson · Jan 25, 2008

Ido Yehieli said:
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range, so
don't i don't have to have a bound checking function like this one:

int in_range(int i, int min, int max){
----if(i<min)
--------return min;
----if(i>max)
--------return max;
}

Or a prime_number type, and so on.

No, C doesn't provide a standard way to do that (and I've never heard
of an implementation that provides it as an extension).

You can define an abstract data type, implemented as a struct, or a
void*, or whatever, and enforce full control of what can be done with
the type. But this is tedious. If you want addition, for example,
you have to provide your own add() function (C doesn't support
user-defined operator overloading).

If you really want this, you might consider using a language other
than C (for example, a C++ class can do much of this kind of thing
more conveniently than you can do it in C). If you choose a C
implementation that provides extensions such as operator overloading,
be aware that you're also in effect using a language other than C;
your code won't be portable to other C implementations. That may or
may not be a problem for you.

Eric Sosman · Jan 25, 2008

Ido said:
That's unfortunate. Thanks anyway.

Perhaps. But if you think about it for a while, you may
realize that the notion of a range-restricted sub-type would
complicate the language enormously. For example, what ought
to happen in the second line of the example above (assuming
the first line somehow expressed the 0..23 range limitation)?
Should the program silently store 23, or should it store 18
(42 mod 24), or should it crash, or ...? Should it be legal
to use an ordinary int* to point at a (0..23)int variable?
(If yes, how do you get scanf() to respect the range limit?)
Should it be legal to assign a (0..23)int* to a (-42..42)int*?
What rules should govern the division of a (-42..42)int by a
(1..16)int: For example, what is the type of the quotient?

C as it stands allows you to write any predicates you like
and to choose the behavior if a predicate isn't satisfied.
However, it requires that you write these things as functions;
C will not insert them for you at every reference to a variable.

Flash Gordon · Jan 25, 2008

Ido Yehieli wrote, On 25/01/08 21:56:

Hi,
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range, so
don't i don't have to have a bound checking function like this one:

int in_range(int i, int min, int max){

<snip>

Not in C you can't. If you want to do that sort of thing you will have
to move to another language and consider what you want to happen if an
attempt is made to assign an illegal value. Pascal certainly has some of
the features you want, as does Modula 2 I believe. You might be able to
implement something of the sort in C++ using operator overloading, but
that would be a question for comp.lang.c++ to answer.

If you want to discus the best alternatively language for your
requirements comp.programming *might* be an appropriate place.
Alternatively, if you want to explain more about what you want to
achieve it might be that someone can suggest a completely different
approach using C to solve your problem.

Ido Yehieli · Jan 25, 2008

If you want to discus the best alternatively language for your
requirements comp.programming *might* be an appropriate place.
Alternatively, if you want to explain more about what you want to
achieve it might be that someone can suggest a completely different
approach using C to solve your problem.

Never mind, I'll just use a function to check the bounds when needed.
I thought there might be a more appropriate way of doing it.

-Ido,

Ben Bacarisse · Jan 26, 2008

Ido Yehieli said:
can I define types with a limitation on the data the type contain,
rather than just the size it will have in memory?

For example: a bound_int which can only be in a certain range

Or a prime_number type, and so on.

You've had the C answer: no, but it is worth pointing out that a
language that the full version of what you are hinting at with a "prime
number" type would either have to abandon compile time types or the
compiler would need a complex theorem prover in it (and the compiler
might end up not guaranteed to terminate). That is far cry from C.

Ido Yehieli · Jan 26, 2008

a
language that the full version of what you are hinting at with a "prime
number" type would either have to abandon compile time types or the
compiler would need a complex theorem prover in it

I guess a lazy-evaluating language could provide such a type.
But then a prime_number type is just syntactic sugar around a function
that generates prime numbers.

-Ido.

santosh · Jan 26, 2008

Ido said:
I guess a lazy-evaluating language could provide such a type.
But then a prime_number type is just syntactic sugar around a function
that generates prime numbers.

What is a prime number /type/ supposed to do? Signal an error when a
non-prime number is written to it?

Ido Yehieli · Jan 26, 2008

What is a prime number /type/ supposed to do? Signal an error when a
non-prime number is written to it?

Not the best example, is it?
I guess it should do the same thing an char does when you assign a
really big integer to it.

-Ido.

santosh · Jan 26, 2008

Ido said:
Not the best example, is it?
I guess it should do the same thing an char does when you assign a
really big integer to it.

If char is unsigned then wrap-around occurs on assigning an out of range
value. If char is signed then undefined behaviour results on overflow.
In either case, the result is not, as far as I can see, useful.

I guess you meant to say that the value to be assigned would be checked
against the range of the object to be assigned to, and if out of range,
an error would be raised at compile or runtime.

While this is difficult but possible in C, it's generally not present in
the vast majority of implementations. Most compilers would give you a
warning when assigning out of range values is detectable during
compilation, but the runtime usually has no such protection.

You /can/ create a type and define a suite of functions for performing
I/O and arithmetic on that type, but still the language cannot enforce
protection.

Malcolm McLean · Jan 26, 2008

Eric Sosman said:
Perhaps. But if you think about it for a while, you may
realize that the notion of a range-restricted sub-type would
complicate the language enormously. For example, what ought
to happen in the second line of the example above (assuming
the first line somehow expressed the 0..23 range limitation)?
Should the program silently store 23, or should it store 18
(42 mod 24), or should it crash, or ...? Should it be legal
to use an ordinary int* to point at a (0..23)int variable?
(If yes, how do you get scanf() to respect the range limit?)
Should it be legal to assign a (0..23)int* to a (-42..42)int*?
What rules should govern the division of a (-42..42)int by a
(1..16)int: For example, what is the type of the quotient?

We could add valid ranges to C without complicating things. Simply declare

int i 0 to N;

(means that i can hold any integer between zero and N, assuming N is a
variable in scope). There are no other changes to the language. If the
restraint is violated, beahviour is undefined. Which means that a decent
compiler will give you a crash, a bad one will plough on regardless. Often
the strategy adopted will be to abandon any checks for the final release,
hence you have a powerful debugging tool with no impact on performance.

The problem is that to honour the restrictions, the compiler needs to check
every assignment. This isn't particularly complicated, but will
approximately halve the speed of the program.

Ido Yehieli · Jan 26, 2008

This isn't particularly complicated, but will
approximately halve the speed of the program.

How did you figure it will be 1/2?
I'm not saying it wouldn't, just interested to see where this estimate
came from.

-Ido.

Ben Bacarisse · Jan 26, 2008

We could add valid ranges to C without complicating things. Simply declare

int i 0 to N;

(means that i can hold any integer between zero and N, assuming N is a
variable in scope).

The problem is that to honour the restrictions, the compiler needs to
check every assignment. This isn't particularly complicated, but will
approximately halve the speed of the program.

To honour the restriction it has to re-implement pointers in a new
way, too, or ban the usual permission to convert pointers.

memcpy(&i, &j, sizeof(int 0 to N)); /* OK or not? */

Bart C · Jan 26, 2008

Ido Yehieli said:
On Jan 26, 10:59 am, "Malcolm McLean" <[email protected]> ....

int i 0 to N, j 0 to N;

A bit untidy. Best the range belongs with the int, but the syntax becomes
awkward unless you put it into a typedef.

And it surely will complicate nearly everything (from a language/compiler
point of view).

How did you figure it will be 1/2?
I'm not saying it wouldn't, just interested to see where this estimate
came from.

A reduction of 50% is extremely unlikely, except in a program doing doing
nothing else but assigning normal ints to your bounded int. In:

int i 0 to N;
int j 0 to N;
int k;

Then:

i=j; /* no penalty */
i=k; /* penalty, assuming language allows this without a cast */
k=i; /* no penalty */
++j; / * penalty for oveflow checking*/
i=j+k; /* penalty */

And so on. But mixed with everything else I'm guessing at a slowdown far
less than 10%. Probably not significant. And if you were going to do your
own checking anyway using functions, that will take at least as long.

If the value of N above is not known to the compiler, this will hit
performance a little more, as in:

int i A to B;
int j C to D;

i=j; /* Compiler doesn't know if C..D is a subset of A..B or not */

However the language doesn't have this at the minute.

Army1987 · Jan 26, 2008

Malcolm said:
We could add valid ranges to C without complicating things. Simply declare

int i 0 to N;

(means that i can hold any integer between zero and N, assuming N is a
variable in scope).

That'd be already complicated enough if N were required to be a constant
expression, let alone if it could be a variable.

There are no other changes to the language.

Either to becomes a keyword, breaking any program using it as an
identifier, or you figure out some other syntax for that.

Michal Nazarewicz · Jan 26, 2008

Army1987 said:
That'd be already complicated enough if N were required to be a constant
expression, let alone if it could be a variable.

Either to becomes a keyword, breaking any program using it as an
identifier, or you figure out some other syntax for that.

A C++ template syntax?

int<0, 42> foo;

Come to think of it, such restricted integers could be implemented in
C++ so if it's really important OP could switch to C++ compiler.

ftplib limitations?	9	Aug 24, 2010
Limitations and workarounds to expressions defining size of an array	44	Sep 4, 2012
Sort from an array of objects into stacks without using java.util	0	Jul 6, 2021
Limitations on dataset types	4	Jun 14, 2012
C program: memory leak/ segmentation fault/ memory limit exceeded	0	Nov 12, 2022
How do I call the exact function on page load?	6	May 19, 2023
Any Ideas On This Simple Co-Prime program	1	Dec 12, 2022
C99 integer types	24	Jul 29, 2012

Putting logical limitations on types

Ido Yehieli

Eric Sosman

Ben Pfaff

Walter Roberson

Ido Yehieli

Keith Thompson

Eric Sosman

Flash Gordon

Ido Yehieli

Ben Bacarisse

Ido Yehieli

santosh

Ido Yehieli

santosh

Malcolm McLean

Ido Yehieli

Ben Bacarisse

Bart C

Army1987

Michal Nazarewicz

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads