Why no/limited checking on enum values?


Nebula

Consider

enum Side { Back, Front, Top, Bottom };
enum Side a;

Now, why is

a = 124;

legal? (Well, it really is an integer, but still, checking could be
performed.) Wouldn't enums be more useful if there were a bit more
typechecking?
 

Richard Bos

Nebula said:
Consider

enum Side { Back, Front, Top, Bottom };
enum Side a;

Now, why is

a = 124;

legal? (Well, it really is an integer, but still, checking could be
performed.) Wouldn't enums be more useful if there were a bit more
typechecking?

Good question. It never has been illegal, AFAIK; it's certainly legal
under both Standards. The original reason is probably that it's not
always possible to do this checking at compile time, leading to
run-time checks, which in C have always been regarded (not without
reason) as too costly. There's a way around that by making enums _not_
really be integers under the hood, but that would probably invalidate
some existing programs, which is something the Standard Committee has
always been (arguably too) wary of doing.
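
A minimal sketch of what such a check looks like when written by hand
(the is_valid_side helper is hypothetical; the compiler generates
nothing like it, which is exactly the run-time cost being avoided):

#include <stdio.h>

enum Side { Back, Front, Top, Bottom };

/* Hypothetical hand-written validator; the language imposes
   nothing like this automatically. */
static int is_valid_side(int v)
{
    return v >= Back && v <= Bottom;
}

int main(void)
{
    enum Side a = 124;  /* accepted without complaint in C */
    printf("valid: %d\n", is_valid_side(a));  /* prints "valid: 0" */
    return 0;
}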

Richard
 

xarax

Nebula said:
Consider

enum Side { Back, Front, Top, Bottom };
enum Side a;

Now, why is

a = 124;

legal? (Well, it really is an integer, but still, checking could be
performed.) Wouldn't enums be more useful if there were a bit more
typechecking?

An enum type is a symbolic name for an int. Declaring

enum Side a;

is simply declaring 'a' to be an int.
 

Chris Torek

(Some compilers actually do some sort(s) of extra checking. No
diagnostics are required here, but compilers are always allowed to
produce as many warning messages as the compiler-writer wants.
Whether or not those extra warnings are good or desirable is another
question entirely....)

xarax said:
An enum type is a symbolic name for an int.

Well, not quite. The enumeration *members* are names for "int"s,
but the enumerated *type* -- here "enum Side" -- is not necessarily
an int. Some compilers either always, or optionally, shorten the
type to the smallest one that will hold its members. (With gcc,
use "-fshort-enums" to turn this on.)

xarax said:
Declaring

enum Side a;

is simply declaring 'a' to be an int.

Or perhaps a "char" or "unsigned char":

enum A { Zero, One_twenty_seven = 127 }; /* always fits in char */
enum B { Two_fifty_five = 255 }; /* always fits in unsigned char */
enum C { Three_two_seven_six_seven = 32767 }; /* always fits in int */

I have not studied the C99 text closely yet, but my reading of the
C89 standard appears to me to allow even "enum C" to use an unsigned
char, despite the fact that one should be able to do:

enum C x = Three_two_seven_six_seven;

and on real implementations, that would make x have the value 255.
(I would hope that any compiler that did this, even if the C
standards do allow it, would not survive in the marketplace....)

Note that:

enum D { Large = 2147483647 };

is OK if and only if INT_MAX is at least 2147483647, e.g., typical
32-bit systems today. In other words, you can use this in a 32-bit
compiler but not in a 16-bit compiler. (I believe some compilers
have extensions to allow "enum"s to contain long and/or long long
members, but this *is* an extension; the C standards -- C89 and
C99 both -- do not require it.)
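
The compiler's choice is visible with a small test program (a sketch
reusing the declarations above; the sizes printed are
implementation-defined, so your results may differ):

#include <stdio.h>

enum A { Zero, One_twenty_seven = 127 };
enum B { Two_fifty_five = 255 };
enum C { Three_two_seven_six_seven = 32767 };

int main(void)
{
    /* gcc with -fshort-enums typically prints 1 1 2;
       without it, 4 4 4 on a platform with 32-bit int. */
    printf("%u %u %u\n",
           (unsigned)sizeof(enum A),
           (unsigned)sizeof(enum B),
           (unsigned)sizeof(enum C));
    return 0;
}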
 

Thomas Matthews

Nebula said:
Consider

enum Side { Back, Front, Top, Bottom };
enum Side a;

Now, why is

a = 124;

legal? (Well, it really is an integer, but still, checking could be
performed.) Wouldn't enums be more useful if there were a bit more
typechecking?

The enum is neither a set nor a list.

The enum facility is a {convenient} method to
associate numbers with symbols. Nothing more,
nothing less.

The facility does not mandate any relationship
among the identifiers. There is no requirement
that each identifier have a unique value within
the enum.
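
For instance, nothing stops two enumerators from sharing a value;
a sketch (the names are made up):

enum Answer { No = 0, False = 0, Yes = 1, True = 1 };
/* duplicate values are perfectly legal */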

The syntax allows one to associate
a name (identifier) with an enum.

The following enums illustrate the point:

enum Nonsense
{ Moon = 5, Sun = 73, West = 4, Cake = 41623 };

enum Yep { one, two, frog, skip, my, lou };
/* Note that the symbol 'one' will associate */
/* with the value of zero. */

An enum should not be considered a type.
If you want the enum to behave as a separate
type, such as the RECORD in Pascal, then you
must create one.

Many programming shops ban the enum facility.
They prefer to use #define instead.

Many mistakes come from assuming what an enum
should be (a set, a list, etc.) rather than
seeing what it actually is.

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library
 

CBFalconer

Thomas said:
.... snip ...

An enum should not be considered a type.
If you want the enum to behave as a separate
type, such as the RECORD in Pascal, then you
must create one.

Or simply use a language, such as Pascal or Ada, where enumerations
are grown-up types.

Thomas said:
Many programming shops ban the enum facility.
They prefer to use #define instead.

Which is bloody foolishness. A major advantage over #define is
that the name remains attached to the use of that particular
constant in symbolic debuggers. Another is that the list of
consecutive constants can easily be revised in one place.
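
A sketch of that second advantage (the names here are invented):
inserting a new enumerator renumbers everything after it
automatically, where the #define version must be renumbered by hand.

enum colour { RED, GREEN, BLUE };
/* add YELLOW after GREEN: BLUE moves up by itself */

/* The #define equivalent needs every later value edited:
 *   #define RED    0
 *   #define GREEN  1
 *   #define BLUE   2
 */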
 

Chris Croughton

CBFalconer said:
Or simply use a language, such as Pascal or Ada, where enumerations
are grown-up types.

Or C++, which allows you to cast from int to enum, but it has to be
done explicitly; you can't use implicit conversions.
CBFalconer said:
Which is bloody foolishness. A major advantage over #define is
that the name remains attached to the use of that particular
constant in symbolic debuggers. Another is that the list of
consecutive constants can easily be revised in one place.

Agreed, there is a lot of room for error using #define for a set of
constants; it's easy to miss one if you insert one in the middle. With
enums you are guaranteed that the next one will be one greater than the
previous (unless you force it to a value).
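
That guarantee in a sketch (hypothetical names):

enum seq { first, second, tenth = 10, eleventh };
/* first == 0, second == 1, tenth == 10, eleventh == 11:
   each value is one greater than the previous unless forced. */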

The thing which annoys me about enums is that the final value can't have
a comma, unlike initialisers, so adding a new value to the end means
adding a comma to the previous line (and if you do a diff that line
shows up as changed). To get round it I always add a dummy value at the
end:

enum Values
{
    VAL_FIRST,
    VAL_NEXT_1,
    VAL_NEXT_2,
    end_of_Values
};

so if I need to add a value I can just add a line

VAL_NEXT_3,

without having to worry about the comma.

Chris C
 

Keith Thompson

Chris Torek said:
I have not studied the C99 text closely yet, but my reading of the
C89 standard appears to me to allow even "enum C" to use an unsigned
char, despite the fact that one should be able to do:

enum C x = Three_two_seven_six_seven;

and on real implementations, that would make x have the value 255.
(I would hope that any compiler that did this, even if the C
standards do allow it, would not survive in the marketplace....)

C99 6.7.2.2p4 says:

Each enumerated type shall be compatible with char, a signed
integer type, or an unsigned integer type. The choice of type is
implementation-defined, but shall be capable of representing the
values of all the members of the enumeration. The enumerated type
is incomplete until after the } that terminates the list of
enumerator declarations.

So a C99 compiler could make enum C compatible with unsigned char only
if CHAR_BIT is at least 15.

C90 6.5.2.2 just says:

Each enumerated type shall be compatible with an integer type; the
choice of type is implementation-defined.

so it does appear to allow the kind of thing you describe above.
Presumably this was an unintentional oversight, corrected in C99.
 

Keith Thompson

Thomas Matthews said:
An enum should not be considered a type.
If you want the enum to behave as a separate
type, such as the RECORD in Pascal, then you
must create one.

An enum is a distinct type. An object declared to be of type "enum foo"
is of type "enum foo", and is not of whatever integer type "enum foo"
is compatible with. If "enum foo" is compatible with "int", the
types "enum foo*" and "int *" are incompatible.

But in most contexts an "enum foo" acts just like an integer, and an
expression of type "enum foo" will be implicitly converted to whatever
type is imposed by the context. And, of course, the literals are of
type "int" (which strikes me as one of the siller things in the C
language).

If you want a distinct type that can't be implicitly converted to
another type, a struct is probably the best way to do it.
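
A minimal sketch of that struct approach (all names hypothetical);
unlike an enum, no integer can be assigned to it, implicitly or
otherwise:

struct Side { int value; };

static const struct Side Back  = { 0 };
static const struct Side Front = { 1 };

int main(void)
{
    struct Side a = Back;
    a = Front;    /* fine: struct assignment */
    /* a = 124;      constraint violation: won't compile */
    /* int i = a;    also a constraint violation */
    (void)a;
    return 0;
}

The cost of this extra safety is that arithmetic, comparison, and
switch now have to go through the .value member explicitly.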
 

Mike Wahler

Keith Thompson said:
An enum is a distinct type. An object declared to be of type "enum foo"
is of type "enum foo", and is not of whatever integer type "enum foo"
is compatible with. If "enum foo" is compatible with "int", the
types "enum foo*" and "int *" are incompatible.

But in most contexts an "enum foo" acts just like an integer, and an
expression of type "enum foo" will be implicitly converted to whatever
type is imposed by the context. And, of course, the literals are of
type "int" (which strikes me as one of the siller things in the C
language).

C has a few things that some deem silly. A zero-terminated array of
characters, for instance: a "silly string"? Careless use of them does
often gum things up.

-Mike
 

CBFalconer

Chris said:
.... snip ...

The thing which annoys me about enums is that the final value can't
have a comma, unlike initialisers, so adding a new value to the end
means adding a comma to the previous line (and if you do a diff
that line shows up as changed). To get round it I always add a
dummy value at the end:

enum Values
{
    VAL_FIRST,
    VAL_NEXT_1,
    VAL_NEXT_2,
    end_of_Values
};

so if I need to add a value I can just add a line

VAL_NEXT_3,

without having to worry about the comma.

C99 allows the trailing comma, I believe. However, a simpler
solution is to write:

enum foo {firstfoo
         ,foo2
         ,foo3
         };

and I have no problem extending it.
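
For the record, the C99 grammar does accept the trailing comma, so
under a C99 compiler the straightforward layout also extends cleanly:

enum foo {
    firstfoo,
    foo2,
    foo3,   /* trailing comma is legal in C99, not in C89 */
};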
 

Lawrence Kirby

Keith Thompson said:
An enum is a distinct type. An object declared to be of type "enum foo"
is of type "enum foo", and is not of whatever integer type "enum foo"
is compatible with.

Nevertheless compatible object types behave for all intents and
purposes as the same type.

Keith Thompson said:
If "enum foo" is compatible with "int", the
types "enum foo*" and "int *" are incompatible.

C99 6.7.5.1p2 says

"For two pointer types to be compatible, both shall be identically
qualified and both shall be pointers to compatible types."

So in this case enum foo * is compatible with int *.

Lawrence
 

Keith Thompson

Lawrence Kirby said:
C99 6.7.5.1p2 says

"For two pointer types to be compatible, both shall be identically
qualified and both shall be pointers to compatible types."

So in this case enum foo * is compatible with int *.

I didn't realize that, thanks.

(I was vaguely thinking of type "char", which is not compatible with
either "signed char" or "unsigned char", even though it has exactly
the same characteristics as one of them.)

Incidentally, this means that the following code fragment:

enum foo { zero, one, two };
enum foo *enum_ptr = NULL;
int *int_ptr = NULL;

int_ptr = enum_ptr;

is legal for some compilers and not for others. For some compilers,
the legality can be affected by command-line options, such as gcc's
"-fshort-enums". (The same thing shows up for typedefs like time_t
and size_t.) I'd be happier if it were illegal for all compilers.
(No, I'm not suggesting a change to the language.)
 

Peter Nilsson

Lawrence said:
Nevertheless compatible object types behave for all intents and
purposes as the same type.


C99 6.7.5.1p2 says

"For two pointer types to be compatible, both shall be identically
qualified and both shall be pointers to compatible types."

So in this case enum foo * is compatible with int *.

I don't see how that follows. 6.7.2.2p4 says...

"Each enumerated type shall be compatible with char, a signed
integer type, or an unsigned integer type. The choice of type is
implementation-defined, but shall be capable of representing the
values of all the members of the enumeration."

AFAICS, this grants an implementation the right to set the
size and compatibility type of an enum type arbitrarily on an
enum-by-enum basis.

[Although, all the compilers I've used (when invoked in conforming
mode) fix the type of enums as int. But I've never understood why.]
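
A sketch of the freedom that wording grants (hypothetical choices an
implementation could make, with invented names):

enum small { tiny = 1 };      /* could be made compatible with char */
enum big   { huge = 32767 };  /* needs a type that holds 32767,
                                 e.g. int */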
 

Keith Thompson

Peter Nilsson said:
I don't see how that follows. 6.7.2.2p4 says...

"Each enumerated type shall be compatible with char, a signed
integer type, or an unsigned integer type. The choice of type is
implementation-defined, but shall be capable of representing the
values of all the members of the enumeration."

AFAICS, this grants an implementation the right to set the
size and compatibility type of an enum type arbitrarily on an
enum by enum basis.

We were assuming that enum foo is compatible with int. Given that
assumption, it follows that enum foo* is compatible with int*.

Peter Nilsson said:
[Although, all the compilers I've used (when invoked in conforming
mode) fix the type of enums as int. But I've never understood why.]

gcc seems to use unsigned int (unless some of the literals are given
negative values).
 

Richard Bos

Peter Nilsson said:
AFAICS, this grants an implementation the right to set the
size and compatibility type of an enum type arbitrarily on an
enum by enum basis.

[Although, all the compilers I've used (when invoked in conforming
mode) fix the type of enums as int. But I've never understood why.]

I suspect because it's simpler, and the potential savings in space are
small. Hardly anybody uses millions of enum objects, so shrinking them
gains little; but if handling non-int enums made the code larger in
every program that uses an enum for a couple of tasks, that cost would
add up quite rapidly.

Richard
 

Rufus V. Smith

CBFalconer said:
or simply use a language, such as Pascal or Ada, where enumerations
are grown up types.


Which is bloody foolishness. A major advantage over #define is
that the name remains attached to the use of that particular
constant in symbolic debuggers. Another is that the list of
consecutive constants can easily be revised in one place.

I agree this is foolishness.

I prefer enums, but there is one place where they fail. If additional
enumerators are added to the list, they have to be added at the end:
if values have already been written out, for example to data files,
the stored numbers may no longer refer to the correct enumerator. The
other way is to store the value as, for example, a literal string,
which needs to be re-parsed into the correct enum value when reading.
Not pretty either.

Using the "= constant" form of enum keeps you safe, but you lose
one of the attractions of enum.
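
A sketch of that trade-off (names invented): pinning every value keeps
stored data stable even if a new enumerator is inserted later, at the
cost of auto-numbering.

enum record_type {
    REC_HEADER  = 1,
    REC_DATA    = 2,
    REC_FOOTER  = 3,
    REC_COMMENT = 4   /* added later; existing file data unchanged */
};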

Also, some debuggers, AFAIK, can pick up #define'd
constants.

Rufus
 
