enum question

James Brown

Hi,

I have the following enum declared:

enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING, /*lots more here*/ };


What I am trying to do is _also_ represent ASCII values 0-127 as TOKENs
(this is why I started the TOKEN enum off at '1000', so I had plenty of
space at the start)....and I don't really want to type out 127 values
into my enum declaration....

So this is what I would like to do in my code:

TOKEN t1 = TOK_ID;   // ok
TOKEN t2 = 5;        // compile error (cannot convert from const int to 'enum TOKEN')
TOKEN t3 = TOKEN(5); // compiles ok but maybe undefined behaviour??

int ch = 'X';
TOKEN t4 = TOKEN(ch); // compiles but I think it's illegal???

could someone clarify if the 3rd example is ok or not, and what type
of problem I might expect if it isn't ok? I read section 29.18 in the
FAQ, which says:

"But be sure your integer is a valid enumeration value. If you provide
an illegal value, you might end up with something other than what you
expect. The compiler doesn't do the check for you; you must do it
yourself."

Now I deliberately want to put an illegal value into an enum type...for
my particular scenario this seems the neatest way to do what I want.
When the FAQ says I might end up with something unexpected, does this
refer to undefined behaviour as described by the C++ standard, or does
it simply mean "be careful, there is no compile-time check for what you
are trying....but it is ok at runtime???"

If I converted my TOKEN value back to an integer, would the value
"roundtrip"??

TOKEN t5 = TOKEN(5);
int num = int(t5); /* would num always contain 5 on every C++ compiler??? */

can anybody clarify my confusion or suggest an alternate solution?

tia,
James
 

Alf P. Steinbach

* James Brown:
I have the following enum declared:

enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING, /*lots more here*/ };

Don't use all uppercase for non-macro names. Reserve that for macro
names, and always use all uppercase for macro names. That way you avoid
name collisions and nasty surprises where a macro "changes" your source
code (as with e.g. the lowercase min and max macros in the standard
Windows headers).

So this is what I would like to do in my code:

TOKEN t1 = TOK_ID;   // ok
TOKEN t2 = 5;        // compile error (cannot convert from const int to 'enum TOKEN')
Right.


TOKEN t3 = TOKEN(5); // compiles ok but maybe undefined behaviour??

It's OK.
int ch = 'X';
TOKEN t4 = TOKEN(ch); // compiles but I think it's illegal???

It's exactly the same as the previous example, just a different value.

When the FAQ says I might end up with something unexpected, does this
refer to undefined behaviour as described by the C++ standard or does
it simply mean "be careful there is no compile-time check for what you
are trying....but it is ok at runtime???"

The latter.

If I converted my TOKEN value back to an integer, would the value
"roundtrip"??

Yes, if the value is within the range of the integer type you convert to.

TOKEN t5 = TOKEN(5);
int num = int(t5); /* would num always contain 5 on every C++ compiler??? */

Yes.
 

Victor Bazarov

James said:
I have the following enum declared:

enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING, /*lots more here*/ };

What I am trying to do is _also_ represent ASCII values 0-127 as
TOKENs (this is why I started the TOKEN enum off at '1000' so I had
plenty of space at the start....and I don't really want to type out
127 values into my enum declaration....

So this is what I would like to do in my code:

TOKEN t1 = TOK_ID;   // ok
TOKEN t2 = 5;        // compile error (cannot convert from const int to 'enum TOKEN')
TOKEN t3 = TOKEN(5); // compiles ok but maybe undefined behaviour??

The behaviour of this one is well-defined. It would only be undefined
if the value you're casting can't fit into the underlying integral type.
Anything less than '1000' should fit, so don't worry.
int ch = 'X';
TOKEN t4 = TOKEN(ch); // compiles but I think it's illegal???

Why do you think it is illegal?
could someone clarify if the 3rd example is ok or not, and what type
of problem I might expect if it isn't ok? I read section 29.18 in the
FAQ, which says:

"But be sure your integer is a valid enumeration value. If you provide
an illegal value, you might end up with something other than what you
expect. The compiler doesn't do the check for you; you must do it
yourself."

The point here is that if you have

enum blah { FOO = 1, BAR = 5, FOOBAR = 15 };
blah value = blah(77);

there are no enumerators in 'blah' corresponding to the value 77. If
some of your code relies on 'value' being only one of the enumerators,
it will not work. That's where the advice to check it yourself comes
from.
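The check described above can be made explicit; a minimal sketch, using the same 'blah' enum (the helper name is mine, not part of the original post):

```cpp
enum blah { FOO = 1, BAR = 5, FOOBAR = 15 };

// Returns true only when v matches one of the named enumerators,
// so a caller can reject a value like 77 before converting.
bool is_valid_blah(int v)
{
    switch (v) {
    case FOO:
    case BAR:
    case FOOBAR:
        return true;
    default:
        return false;
    }
}
```

With this guard, is_valid_blah(5) is true and is_valid_blah(77) is false, which is exactly the check the FAQ says the compiler will not do for you.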
Now I deliberately want to put an illegal value into an enum type...

There is no such thing as an "illegal value" as far as enums are
concerned. It is only "illegal" in terms of your problem domain. Just
so that we are clear about terminology...
for my particular scenario this seems the neatest way to do what I
want. When the FAQ says I might end up with something unexpected, does
this refer to undefined behaviour as described by the C++ standard or
does it simply mean "be careful there is no compile-time check for
what you are trying....but it is ok at runtime???"

Again, as long as the value can be represented in the underlying
integral type, the behaviour is well defined.
If I converted my TOKEN value back to an integer, would the value
"roundtrip"??

It will "survive", if that's what you're asking.
TOKEN t5 = TOKEN(5);
int num = int(t5); /* would num always contain 5 on every C++ compiler??? */

Yes.

can anybody clarify my confusion or suggest an alternate solution?

Your solution is fine. Think of the code maintenance, however. What
if somebody later adds an enumerator with the value '5' to the enum
declaration/definition? Suddenly, in your terms, '5' becomes "legal".
Perhaps you should think of a different approach...

V
 

Jonathan Mcdougall

James said:
Hi,

I have the following enum declared:

enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING, /*lots more here*/ };

To quote TCPPPL3:

"enum e1 { dark, light }; // range 0:1

[...]

The sizeof an enumeration is the sizeof some integral type that can
hold its range and not larger than sizeof(int), unless an enumerator
cannot be represented as an int or as an unsigned int. For example,
sizeof(e1) could be 1 or maybe 4 but not 8 on a machine where
sizeof(int)==4."

So the exact size of TOKEN depends on the implementation, but you can
safely assume it can accommodate any value smaller than the largest
enumerator. In your example, without knowing what "lots more here" is,
every value between 0 and 1002 is guaranteed to be valid (enums may be
unsigned).
What I am trying to do is _also_ represent ASCII values 0-127 as
TOKENs (this is why I started the TOKEN enum off at '1000' so I had
plenty of space at the start....and I don't really want to type out
127 values into my enum declaration....

Values 0-127 are valid within this enumeration.
So this is what I would like to do in my code:

TOKEN t1 = TOK_ID;   // ok
TOKEN t2 = 5;        // compile error (cannot convert from const int to 'enum TOKEN')
Yep.

TOKEN t3 = TOKEN(5); // compiles ok but maybe undefined behaviour??

No, this is well defined.
int ch = 'X';
TOKEN t4 = TOKEN(ch); // compiles but I think it's illegal???

The value of 'X' is implementation-defined. C++ can run on character
sets other than ASCII, where 'X' may be any value. However, 'X' is a
char. Whether it is signed or unsigned is again implementation-defined.
The standard says sizeof(char) == 1 byte, but a byte can have any
number of bits, as long as it is at least 8.

The conclusion is, 'X' can have any value. For example, a machine may
have a 32-bit char and use a character set in which 'X' is 2000. Your
example would be undefined behaviour in this case.

However, if you are sure your program will always run on ASCII
machines, you may assume that 'X' will always be < 256, which fits
your design. The problem is, this is not portable.
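That portability caveat can be turned into a runtime guard; a sketch (the helper is mine, not from the thread):

```cpp
// The OP's enum (abbreviated).
enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING };

// Convert ch to a TOKEN only when it falls in the 0-127 slot the design
// reserves for characters; refuse anything else (e.g. a character that
// maps to a large value on a non-ASCII machine).
bool make_char_token(int ch, TOKEN& out)
{
    if (ch < 0 || ch > 127)
        return false;   // outside the reserved character slot
    out = TOKEN(ch);
    return true;
}
```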
If I converted my TOKEN value back to an integer, would the value
"roundtrip"??

TOKEN t5 = TOKEN(5);
int num = int(t5); /* would num always contain 5 on every C++ compiler??? */

Yes.


Jonathan
 

James Brown

Alf P. Steinbach said:
* James Brown:

Don't use all uppercase for non-macro names. Reserve that for macro
names, and always use all uppercase for macro names. That way you
avoid name collisions and nasty surprises where a macro "changes" your
source code (as with e.g. the lowercase min and max macros in the
standard Windows headers).




Brilliant, that answered my question perfectly, thanks!

-and I will change my enum to lower-case as you suggest :)

James
 

James Brown

Victor Bazarov said:
[...]

Your solution is fine. Think of the code maintenance, however. What
if somebody later adds an enumerator with the value '5' to the enum
declaration/definition? Suddenly, in your terms, '5' becomes "legal".
Perhaps you should think of a different approach...

thanks for answering - it was the terminology that was confusing me in
the FAQ, I think - but you've helped me understand the potential issues
with casting from enum -> integers anyway :)

The primary reason I am using an enum is that my debugger shows the
enum-names as I am debugging......maybe I should use "const int"
instead - but although my solution may seem a little strange, it is
actually very neat for me to do it this way.

thanks,
James
 

Greg

James said:
Hi,

I have the following enum declared:

enum TOKEN { TOK_ID = 1000, TOK_NUMBER, TOK_STRING, /*lots more here*/ };

By convention, only the names of macros are all in uppercase. So I
would suggest "Token" for the enum name, and any of number_token,
numberToken, eNumberToken, kNumberToken for the enumerators. Generally
it's better for the term in common to appear at the end of the name.
Otherwise, by appearing at the beginning, the common term makes it
harder to tell the names apart.
What I am trying to do is _also_ represent ASCII values 0-127 as
TOKENs (this is why I started the TOKEN enum off at '1000' so I had
plenty of space at the start....and I don't really want to type out
127 values into my enum declaration....

So this is what I would like to do in my code:

TOKEN t1 = TOK_ID;   // ok
TOKEN t2 = 5;        // compile error (cannot convert from const int to 'enum TOKEN')
TOKEN t3 = TOKEN(5); // compiles ok but maybe undefined behaviour??

If no enumerator corresponds to the value 5, then technically the enum
t3's value is "unspecified" after the conversion. So the good news is
that the conversion will not crash the program; the bad news is that
the enum may not "roundtrip" back to 5. Though I would be very
surprised to find a compiler that didn't roundtrip the value. This is
probably a case of the Standard cutting compiler writers some slack;
it likes to do so when it can, to make up for all the other
complicated requirements that the Standard imposes on them.
int ch = 'X';
TOKEN t4 = TOKEN(ch); // compiles but I think it's illegal???

No, it is not illegal, but it's not good programming form either, at
least in my view. Enums are meant to represent varieties of a type,
not a range of values. A program that converts between enums and
integers is probably a program that should be using some other type.
could someone clarify if the 3rd example is ok or not, and what type
of problem I might expect if it isn't ok? I read section 29.18 in the
FAQ, which says:

"But be sure your integer is a valid enumeration value. If you provide
an illegal value, you might end up with something other than what you
expect. The compiler doesn't do the check for you; you must do it
yourself."

Now I deliberately want to put an illegal value into an enum type...for
my particular scenario this seems the neatest way to do what I want.
When the FAQ says I might end up with something unexpected, does this
refer to undefined behaviour as described by the C++ standard or does
it simply mean "be careful there is no compile-time check for what you
are trying....but it is ok at runtime???"

If I converted my TOKEN value back to an integer, would the value
"roundtrip"??

TOKEN t5 = TOKEN(5);
int num = int(t5); /* would num always contain 5 on every C++ compiler??? */

can anybody clarify my confusion or suggest an alternate solution?

In summary, the value may not be preserved, but in practice, I would
bet that it would be. I still would not write code that used enums in
this way, but only because I don't see the point of assigning an enum
an integer value that is not part of the enumeration. Wouldn't it be
more straightforward to use a typedef:

typedef int Token;

and then declare const ints for known token types:

const int kNumberToken = 1001;
const int kStringToken = 1002;

The program would then be able to assign 0-127 to a Token type without
creating unenumerated enums. Granted the typedef doesn't provide much
type safety, but storing arbitrary int values in an enum violates type
safety itself, so the enum technique cannot be considered any better,
and is certainly not any safer than the typedef approach.

Furthermore writing a Token class would provide the type safety of the
enum and the flexibility of the typedef, with a little more work:

class Token
{
private:
    int value;
public:
    explicit Token(int e) : value(e)
    {
    }
    ...
};

and then:

const Token kNumberToken(1001);
const Token kStringToken(1002);

Greg
 
