PEP 354: Enumerations in Python

Tim Chase

Paul said:
Do you anticipate having parameters like socket.AF_INET that are
currently integers, become enumeration members in future releases?

greg said:
Since these are derived from values defined as integers
in C, it's probably better to leave them that way. There
may be code that relies on them being integers or having
those integer values.

I'd say that "because something is done in C, it's the best
way to do it in Python" is a bad line of reasoning :) If I
wanted C, I'd use C. ("some C API functions take pointers
to arbitrary blocks of memory...python should [natively]
implement pointers to arbitrary blocks of memory..." kinda
scares me to be frank...that's why there's the ability to
write clean wrappers in C or C++ to expose such things)

The backwards-compatibility-with-existing-code is a much
better reason :)

-tkc
 
Terry Reedy

Ben Finney said:
Should I amend the PEP to propose "either in the builtins or in the
collections module"?

Yes, if the idea is accepted, Guido and devs will debate and decide that
anyway ;-)

Or should I propose two PEPs and let them compete?

No.

But the terminology is broken. The term "enumerated" seems to me to
imply that it does have an order. Can you suggest a term other than
"enumerated" for what you're describing with the unordered property?

The re module's flags are not ordered. Their proneness to bugs when passed
to the wrong functions is a point in favour of your proposal.

tjr
 
Steven Bethard

Ben said:
This seems to be a common distinction.

Should I amend the PEP to propose "either in the builtins or in the
collections module"? Or should I propose two PEPs and let them
compete?

I would probably just amend the PEP. I have a feeling that python-dev
is even less likely to accept it as a builtin than python-list is.
Replies to your post indicate this is another popular distinction.

But the terminology is broken. The term "enumerated" seems to me to
imply that it does have an order.

I didn't get that implication. From WordNet 2.0:

"""
enumerate

v 1: specify individually; "She enumerated the many obstacles she had
encountered"; "The doctor recited the list of possible side effects of
the drug" [syn: recite, itemize, itemise] 2: determine the number or
amount of; "Can you count the books on your shelf?"; "Count your change"
[syn: count, number, numerate]
"""

I don't get an ordering out of either of the definitions above. But
certainly there are a few precedents (e.g. Java's Enumeration interface)...

Can you suggest a term other than
"enumerated" for what you're describing with the unordered property?

I don't have any good names if people think that enumeration implies
ordering. Off of thesaurus.reference.com:

* numbering
* inventory
* lexicon
* catalogue

Those were the best I saw, and they're pretty bad. I guess you could go
with ``symbols`` maybe...

STeVe
 
greg

Steven said:
You can't shell an egg that isn't there.

Yesterday upon the stair
I shelled an egg that wasn't there.
I'd shell the thing again today
If only I could find a way.
 
greg

Ben said:
On the assumption these might be basic types, though, that
name doesn't read so easily in lowercase ('enummember').

Maybe 'enumval'?

I also thought of 'enumber' (from munging together
'enum' and 'member') but that looks too much like
'e-number' rather than 'enum-ber'.
 
greg

Paul said:
Do you anticipate having parameters like socket.AF_INET that are
currently integers, become enumeration members in future releases?

Since these are derived from values defined
as integers in C, it's probably better to leave
them that way. There may be code that relies
on them being integers or having those integer
values.
 
greg

Giovanni said:
What's the repr of an enumeration value? OTOH, it should be something like
"Weekdays.wed", so that eval(repr()) holds true. Also, it'd be very useful in
debug dumps, tracebacks and whatnot.

That would be nice, but I don't think that's possible
with what the PEP proposes, because in

Weekdays = enum('mon', 'tue', etc...)

there's no way for the enum object to know that it's
meant to be called 'Weekdays'.

A constructor argument could be added for this, but
then you end up having to write the name twice,
making the construct far less elegant.

Maybe *this* is a good argument for making the enum
object a class?

Or maybe it's an argument for allowing decorators
to operate on things other than functions, so you
could write something like

@enum
Weekdays = ('mon', 'tue', etc...)
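
To make the constructor-argument option concrete, here is a minimal sketch. The names NamedEnum and EnumValue are hypothetical and are not the PEP's API; the only point is that repr() can round-trip through eval() once the enumeration has been told its own name, at the cost of writing that name twice.

```python
class EnumValue:
    """One member of an enumeration (hypothetical class, not from the PEP)."""

    def __init__(self, enum_name, key, index):
        self.enum_name = enum_name
        self.key = key
        self.index = index

    def __repr__(self):
        # Only possible because the owner's name was passed in explicitly.
        return "%s.%s" % (self.enum_name, self.key)


class NamedEnum:
    """Enumeration that is told its own name as the first argument."""

    def __init__(self, name, *keys):
        self.name = name
        for index, key in enumerate(keys):
            setattr(self, key, EnumValue(name, key, index))


# The name has to be written twice: once as the variable, once as a string.
Weekdays = NamedEnum('Weekdays', 'mon', 'tue', 'wed')

# eval(repr(x)) gives back the same object when 'Weekdays' is in scope.
round_tripped = eval(repr(Weekdays.wed), {'Weekdays': Weekdays})
```

If the variable were later rebound under a different name, the repr would no longer evaluate correctly, which is another weakness of passing the name as a string.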
 
greg

Dan said:
In some parts of the world, calendar weeks begin on Monday
and end on Sunday, and in other parts of the world, work weeks begin on
Sunday and end on Thursday.

Things like days of the week really have a circular
ordering, so it doesn't inherently make sense to
ask whether one day of the week is less than or
greater than another.

Maybe there should be a circular_enum type, where
order comparisons even among the *same* type are
disallowed?

Another thought -- should enum values have pred()
and succ() methods, like in Pascal? If so, for
a circular_enum these should wrap around.
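
A rough sketch of what such a type might look like, under the assumption that values are plain objects holding an index into their owner. CircularEnum and CircularValue are invented names for illustration only; nothing like them is specified in the PEP.

```python
class CircularValue:
    """Member of a circular enumeration (hypothetical, for illustration)."""

    def __init__(self, owner, key, index):
        self.owner = owner
        self.key = key
        self.index = index

    def __repr__(self):
        return "<CircularValue %s>" % self.key

    def succ(self):
        # Wrap around from the last member to the first.
        members = self.owner.members
        return members[(self.index + 1) % len(members)]

    def pred(self):
        # Wrap around from the first member to the last.
        members = self.owner.members
        return members[(self.index - 1) % len(members)]

    def _no_ordering(self, other):
        # Order comparisons are disallowed even within the same enumeration.
        raise TypeError("circular enumeration values have no ordering")

    __lt__ = __le__ = __gt__ = __ge__ = _no_ordering


class CircularEnum:
    """Enumeration whose members form a cycle rather than a line."""

    def __init__(self, *keys):
        self.members = tuple(
            CircularValue(self, key, index) for index, key in enumerate(keys)
        )
        for member in self.members:
            setattr(self, member.key, member)


Weekdays = CircularEnum('mon', 'tue', 'wed', 'thu', 'fri', 'sat', 'sun')
day_after_sunday = Weekdays.sun.succ()
day_before_monday = Weekdays.mon.pred()
```

Equality is left at the default identity comparison here, so values can still be tested with == and used as dictionary keys.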
 
Roy Smith

greg said:
Since these are derived from values defined
as integers in C, it's probably better to leave
them that way. There may be code that relies
on them being integers or having those integer
values.

On a thin API like python's socket module, adding anything which isn't
there in the lower level is a mistake (and making AF_INET an enum member
would be adding something which isn't there).

I just finished adding IPv6 support to a product that didn't have it
before. We've got a "platform independent" socket interface which treats
the address family as opaque data. It turns out, I had to make ZERO
changes to this shim layer. Had the layer known more about address
families, I would have had a lot more work to do.

Consider, for example, Python running on a system with experimental
AF_INET8 support at some point in the future. As it stands now, the Python
library code doesn't need to know there's a new address family if all you
want to do is open raw AF_INET8 sockets.
 
Roy Smith

Tim Chase said:
I'd say that "because something is done in C, it's the best
way to do it in Python" is a bad line of reasoning :) If I
wanted C, I'd use C.

The problem is that the C language binding in this case is simply
exposing the even lower level operating system interface. At that
level, the address family is indeed just an arbitrary integer.

The socket man page on (for example) Solaris-9 says things like, "The
currently understood formats are", and "If a protocol is specified by
the caller, then it will be packaged into a socket level option
request and sent to the underlying protocol layers". You don't
really know for sure if you used a valid value until the low-level
protocol drivers look at the number you passed in. This doesn't sound
like an enum to me.

There are plenty of places where we pass around integers that would be
better served by enums. Address families and protocol numbers in the
socket interface just aren't among them.
 
Raymond Hettinger

[Ben Finney]
It is possible to simply define a sequence of values of some other
basic type, such as ``int`` or ``str``, to represent discrete
arbitrary values. However, an enumeration ensures that such values
are distinct from any others, and that operations without meaning
("Wednesday times two") are not defined for these values.

It would be useful for the PEP to add a section that discussed the pros
and cons of this approach (preferably with examples).

For instance, having values distinct from one another is only useful in
the absence of namespace qualifiers (such as Weekdays.fri).

Also, it would be useful to contrast this approach with that used for
Booleans which were implemented as an int subclass. There, the
interoperability with other numbers turned out to be useful on
occasion:

Q = lambda predicate, iterable: sum(predicate(val) for val in iterable)

Likewise, the PEP's approach precludes a broad class of use cases such
as:

Weekday.fri - Weekday.wed == 2

(where Weekday.fri > Weekday.wed implies that the difference has a
positive value).

If enumerations were implemented as an int subclass, they could serve
as a base class for booleans and enter the language in a unified way.
Likewise, they could be more readily applicable in use cases like
calendar.py which relies on math ops being defined for the days of the
week.
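
For comparison, the int-subclass design can be tried out with the IntEnum class from the enum module that later entered Python's standard library. This is not what PEP 354 proposes; the Weekday class and its values below are assumptions chosen for illustration. The sketch shows both the use cases gained and the "Wednesday times two" operation that the PEP wanted to rule out.

```python
from enum import IntEnum


class Weekday(IntEnum):
    # Illustrative values; any consecutive integers would do.
    mon = 0
    tue = 1
    wed = 2
    thu = 3
    fri = 4
    sat = 5
    sun = 6


# Arithmetic and ordering work because members are also ints.
difference = Weekday.fri - Weekday.wed
is_later = Weekday.fri > Weekday.wed

# The cost: meaningless operations are accepted as well.
wednesday_times_two = Weekday.wed * 2


def count_matching(predicate, iterable):
    """Count items satisfying predicate, relying on bool being an int."""
    return sum(predicate(val) for val in iterable)


even_count = count_matching(lambda n: n % 2 == 0, range(10))
```

The results of the arithmetic are plain integers rather than Weekday members, so a value that has been through a calculation has to be converted back explicitly, for example with Weekday(2).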

I would like to see the PEP develop these arguments more fully so that
the correct approach will be self evident based on the merits.


Raymond
 
Peter Maas

Paul said:
Do you have a good usage case for the number
647574296340241173406516439806634217274336603815968998799147348150763731 ?

Yes, it could be the value of my property in any currency :) But your
argument is wrong. There's no logical or even aesthetic reason to
have empty enums. An empty set is an instance of the set type like 0
or your big number are instances of the integer type and are necessary
to make some operations complete. But an empty enum is a *type* without
values, therefore a type that cannot be instantiated i.e. a useless
type.

I don't like Python enums at all. It is kind of "I want to have that
C++ thing, too". In Python enums can be emulated so there's no need
to have syntactical support for them.

Peter Maas, Aachen
 
Terry Hancock

This seems to be a common distinction.

Should I amend the PEP to propose "either in the builtins
or in the collections module"? Or should I propose two
PEPs and let them compete?

My recommendation is to drop the "builtins" suggestion
altogether. If it goes into the stdlib and becomes popular,
write a PEP suggesting it move to builtins *then*. That
would follow the example of "sets".

I for one think that builtins is already too large (though
I admit I'm unsure what should be removed). Enumerations
are used infrequently enough that they should be a module
(e.g. "decimal" numbers are probably more common).

Replies to your post indicate this is another popular
distinction.

But the terminology is broken. The term "enumerated" seems
to me to imply that it does have an order. Can you suggest
a term other than "enumerated" for what you're describing
with the unordered property?

Well, I personally don't find the term that broken, but as
I say, the term "Vocabulary" has been used. There is "Open
Vocabulary" and "Closed Vocabulary" to define the mutable
and immutable cases respectively. But they're long names.

Plus, you need a name for the individual values. So there
are a number of possibilities:

EnumValue / Enum
Symbol / SymbolSet
Word / Vocabulary
Symbol / Vocabulary (what I used for "open" case)
Symbol / Enum (what I used for "closed" case)
Word / WordSet
Label / LabelSet

I'm sure there are others.

Word has the disadvantage of also meaning "32-bit integer"
to a lot of CS people. Or perhaps a string. Not
immediately an indivisible symbol.

Symbol probably has a dozen overloaded meanings, though
I'm not sure what other people will think when they read it
(it actually seems right to me, obviously).

Label was what I first called these. But I realize that
label describes a probable use of a symbol, not the symbol
itself.
 
Terry Hancock

Most people seem to be unopposed or in favour of some kind
of enumeration mechanism making it at least as far as the
standard library.

.... but not to built-ins. That seems about right to me.
As I understand it, the current outstanding debates are::

* Builtin types versus stdlib module (maybe
'collections')

* Ordered sequences versus unordered iterables

* Immutable versus mutable

I suggest that both are called for, but would have different
names -- the Immutable is the actual "Enum", the mutable is
probably a "Vocabulary" or something else.

* Bad comparisons: raise exception versus return
NotImplemented

It should raise an error, because client code should use
enumerated values if enumerated values are spec'd in the
API.

* Terminology for each of these concepts

+ Tracing of individual "EnumValues" (or "symbols"?) to
their enum: I should be able to interrogate a value to
find what enum it comes from, in addition to being able
to interrogate an enum to find out what values belong
to it.

Which is more computationally efficient will depend
on the application, and a single application might
do better to use each for different tasks, so I think
it should be reversible.

+ How about documentation of enumerated values? (Where
does the enum's __doc__ go?). One of the main values
of using enumerated values is as an aid to documentation,
but "WED" is still vague. Could be "Wednesday", could
be the "Western Education District" or short for
"Wedding". Enumerations are most frequently used in
module APIs, so they are important to document.

Obviously, the point is so that documentation tools like
epydoc can capture the enumeration documentation.
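
Two of the points above, the reversible value-to-enum lookup and the handling of bad comparisons, can be sketched together. The Enum and EnumValue classes and the enumtype attribute are hypothetical names for illustration, not the PEP's specification. Note that in Python 3, returning NotImplemented from both operands of an ordering comparison ends in a TypeError anyway, so the two options in the debate differ less there than they did when this thread was written.

```python
class EnumValue:
    """Member that keeps a back-reference to its enumeration (hypothetical)."""

    def __init__(self, enumtype, key, index):
        self.enumtype = enumtype  # lets a value be traced back to its enum
        self.key = key
        self.index = index

    def __repr__(self):
        return "<EnumValue %s>" % self.key

    def __lt__(self, other):
        if isinstance(other, EnumValue) and other.enumtype is self.enumtype:
            return self.index < other.index
        # Decline the comparison; Python 3 then raises TypeError itself.
        return NotImplemented


class Enum:
    """Immutable-by-convention collection of EnumValue objects."""

    def __init__(self, *keys):
        self._values = tuple(
            EnumValue(self, key, index) for index, key in enumerate(keys)
        )
        for value in self._values:
            setattr(self, value.key, value)

    def __iter__(self):
        return iter(self._values)

    def __contains__(self, value):
        return any(value is member for member in self._values)


Weekdays = Enum('mon', 'tue', 'wed')
Colours = Enum('red', 'green')

owner = Weekdays.wed.enumtype               # value -> enum
members = [value.key for value in Weekdays]  # enum -> values
```

Comparing Weekdays.mon < Colours.red, or Weekdays.mon < 3, fails with a TypeError in this sketch, while Weekdays.mon < Weekdays.wed is allowed.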
 
Roy Smith

Should I amend the PEP to propose "either in the builtins
or in the collections module"?

I see the issue of whether it's a built-in or part of the standard library
as being a detail. My personal opinion is that they're important enough to
be built in, but having to write "import enum" (or whatever) won't kill me.
If making it a builtin becomes a point of contention, drop it and
concentrate on the more key issues of how enums will behave.
 
Ben Finney

Paul Rubin said:
I don't know about this. It makes athlon64_instructions a
completely separate enum from pentium_instructions. It could be
that athlon64_instructions.add should be the same as
pentium_instructions.add .

If you want the members of two distinct collections to have a
meaningful relationship, you don't want enumerations. The basic
promise of the specified enumeration interface is that every member of
an enumeration is a unique value.
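
The uniqueness promise can be illustrated with the Enum class from the enum module that later entered the standard library, which behaves the same way on this point. It is not the PEP 354 interface, and the class names and members other than add are made up for the example.

```python
from enum import Enum


class PentiumInstructions(Enum):
    add = 1
    sub = 2


class Athlon64Instructions(Enum):
    add = 1
    sub = 2
    syscall = 3


# Same member name and same underlying value, yet the members are distinct.
same = PentiumInstructions.add == Athlon64Instructions.add

# Any relationship between the two has to be spelled out, e.g. by name.
counterpart = Athlon64Instructions[PentiumInstructions.add.name]
```

Here same is False, and the lookup by name is one explicit way to relate the two collections when one is meant to be a superset of the other.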
 
Paul Rubin

Ben Finney said:
If you want the members of two distinct collections to have a
meaningful relationship, you don't want enumerations. The basic
promise of the specified enumeration interface is that every member of
an enumeration is a unique value.

The premise is that they're not necessarily distinct (disjoint)
collections; one of them has been explicitly created as a superset of
the other.
 
Christos Georgiou

This seems great, except why can't I compare strings? It seems too
useful when dealing with user input, or parsing messages or config
files.

some_value = Weekdays.thu
....
user_input = raw_input("Enter day name")
if user_input == str(some_value):

Additionally, perhaps the call method of the enumeration object should
construct a value from strings?

Either way works for me.
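
Both suggestions can be sketched in a few lines. This assumes a hypothetical Enum class whose values render as their key under str() and whose call method looks a value up by string; neither behaviour is taken from the PEP text, and the user input is a literal here rather than a prompt.

```python
class EnumValue:
    """Enumeration member whose str() is its key (hypothetical)."""

    def __init__(self, key, index):
        self.key = key
        self.index = index

    def __str__(self):
        return self.key


class Enum:
    """Enumeration that can be called with a string to get a member."""

    def __init__(self, *keys):
        self._by_key = {}
        for index, key in enumerate(keys):
            value = EnumValue(key, index)
            self._by_key[key] = value
            setattr(self, key, value)

    def __call__(self, key):
        try:
            return self._by_key[key]
        except KeyError:
            raise ValueError("%r is not a member of this enumeration" % (key,)) from None


Weekdays = Enum('mon', 'tue', 'wed', 'thu', 'fri')
some_value = Weekdays.thu
user_input = 'thu'  # stands in for text typed by the user

# Option 1: compare the string form of the value with the input.
matches_by_str = user_input == str(some_value)

# Option 2: build a value from the input, then compare values.
matches_by_call = Weekdays(user_input) is some_value
```

The second option has the side benefit of rejecting unknown input early, since Weekdays('someday') raises ValueError instead of silently comparing unequal.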
 
