Double underscore names

S

Steven D'Aprano

Double-underscore names and methods are special to Python. Developers are
prohibited from creating their own (although the language doesn't enforce
that prohibition). From PEP 0008, written by Guido himself:

__double_leading_and_trailing_underscore__: "magic" objects or
attributes that live in user-controlled namespaces. E.g. __init__,
__import__ or __file__. Never invent such names; only use them
as documented.

http://www.python.org/dev/peps/pep-0008/



But then there are modules like doctest, which uses the special double-
underscore name __test__.

There are times where I would like to create my own protocol, like the
early sequence protocol: if an object has a __getitem__ method, it can be
used with for loops. Python calls obj.__getitem__(i) with i starting at 0
and increasing by one each time until it gets an IndexError exception.

This sequence protocol has been made partly obsolete by the iterator
protocol, but you see the point: in the spirit of duck typing, having the
ability to create a protocol is a Very Good Thing, and double leading and
trailing underscore names are the accepted Python style for such special
methods.

But the style guide says not to do that.

So I find myself conflicted. I'm aware that the style guide also says,
know when to break the rules, but the examples given don't seem to apply
to this case. The prohibition against inventing new double-underscore
names like __parrot__ seems much stronger than the other style guides.

So what do folks think? I believe the protocol idiom ("look for a method
called __parrot__ and then do something with it") is too useful and
powerful to be ignored, but then if __parrot__ is reserved by Python,
what to do?
 
B

Ben Finney

Steven D'Aprano said:
having the ability to create a protocol is a Very Good Thing, and
double leading and trailing underscore names are the accepted Python
style for such special methods.

Is it? There are many protocols that use plain names. Even the
built-in types support many "protocol" operations via plain-named
attributes.

dict.get, dict.items, dict.keys, dict.values

list.insert, list.append, list.extend, list.remove

file.read, file.write, file.close

The double-underscore convention seems more for attributes *that are
interpreted specially*, e.g. by syntax operators or other core
language features.

Go ahead and implement your protocol using attributes with plain
names. What makes it a protocol is that it's likely to be generally
applicable, and the names are chosen to apply to other classes equally
well.
 
C

Christian Heimes

Steven said:
So what do folks think? I believe the protocol idiom ("look for a method
called __parrot__ and then do something with it") is too useful and
powerful to be ignored, but then if __parrot__ is reserved by Python,
what to do?

The Python core claims all rights for __magic__ methods with a leading
and trailing double underscore. Of course Python won't stop you from
introducing your own magic names but you are on your own.

Future versions of Python may introduce a magic hook with the same name.
Be warned and don't complain ;)

Christian
 
S

Steven Bethard

Steven said:
Double-underscore names and methods are special to Python. Developers are
prohibited from creating their own (although the language doesn't enforce
that prohibition). From PEP 0008, written by Guido himself:

__double_leading_and_trailing_underscore__: "magic" objects or
attributes that live in user-controlled namespaces. E.g. __init__,
__import__ or __file__. Never invent such names; only use them
as documented.

http://www.python.org/dev/peps/pep-0008/

But then there are modules like doctest, which uses the special double-
underscore name __test__.

There are times where I would like to create my own protocol, like the
early sequence protocol: if an object has a __getitem__ method, it can be
used with for loops. Python calls obj.__getitem__(i) with i starting at 0
and increasing by one each time until it gets an IndexError exception.

This sequence protocol has been made partly obsolete by the iterator
protocol, but you see the point: in the spirit of duck typing, having the
ability to create a protocol is a Very Good Thing, and double leading and
trailing underscore names are the accepted Python style for such special
methods.

But the style guide says not to do that.

So I find myself conflicted. I'm aware that the style guide also says,
know when to break the rules, but the examples given don't seem to apply
to this case. The prohibition against inventing new double-underscore
names like __parrot__ seems much stronger than the other style guides.

So what do folks think? I believe the protocol idiom ("look for a method
called __parrot__ and then do something with it") is too useful and
powerful to be ignored, but then if __parrot__ is reserved by Python,
what to do?

I've come to believe that in most cases when you want to establish a new
protocol, you'd be better off defining a generic function and
overloading it rather than using the __magic__ attribute approach. See,
for example, the simplegeneric module:

http://pypi.python.org/pypi/simplegeneric/

Generic functions have the advantage that when one of your users
discovers someone else's class that is missing support for your
protocol, then can add it to that class without having to edit the code
of the class (e.g. so that they can add a __magic__ attribute). Instead,
they can just register an appropriate overload for that class.

STeVe
 
E

Erik Max Francis

Ben said:
The double-underscore convention seems more for attributes *that are
interpreted specially*, e.g. by syntax operators or other core
language features.

I would qualify that by adding that it's for attributes that are treated
specially _and when you don't want to overload other names_, especially
when you wouldn't just call the thing normally in the course of dealing
with the object's interface. i.e., a[x] --> a.__getitem__(x), but we
don't want to prevent you from (perhaps blissfully unaware) defining
your own getitem method.
Go ahead and implement your protocol using attributes with plain
names. What makes it a protocol is that it's likely to be generally
applicable, and the names are chosen to apply to other classes equally
well.

Agreed.
 
S

Steven D'Aprano

Is it? There are many protocols that use plain names. Even the built-in
types support many "protocol" operations via plain-named attributes.

dict.get, dict.items, dict.keys, dict.values

list.insert, list.append, list.extend, list.remove

I don't believe any of those define a protocols. They are merely methods.
An example of a protocol would be dict.__getitem__(): you can retrieve
items from any object using the square bracket syntax if it has a
__getitem__ method.

However, having said that, I find myself in the curious position of
(reluctantly) agreeing with your conclusion to "Go ahead and implement
your protocol using attributes with plain names" even though I disagree
with virtually every part of your reasoning. In my mind, the two deciding
factors are:

(1) Guido's original style guide essay allowed developers to use double-
underscore names under certain circumstances. It seems that this has been
deliberately dropped when it became a PEP.

http://www.python.org/doc/essays/styleguide.html#names

(2) PEP 3114 "Renaming iterator.next() to iterator.__next__()" makes it
explicit that names of the form __parrot__ are intended to be "a separate
namespace for names that are part of the Python language definition".

Given that, I think that the special name __test__ in the doctest module
is possibly an anomaly: it's not part of the language, but it seems to
have the BDFL's blessing, or at least the BDFL didn't notice it. It's
hardly the only anomaly in the standard library though, so one shouldn't
draw too many conclusions from it. I think the intention is clear: names
like __parrot__ are strictly verboten for user code.

If and when my modest little programming efforts are considered for
inclusion in the standard library, I will rethink this issue. In the
meantime, no more __parrots__.

You can stop reading now. But for those who care, I have a few final
comments.


file.read, file.write, file.close

I would accept that these three are part of the "file object protocol",
However, they seem to be the odd one out: every other Python protocol
that I can think of uses the __parrot__ form.

iterator.next() is the exception, but that's explicitly been slated for
change to __next__() in Python 3.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,756
Messages
2,569,534
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top