Method Underscores?


Chris S.

Is there a purpose for using trailing and leading double underscores for
built-in method names? My impression was that underscores are supposed
to imply some sort of pseudo-privatization, but would using
myclass.len() instead of myclass.__len__() really cause Python
considerable harm? As much as I adore Python, I have to admit, I find
this to be one of the language's most "unPythonic" features and a key
arguing point against Python. I've searched for a discussion on this
topic in the groups archives, but found little. What are everyone's
thoughts on this subject?
 

Josiah Carlson

Chris S. said:
Is there a purpose for using trailing and leading double underscores for
built-in method names? My impression was that underscores are supposed
to imply some sort of pseudo-privatization, but would using
myclass.len() instead of myclass.__len__() really cause Python
considerable harm? As much as I adore Python, I have to admit, I find
this to be one of the language's most "unPythonic" features and a key
arguing point against Python. I've searched for a discussion on this
topic in the groups archives, but found little. What are everyone's
thoughts on this subject?


Double underscore methods are considered "magic" methods. The
underscores are a hint that they may do something different. Kind of
like the C++ friend operators.

In terms of .len() vs .__len__(), it is not supposed to be called
directly by user code; __len__() is called indirectly by the len()
builtin (and similarly for the other __<op>__() methods, check common
spellings in the operator module).

class foo:
    def __len__(self):
        return 4

a = foo()
len(a)  # like this
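
As a side note, a quick sketch of what that operator-module remark means (just illustrative, not part of the original post): the operator module gives the functional spellings that correspond to the special methods and the syntax sugar.

import operator

assert operator.add(2, 3) == 2 + 3            # functional spelling of '+'
assert operator.getitem([10, 20], 1) == 20    # functional spelling of x[1]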


- Josiah
 

Alex Martelli

Chris S. said:
Is there a purpose for using trailing and leading double underscores for
built-in method names?

They indicate that a method is special (not 'built-in'). One that
causes Python to call it implicitly under certain circumstances.

So, for example, a class which happened to define method iter would not
start behaving strangely when 'iter' acquired a special meaning in some
future version of the language: the special meaning if it comes will be
instead put on __iter__ . This has indeed happened (in 2.2).
Otherwise, you'd have the same problem with special methods as you do
with keywords: introducing one is a _major_ undertaking since it risks
breaking backwards compatibility (built-in names do not have that risk;
it may not be obvious but some reflection will show that).
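
A quick sketch of that situation (the class and method names are made up for illustration): a plain 'iter' method and the stropped __iter__ live side by side without interfering, which is exactly why adding __iter__ in 2.2 broke nothing.

class Bag:
    def __init__(self, items):
        self.items = list(items)

    def iter(self):
        # an ordinary application-level method; the bare name was never reserved
        return "something app-specific"

    def __iter__(self):
        # only the stropped name is consulted by for-loops and iter()
        return iter(self.items)

b = Bag("abc")
assert list(b) == ['a', 'b', 'c']
assert b.iter() == "something app-specific"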

(( The general practice of marking some class of identifiers with
special characters to distinguish them from others is known as stropping
and was introduced in early Algol 60 implementations, to distinguish
keywords from names; the Algol standard used roman font versus italics
for the purpose, but that didn't translate well to punched cards! ))

My impression was that underscores are supposed
to imply some sort of pseudo-privatization,

Leading-only underscores do. Underscores in the middle imply nothing,
it's just a style of making an identifier from many words; some like
this_way, some like thatWay. Trailing-only underscore normally is used
when otherwise an identifier would be a keyword, as in 'class_' or
'print_' (you need some convention for that when you're interfacing
external libraries -- ctypes, COM, Corba, SOAP, etc, etc -- since
nothing stops the external library from having defined a name which
happens to clash with a Python keyword). Leading AND trailing double
underscores imply specialness.
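
For example, a hypothetical helper (not from any particular library) showing the trailing-underscore convention in action, since "class" itself cannot be used as a parameter name:

def make_element(tag, class_=None, **attrs):
    # 'class_' stands in for an external attribute literally named "class";
    # the trailing underscore keeps the Python identifier legal
    if class_ is not None:
        attrs['class'] = class_
    return tag, attrs

assert make_element('div', class_='header') == ('div', {'class': 'header'})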

but would using
myclass.len() instead of myclass.__len__() really cause Python
considerable harm?

If you were designing Python from scratch, the tradeoff would be:
-- unstropped specialnames are easier to read, but
-- future evolution of the language will be severely hampered (or
else backwards compatibility will often get broken).

So it's a tradeoff, just like the choice of stropping or not for
barenames (identifiers); Perl strops because Larry Wall decided early on
he wanted lots of easy evolution (there's also a tradition of stropping
identifiers in scripting languages, from EXEC to the present; sometimes
under guise of _substitution_, where an identifier being bound is not
stropped but it needs stropping to be used, as in sh and tcl; Rexx and
Python deliberately reject that tradition to favour legibility).

I think Guido got that design choice right: unstropped barenames for all
normal uses, unstropped keywords, pay the price whenever a keyword needs
to be added (that's rarely), stropped-by-convention specialnames
(they're way rarer than barenames in general, _and_ the addition of
specialnames is more frequent).

On the specific lexical-sugar issue of what punctuation characters to
use for this stropping, I pass; the double underscores on both sides are
a bit visually invasive, other choices might have been sparer, but then
I guess that part of the choice was exactly to make the specialness of
specialnames stand out starkly. Since it's unlikely I'll soon need to
design a Python-like language and thus to decide on exactly how to strop
specialnames, it's blissfully superfluous for me to decide;-).


Alex
 

Chris S.

Josiah said:
Double underscore methods are considered "magic" methods. The
underscores are a hint that they may do something different. Kind of
like the C++ friend operators.

In terms of .len() vs .__len__(), it is not supposed to be called
directly by user code; __len__() is called indirectly by the len()
builtin (and similarly for the other __<op>__() methods, check common
spellings in the operator module).

I realize that. My point is why? Why is the default not object.len()?
The traditional object oriented way to access an object's attribute is
as object.attribute. For those that are truly attached to the antiquated
C-style notation, I see no reason why method(object) and object.method()
cannot exist side-by-side if need be. Using method(object) instead of
object.method() is a throwback from Python's earlier non-object oriented
days, and something which should be phased out by p3k. Personally, I'd
like to see the use of underscores in name-mangling thrown out
altogether, as they "uglify" certain code and have no practical use in a
truly OO language, but I'm sure that's a point many will disagree with
me on.
 

Andrew Dalke

Chris said:
Is there a purpose for using trailing and leading double underscores for
built-in method names? My impression was that underscores are supposed
to imply some sort of pseudo-privatization,

They are used to indicate special methods used by Python
that shouldn't be changed or overridden without knowing what
you are doing.

but would using
myclass.len() instead of myclass.__len__() really cause Python
considerable harm?

One way to think of it is as a sort of namespace. Python
needs some specially names methods so that

print a == b

works correctly ('__eq__' or '__cmp__' for the comparison,
and '__str__' for the stringification for printing).
These could be normal looking functions (meaning without
leading and trailing underscores) but then there's
the worry that someone will override that by accident.

By making them special in a way that most people wouldn't
use normally and formally stating that that range is
reserved for system use, that accident won't happen.
No one will accidentally do

def search(self, name):
    ....
    self.str = "Looking for " + name

only to find later that 'str' is needed for printing
the stringified version of the instance.
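
A small sketch of why that accident can't happen under the current scheme (the class is invented for illustration): a plain 'str' attribute and the stropped __str__ coexist without trouble.

class Finder:
    def search(self, name):
        # an ordinary attribute; it only *looks* like the builtin name
        self.str = "Looking for " + name

    def __str__(self):
        # the stropped name is what str() and print actually consult
        return "<Finder: %s>" % getattr(self, "str", "idle")

f = Finder()
f.search("spam")
assert str(f) == "<Finder: Looking for spam>"
assert f.str == "Looking for spam"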

Or consider the other way around. Suppose everyone decides
we need a new protocol for iteration (pretend this is a
few years ago). The new protocol requires a new method
name. What should it be called?

If there isn't a reserved subspace of the namespace then
any choice made will have a chance of interfering with
existing code. But since "__"+...+"__" is reserved, it
was easy to add special meaning to "__iter__" and know
that no existing code would break.


In most cases the builtin function interface to those methods
does more than forward the call. For example, iter()
supports both a 2nd parameter sentinel and fall-back support
for lists that don't provide an __iter__.
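
For instance (a quick illustrative sketch, not Andrew's code), both extra behaviours of iter() are easy to see:

# two-argument form: call the callable repeatedly until the sentinel appears
values = [0, 5, 1, 4, 1, 3]
assert list(iter(values.pop, 0)) == [3, 1, 4, 1, 5]

# fallback form: no __iter__ at all, only the old __getitem__ protocol
class Squares:
    def __getitem__(self, index):
        if index >= 5:
            raise IndexError
        return index * index

assert list(iter(Squares())) == [0, 1, 4, 9, 16]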

Of the several dozen special methods, how many would
you really call directly? It's unlikely you would call
__add__, __mul__, __lt__, ... if only because you would
lose the support for the given operation.

My guess is you'll have len(), abs(), maybe iter(),
but not many more. Should only that handful have
non-special names or should all special methods have
non-special names? If the first, why is that handful
so special?

Do you prefer -x or x.inv() ?

> As much as I adore Python, I have to admit, I find
this to be one of the language's most "unPythonic" features and a key
arguing point against Python. I've searched for a discussion on this
topic in the groups archives, but found little. What are everyone's
thoughts on this subject?

I've not thought so. I find it helps me not worry about
conflicting with special method names. So long as I
don't use the "__"*2 words I'm free to use whatever I want.
Even reserved words if I use getattr/setattr/delattr.
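
A tiny sketch of that last trick (the attribute name is deliberately a keyword):

class Record(object):
    pass

r = Record()
setattr(r, "class", "mammalia")     # r.class = ... would be a SyntaxError
assert getattr(r, "class") == "mammalia"
delattr(r, "class")
assert not hasattr(r, "class")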

There has been discussion, but I only found a comment
from Guido about 5 years ago saying the "__" are meant for
system names and from Tim Peters about 2 years back saying
that there are only a few methods you might call directly.
And a post of mine saying that I like this feature of Python
compared to in Ruby where special methods are syntactically
indistinguishable from user-defined methods.

Andrew
 

Ville Vainio

Chris> object.method() cannot exist side-by-side if need be. Using
Chris> method(object) instead of object.method() is a throwback
Chris> from Python's earlier non-object oriented days, and
Chris> something which should be phased out by p3k. Personally,

One of the strengths of Python is the widespread pragmatic approach
towards OOP - not everything needs to implemented in a class, even if
everything is an object. I don't think there is a trend towards
replacing 'functional' stuff with more explicitly OO stuff either.
 

Andrew Dalke

Chris said:
> Using method(object) instead of object.method() is
> a throwback from Python's earlier non-object oriented days,

Pardon? It's been O-O since at least version 0.9 over
10 years ago.

> Personally, I'd
> like to see the use of underscores in name-mangling thrown out
> altogether, as they "uglify" certain code and have no practical use in a
> truly OO language, but I'm sure that's a point many will disagree with
> me on.

I and others have made some comments already, but let me add
one more.

Given a C++ viewpoint, one way to think of Python programming
is that it's like template based programming. The Python
function

def sum(data):
    x = iter(data)
    tot = x.next()
    for item in x:
        tot += item
    return tot

works on any data container so long as that container
can be iterated and its values can be summed. (And the
container must have at least one item.)
>>> sum(["This", "is", "a", "test"])
'Thisisatest'
>>>

The way to do that in C++ is with a template. I tried
to figure out how to do it but my C++ skills are about
8 years rusted. Perhaps something like .... ?

template <typename Container>
typename Container::value_type sum(const Container& container) {
    typename Container::const_iterator it = container.begin();
    typename Container::value_type val = *it;
    for (++it; it != container.end(); ++it) {
        val += *it;
    }
    return val;
}

You would not do it using OO containers because then
you're left with Java-style casting everywhere. (Sun
has added generics in the most recent version of Java.
See Bruce Eckel's commentary on the topic at
http://www.mindview.net/WebLog )


If you accept that Python program has a strong template
aspect, in addition to O-O programming, then the builtins
like 'abs', 'cmp', etc. can be seen as generic algorithms
and not methods of a given data type.

Granted, some of the algorithms are trivial, as for
abs(), but the 'cmp()' algorithm is quite involved.

That means your statement, that these functions "have
no practical use in a truly OO language" is not applicable,
because Python is a mixed OO/imperative/template-based
programming language.

Andrew
 

Alex Martelli

Chris S. said:
I realize that. My point is why?

I think both I and Andrew answered that: by stropping specialnames, any
new version of Python can add a specialname without breaking backwards
compatibility towards existing programs written normally and decently.

Why is the default not object.len()?
The traditional object oriented way to access an object's attribute is
as object.attribute.

It's one widespread way, dating from Simula, but there are others --
Smalltalk, arguably the first fully OO language, uses juxtaposition
("object message"), as does Objective-C; Perl currently uses -> (though
Perl 6 will use dots); and so on. But this lexical/syntactical issue
isn't really very relevant, as long as we're talking *single-dispatch*
OO languages (a crucial distinction, of which, more later).

The main reason <builtin-name>(*objects) constructs exist is quite
different: in the general case, such constructs can try _several_ ways
to perform the desired operation, depending on various special methods
that the objects' classes might define. The same applies to different
(infix or prefix) syntax sugar like, say, "a + b", which has exactly the
same semantics as operator.add(a, b).

Consider the latter case, for example. operator.add(a, b) is NOT the
same thing as a.__add__(b). It does first try exactly that, _if_
type(a) defines a specialmethod __add__ [[net of coercion issues, which
are a complication we can blissfully ignore, as they're slowly fading
away in the background, thanks be]]. But if type(a) does not define
__add__, or if the call to type(a).__add__(a, b) returns NotImplemented,
then operator.add(a, b) continues with a second possibility: if type(b)
defines __radd__, then type(b).__radd__(b, a) is tried next.

If only type(a) was consulted, there would be either restrictions or
_very_ strange semantics. The normal approach would result in
restrictions: I could not define a new type X that knows what it means
to "add itself to an int" on either side of the + sign. Say that N was
an instance of said new type X: then, trying 23+N (or operator.add(23,
N), same thing) would call int.__add__(23, N), but int being an existing
type and knowing nothing whatsoever about X would have to refuse the
responsibility, so 23+N would fail. The alternative would be to make
EVERY implementation of __add__ all over the whole world responsible for
delegating to "the other guy's __radd__" in case it doesn't know what to
do -- besides the boilerplate of all those 'return
other.__radd__(self)', this also means hardwiring every single detail of
the semantics of addition forevermore -- no chance to add some other
possibility or enhancement in the future, ever, without breaking
backwards compatibility each and every time. If that had been the path
taken from day one, we could NOT "blissfully ignore" coercion issues,
because that was the original approach -- every single implementation of
__add__ would have to know about coercion and apply it, and it would
never be possible to remove or even de-emphasize coercion in future
versions of the language without a major backwards-incompatible jump
breaking just about all code existing out there (tens of millions of
lines of good working Python).
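
Concretely, here is roughly what such an X and N look like under the current protocol (a sketch, not anyone's production code):

class X(object):
    """Knows how to add itself to an int on either side of '+'."""
    def __init__(self, n):
        self.n = n
    def __add__(self, other):
        if isinstance(other, int):
            return X(self.n + other)
        return NotImplemented
    def __radd__(self, other):
        # reached only after int.__add__(23, N) has returned NotImplemented
        return self.__add__(other)

N = X(10)
assert (N + 23).n == 33    # X.__add__ handles it directly
assert (23 + N).n == 33    # int declines, so X.__radd__ gets its chance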

The issue is even stronger for some other operations, such as
comparisons. It used to be that (net of coercion etc) a<b (or
equivalently operator.lt(a, b)) meant essentially:
a.__cmp__(b) < 0

But then more specific specialmethods were introduced, such as __lt__
and __gt__ -- so, now, a<b means something like:
if hasattr(type(a), '__lt__'):
    result = type(a).__lt__(a, b)
    if result is not NotImplemented: return result
if hasattr(type(b), '__gt__'):
    result = type(b).__gt__(b, a)
    if result is not NotImplemented: return result
if hasattr(type(a), '__cmp__'):
and so on (implementation is faster, relying on 'method slots' computed
only when a type is built or modified, but, net of dynamic lookups, this
is basically an outline of the semantics).

This is a much better factoring, since types get a chance to define some
_specific_ comparisons without implying they're able to do _all_ kinds
of comparisons. And the migration from the previous semantics to the
current one, without breaking backwards compatibility, was enabled only
by the idea that specialnames are stropped as such. An existing type
might happen to define a method 'lt', say, having nothing to do with
comparisons but meaning "amount of liters" or something like that, and
that would be perfectly legitimate since there was nothing special or
reserved about the identifier 'lt'. If the special method for 'less
than' comparison was looked for with the 'lt' identifier, though, there
would be an accidental collision with the 'lt' method meaning something
quite unrelated -- and suddenly all comparisons involving instances of
that type would break, *SILENTLY!!!* (the worst kind of breakage),
returning results quite unrelated to the semantics of comparison.
*SHUDDER*.
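
A small sketch of the happy coexistence we actually get (the class is invented for illustration): an application-level 'lt' and the special __lt__ never collide.

class Tank(object):
    def __init__(self, liters):
        self.liters = liters
    def lt(self):
        # a perfectly legitimate application method: "amount of liters"
        return self.liters
    def __lt__(self, other):
        # the only name the comparison machinery ever looks for
        return self.liters < other.liters

small, big = Tank(10), Tank(25)
assert small.lt() == 10
assert small < big          # uses __lt__, never the unrelated 'lt'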

Many of these issues have to do with operations between two objects,
often ones that can be indicated by special syntax (infix) or by calling
functions in module operator, but not always (consider 'divmod', for
example; there's no direct infix-operator syntax for that; or
three-arguments 'pow', ditto). Indeed, in good part one could see the
problem as due to the fact that Python, like most OO languages, does
SINGLE dispatching: the FIRST argument (often written to the left of a
dot) plays a special and unique role, the other arguments "just go along
for the ride" unless special precautions are taken. An OO language
based on multiple dispatching (generic functions and multimethods, for
example, like Dylan) would consider and try to match the types of ALL
arguments in the attempt to find the right specific multimethod to call
in order to compute a given generic function call [[don't confuse that
with C++'s "overloads" of functions, which are solved at compiletime,
and thus don't do _dispatching_ stricto sensu; here, we _are_ talking
about dispatching, which happens at runtime based on the runtime types
of objects, just like the single-dispatching of a C++'s "virtual"
method, the single-dispatching of Java, Python, etc etc]].

So, one might say that much of the substance of Python's approach is a
slightly ad-hoc way to introduce a modest amount of multiple dispatch in
what is basically a single-dispatch language. To some extent, there is
truth in this. However, it does not exhaust the advantages of Python's
approach. Consider, for example, the unary function copy.copy. Since
it only takes one argument, it's not an issue of single versus multiple
dispatch. Yet, it can and does try multiple possibilities, in much the
same spirit as the above sketch for operator.lt! Since copy.py is
implemented in Python, I strongly suggest you read its sources in
Python's standard library and see all it does -- out of which, calling
type(theobject).__copy__(theobject) is just one of many possibilities.
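
As a tiny taste of that hook (a sketch only; copy.py tries several other routes too, such as the pickle-related special methods):

import copy

class Counter(object):
    def __init__(self, value=0):
        self.value = value
    def __copy__(self):
        # one of the several possibilities copy.copy() will look for
        return Counter(self.value)

c = Counter(7)
d = copy.copy(c)
assert d is not c and d.value == 7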

Again, not all of these possibilities existed from day one. You can
download and study just about all Python versions since 1.5.2 and maybe
earlier and see how copy.py evolved over the years -- always without
breaking backwards compatibility. I think it would be instructive. If
the names involved had not been specially stropped ones, the smooth and
backwards compatible functional enrichment could not have happened.

Of course, one MIGHT single out some specific functionality and say
"this one is and will forever remain unary (1-argument), never needed
the multiple-possibilities idea and never will, so in THIS one special
case we don't need the usual pattern of 'foo(x) tries
type(x).__foo__(x)' and we'll use x.foo() instead". Besides the risk of
misjudging ("ooops it DOES need an optional second argument or different
attempts, now what? we're hosed!"), there is the issue of introducing an
exception to the general rule -- one has to learn by rote which
operations are to be involved as x.foo() and which ones as foo(x) or
other special syntax, prefix or infix. There is one example of a
special method (of relatively recent vintage, too) implicitly invoked
but not _named_ as such, namely 'next'; Guido has publicly repented of
his design choice in that case -- it's a _wart_, a minor irregularity in
a generally very regular language, and as such may hopefully be remedied
next time backwards incompatibilities can be introduced (in the
transition, a few years from now, from 2.* to 3.0).

For those that are truly attached to the antiquated
C-style notation, I see no reason why method(object) and object.method()
cannot exist side-by-side if need be.

The implementation of a builtin named 'method' (or sometimes some
equivalent special syntax) is up to Python itself: what exactly it does
or try can be changed, carefully, to enhance the language without
backwards incompatibility. The implementation of a method named
'method' (or equivalently '__method__') in type(obj) is up to whoever
(normally a Python user) codes type(obj). It cannot be changed
retroactively throughout all types while leaving existing user code
backwards-compatible.

It's hard to make predictions, especially about the future, but it's
always possible that we may want to tweak the semantics of a builtin
method (or equivalent special syntax) in the future. Say it's
determined by carefully double-blind empirical studies that a common
error in Python 2.8 is for programmers to ask for len(x) where x is an
iterator (often from a built-in generator expression &c) which _does_
expose a __len__ for the sole purpose of allowing acceleration of
list(x) and similar operations; we'd like to raise a LenNotAvailable
exception to help programmers diagnose such errors. Thanks to the
existence of built-in len, it's easy; the 'len' built-in becomes:
if hasattr(type(x), '__length_not_measurable__'):
    raise LenNotAvailable
if not hasattr(type(x), '__len__'):
    raise TypeError
return type(x).__len__(x)

or the like. No existing user-coded type defines the new flag
__length_not_measurable__, so they are all unaffected and stay backwards
compatible; generator expressions, or whatever else we want to forbid taking
the len(...) of, sprout that flag, so len(x) raises when we need it to.
((or we could have a new specialmethod __len_explicitly_taken__ to call
on explicit len(x) if available, preempting normal __len__, for even
greater flexibility -- always WITH backwards compatibility; in the
needed case that specialmethod would raise LenNotAvailable itself)).

Most likely len(x) will need no such semantics change, but why preclude
the possibility? AND at the cost of introducing a gratuitous divergence
between specialmethods which will never need enhancements (or at least
will be unable to get them smoothly if they DO need them;-) and ones
which may ("richer" operations such as copying, hashing, serializing...
ones it would be definitely hubristic to tag as "will never need any
enhancement whatsoever").

Using method(object) instead of
object.method() is a throwback from Python's earlier non-object oriented
days,

Python has never had any "non-object oriented days": it's been OO from
day one. There have been changes to the object model (all of my above
discourse is predicated on the new-style OM, where the implied lookups
for specialmethods are always on type(x), while the classic OM had the
problematic feature of actual lookups on x, for example), but typical
application-level code defining and using classes and objects would look
just about the same today as in Python 1.0 (I never used that, but at
least I _did_ bother studying some history before making assertions that
may be historically unsupportable).

and something which should be phased out by p3k. Personally, I'd

Don't hold your breath: it's absolutely certain that it will not be
phased out. It's a widespread application of the "template method"
design pattern in the wider sense, a brilliant design idea, and, even
were Python to acquire multiple dispatch (not in the cards, alas), guess
what syntax sugar IS most suited to multiple-dispatch OO...? Right:
func(a, b, c). The syntax sugar typical of single-dispatch operation
gives too much prominence to the first argument, in cases in which all
arguments cooperate in determining which implementation the operation
gets dispatched to.

Now that Python has acquired a "common base for all objects" (class
object itself), it would be feasible (once backwards compatibility can
be removed) to move unary built-ins there and out of the built-in
namespace. While this would have some advantage in freeing up the
built-in namespace, there are serious issues to consider, too. Built-in
names are not 'reserved' in any way, and new ones may always be
introduced. In the general case a built-in name performs a "Template
Method" DP, so objects would not and should not _override_ that method,
but rather define the auxiliary methods that the built-in calls. For all
reasons already explained, the auxiliary methods' names should be
stropped (to avoid losing future possibilities of backwards compatible
language enhancement). So what would the net advantage be? The syntax
sugar of making you use one more character, obj.hash() rather than
hash(obj), while creating a new burden to explain to all and sundry that
they _shouldn't_ 'def hash' in their own classes but rather 'def
do_hash' and the like...? Add to that the sudden divergence between
one-argument operations (which might sensibly be treated this way) and
two- and three- argument ones (which should not migrate to object for
the already-mentioned issue of single vs multiple dispatch), and it
seems to me the balance tilts overwhelmingly into NOT doing it.

like to see the use of underscores in name-mangling thrown out
altogether, as they "uglify" certain code and have no practical use in a
truly OO language, but I'm sure that's a point many will disagree with
me on.

No doubt. Opinions are strongest on the issues closest to syntax sugar
and farthest away from real depth and importance; it's an application of
one of Parkinson's Laws (the amount of time devoted to debating an issue
at a board meeting is inversely proportional to the amount of money
depending on that issue, if I correctly recall the original
formulation). For me, as long as there's stropping where there SHOULD
be stropping, exactly what sugar is used for the stropping is quite a
secondary issue. If you want to name all intended-as-private attributes
private_foo rather than _foo, all hooks-for-TMDP-operations as do_hash
rather than __hash__, and so on, I may think it's rather silly, but not
an issue of life and death, as long as all the stropping kinds that can
ever possibly be needed are clearly identified. _Removing_ the
stroppings altogether, OTOH, would IMHO be a serious technical mistake.

Seriously, I don't think there's any chance of this issue changing in
Python, including not just Python 3.0, which _will_ happen one day a few
years from now and focus on simplifying things by removing historically
accumulated redundant ways to perform some operations, but also the
mythical "Python 3000" which might or might not one day eventuate.

If you really think it's important, I suggest you consider other good
languages that may be closer to your taste, including for example Ruby
(which uses stropping, via punctuation, for totally different purposes,
such as denoting global variables vs local ones), and languages that
claim some derivation from Python (at least in syntax), such as Boo.


Alex
 

Hans Nowak

Chris said:
I realize that. My point is why? Why is the default not object.len()?
The traditional object oriented way to access an object's attribute is
as object.attribute. For those that are truly attached to the antiquated
C-style notation, I see no reason why method(object) and object.method()
cannot exist side-by-side if need be.

Python is a multi-paradigm language. It supports OO, but also
imperative and functional styles. There is nothing "antiquated" about
the len() notation, it's simply a different style.

Admittedly, it has been there from the very beginning, when not every
built-in object had (public) methods, so obj.len() was not an option
back then.

Using method(object) instead of
object.method() is a throwback from Python's earlier non-object oriented
days,

There were no such days. Python has always been object-oriented.

and something which should be phased out by p3k. Personally, I'd
like to see the use of underscores in name-mangling thrown out
altogether, as they "uglify" certain code and have no practical use in a
truly OO language, but I'm sure that's a point many will disagree with
me on.

I don't know what "truly OO" is. It appears that every language
designer has his own ideas about that. Hence, Java's OO is not the same
as C++'s, or Smalltalk's, or Self's, or CLOS's, or...

I suppose it might be clearer if one could write

def +(self, other):
    ...

but then again it might not. I personally don't have a problem with
__add__ and friends.

--Hans
 

Nicolas Fleury

Hans said:
I suppose it might be clearer if one could write

def +(self, other):
...

And a syntax would be needed to get that function, that's why I like so
much the Python approach. My only complaint is that _somename_ would
probably have been enough instead of __somename__. In C++, all
_[A-Z_].* are reserved; reserving _.*_ would have been enough in Python IMHO.

Regards,
Nicolas
 

Rocco Moretti

Alex said:
were Python to acquire multiple dispatch (not in the cards, alas)

What's the issue with adding multiple dispatch to Python, then?

Guido-doesn't-want-it,
Technically-impractical-given-Python-as-it-is-today,
If-you-need-it,-it's-easy-enough-to-use-a-library,
Not-enough-interest-to-justify-effort,
or Something-else?

-Rocco
 

Josiah Carlson

Nicolas Fleury said:
Hans said:
I suppose it might be clearer if one could write

def +(self, other):
...

And a syntax would be needed to get that function, that's why I like so
much the Python approach. My only complaint is that _somename_ would
probably have been enough instead of __somename__. In C++, all
_[A-Z_].* are reserved; reserving _.*_ would have been enough in Python IMHO.


I personally like the double leading and trailing underscores. They
jump out more. Regardless, it is a little late to complain; unless one
wants to change the behavior in Py3k, but I think that most of those who
have a say in Py3k's development believe the double underscores were a
good idea.

Josiah
 

Jeremy Bowers

Chris S. said:
Is there a purpose for using trailing and leading double underscores for
built-in method names? ... a key arguing point against Python.

Yes, I've noticed this one come up a lot on Slashdot lately, more so than
whitespace issues. I haven't checked to see if it's all one guy posting it
or not; next time I think I will, it seems suspiciously sudden to me.

As an argument against Python, implicitly in favor of some other language,
it boggles my mind. Whitespace complaints made before even trying Python I
can at least make some sense of; there are enough people unwilling to even
try something different unless your new language is an exact clone of the
one they already know that you'll hear from them. (Note: This is as
opposed to the small-but-real group of people who actually *try* it and
don't like it; those people I can respect, even as I disagree with them.)

But what mystical language exists that uses less punctuation than Python?
I've tried to come up with it, and even the obscure ones I can come up
with use more. (The closest one to Python I know is one called UserTalk,
in Frontier, but that one loses due to its use of addresses and
dereferencing, which adds enough punctuation to be harder to read. Also,
it doesn't *have* classes, so no method weirdness.)

Complaining about them in the sense of "I think we could improve Python if
we drop the underscores" makes sense; again, I'm not talking about that
sense, nor am I trying to make that argument. But as some sort of reason
to not use Python? Riiiiiiiiiight... you might as well just come out and
admit you don't *want* to try it. There's nothing wrong with that, you
know.

Also, since I'm sick of hearing about this, and I intend to use a link to
this post via Google News as a standin for repeating this argument again
somewhere else, here is a metaclass that will make the "ugly underscores"
go away. But look at that list in the variable "MagicMethods"... are you
really sure you're willing to give all of those up? Of course, you can
selectively use the metaclass, but then you have inconsistency in your
program. But whatever...

Also note the list of methods dwarfs the actual code it took to do this.

-------------------------

"""I'm sick of hearing people bitch about the underscores. Here, let
me 'fix' that for you.

You may need to add magic method names to the set if you implement
other protocol-based things, like in Zope. I did add the Pickle
protocol in because I know where to find it.

Set your class's metaclass to UnUnderscore, and name your methods
'cmp' or 'nonzero' instead of '__cmp__' or '__nonzero__'; see the
example in the sample code."""

import sets

__all__ = ['UnUnderscore']

MagicMethods = sets.Set((
    'init', 'del', 'repr', 'str', 'lt', 'le',
    'eq', 'ne', 'gt', 'ge', 'cmp', 'rcmp',
    'hash', 'nonzero', 'unicode', 'getattr',
    'setattr', 'delattr', 'getattribute', 'get',
    'set', 'delete', 'call', 'len', 'getitem',
    'setitem', 'delitem', 'iter', 'contains',
    'add', 'sub', 'mul', 'divmod', 'pow',
    'lshift', 'rshift', 'and', 'xor', 'or',
    'div', 'truediv', 'radd', 'rsub', 'rmul',
    'rdiv', 'rtruediv', 'rmod', 'rdivmod',
    'rpow', 'rlshift', 'rrshift', 'rand', 'rxor',
    'ror', 'iadd', 'isub', 'imul', 'idiv',
    'itruediv', 'ifloordiv', 'imod', 'ipow',
    'ilshift', 'irshift', 'iand', 'ixor', 'ior',
    'neg', 'pos', 'abs', 'invert', 'complex',
    'int', 'long', 'float', 'oct', 'hex',
    'coerce',

    # Pickle
    'getinitargs', 'getnewargs', 'getstate',
    'setstate', 'reduce', 'basicnew'))

class UnUnderscore(type):
    """See module docstring."""
    def __init__(cls, name, bases, dict):
        super(UnUnderscore, cls).__init__(name, bases, dict)
        for method in MagicMethods:
            if hasattr(cls, method):
                setattr(cls, "__" + method + "__",
                        getattr(cls, method))
                delattr(cls, method)

if __name__ == "__main__":
    # simple test
    class Test(object):
        __metaclass__ = UnUnderscore
        def init(self):
            self.two = 2
        def len(self):
            return 3
    t = Test()
    if len(t) == 3 and t.two == 2:
        print "Works, at least a little."
    else:
        print "Not working at all."
 

Ville Vainio

Rocco> Alex Martelli wrote:

Rocco> What's the issue with adding multiple dispatch to Python, then?

....

Rocco> If-you-need-it,-it's-easy-enough-to-use-a-library,

Isn't that enough?
 

Scott David Daniels

Josiah said:
.... Regardless, it is a little late to complain; ....

Unless you borrow the keys to the time machine. However, I don't
think Guido would loan them out for this reason.

-Scott David Daniels
 

Lonnie Princehouse

It's extremely common for Python newbies to accidentally overwrite
the names of important things. I see stuff like this all the time:

list = [1,2,3]
str = "Hello world"

This sort of accidental trampling would be even more frequent without
the underscores.
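
To make the cost of that trampling concrete (a quick sketch):

list = [1, 2, 3]        # silently shadows the builtin in this namespace
str = "Hello world"     # ditto

try:
    list("abc")         # the builtin is gone: 'list' object is not callable
except TypeError:
    pass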


And, to play devil's advocate, there are probably a dozen ways to
hack around the underscores, for those who don't like them:

class __silly__(type):
    def __new__(cls, name, bases, dct):
        # incomplete list - just enough for a demo
        magic_functions = ['init', 'len', 'str']
        for f in [x for x in magic_functions if x in dct]:
            dct['__%s__' % f] = dct[f]
        return type.__new__(cls, name, bases, dct)

__metaclass__ = __silly__

class Bar:
    def init(self):
        print "init Bar instance"
    def str(self):
        return "Bar str method"
    def len(self):
        return 23

f = Bar()
 

Andrew Dalke

Nicolas said:
And a syntax would be needed to get that function, that's why I like so
much the Python approach. My only complaint is that _somename_ would
probably have been enough instead of __somename__. In C++, all
_[A-Z_].* are reserved; reserving _.*_ would have been enough in Python IMHO.

Though I liked how the win32 code uses '_.*_' for its
special properties. It's 1/2 way between user- and
system- space so it was a very nice touch.

Andrew
 

Alex Martelli

Jeremy Bowers said:
But what mystical language exists that uses less punctuation than Python?

Applescript, maybe. 'tell foo of bar to tweak zippo' where python would
have bar.foo.tweak(zippo), looks like. (I'm not enthusiastic about the
idea, just pointing it out!-).


Alex
 

Jeremy Bowers

Alex Martelli said:
Applescript, maybe. 'tell foo of bar to tweak zippo' where python would
have bar.foo.tweak(zippo), looks like. (I'm not enthusiastic about the
idea, just pointing it out!-).

Point. Maybe Cobol on the same theory? I don't know, I've never used
either.

I guess if you're so stuck on the double-underscore-is-too-much-
punctuation idea, you *deserve* to try to do all your programming in a
combo of COBOL and Applescript :)

I'm thinking I'll stick with Python.
 

John Roth

Jeremy Bowers said:
But what mystical language exists that uses less punctuation than Python?
I've tried to come up with it, and even the obscure ones I can come up
with use more.

Forth. Postscript.

John Roth
 
