Why do class methods always need 'self' as the first parameter?

T. Goodchild

I’m new to Python, and I love it. The philosophy of the language (and
of the community as a whole) is beautiful to me.

But one of the things that bugs me is the requirement that all class
methods have 'self' as their first parameter. On a gut level, to me
this seems to be at odds with Python’s dedication to simplicity.

For example, consider Python’s indent-sensitive syntax. Although
other languages didn’t use indentation to specify scope, programmers
always used indentation anyways. Making indentation significant took a common
practice, made it a rule, and the result was a significantly improved
signal-to-noise ratio in the readability of Python code.

So why is 'self' necessary on class methods? It seems to me that the
most common practice is that class methods *almost always* operate on
the instance that called them. It would make more sense to me if this
was assumed by default, and for "static" methods (methods that are
part of a class, but never associated with a specific instance) to be
labelled instead.

Just curious about the rationale behind this part of the language.
 
John Gordon

T. Goodchild said:
So why is 'self' necessary on class methods? It seems to me that the
most common practice is that class methods *almost always* operate on
the instance that called them. It would make more sense to me if this
was assumed by default, and for "static" methods (methods that are
part of a class, but never associated with a specific instance) to be
labelled instead.
Just curious about the rationale behind this part of the language.

How would a method access instance variables without 'self'?

They probably could have made 'self' a magical attribute that just
appears out of thin air instead of being passed as an argument, like
'this' in C++. But would that really provide any benefit?
 
Neil Cerutti

I’m new to Python, and I love it. The philosophy of the
language (and of the community as a whole) is beautiful to me.

But one of the things that bugs me is the requirement that all
class methods have 'self' as their first parameter. On a gut
level, to me this seems to be at odds with Python’s dedication
to simplicity.

Think it through carefully, and you'll probably agree with
Python's design. But not necessarily.

In any case, this is a very common complaint, so check out the
Python FAQ.

http://docs.python.org/faq/design.html#why-self
 
Javier Collado

Hello,

2011/8/31 T. Goodchild said:
But one of the things that bugs me is the requirement that all class
methods have 'self' as their first parameter.  On a gut level, to me
this seems to be at odds with Python’s dedication to simplicity.

I think the answer to this question is part of the zen of python:
<<Explicit is better than implicit.>>

http://www.python.org/dev/peps/pep-0020/

Regards,
Javier
 
Steven D'Aprano

T. Goodchild said:
So why is 'self' necessary on class methods?

I assume you are talking about the declaration in the method signature:

def method(self, args): ...

rather than why methods have to be called using self.method. If not, there's
already a FAQ for that second question:

http://docs.python.org/faq/design.html#why-self

It seems to me that the
most common practice is that class methods *almost always* operate on
the instance that called them.

By the way, what you're calling "class methods" are actually *instance*
methods, because they receive the instance "self" as the first parameter.

Python does have class methods, which receive the class, not the instance,
as the first parameter. These are usually written something like this:

class K(object):
    @classmethod
    def spam(cls, args):
        print(cls)  # always prints "class K", never the instance

Just like self, the name cls is a convention only. Class methods are usually
used for alternate constructors.
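As a rough illustration of the "alternate constructor" use, here is a minimal sketch. The Point and LabelledPoint classes and the from_string name are invented for this example and do not come from the thread.

```python
class Point(object):
    """Hypothetical class, used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_string(cls, text):
        # cls is whichever class the method was looked up on, so a
        # subclass gets an instance of itself rather than of Point.
        x, y = (int(part) for part in text.split(","))
        return cls(x, y)


class LabelledPoint(Point):
    """Hypothetical subclass that inherits the alternate constructor."""


p = Point.from_string("3,4")
q = LabelledPoint.from_string("5,6")
print(type(p).__name__, p.x, p.y)
print(type(q).__name__, q.x, q.y)
```

Because the class arrives as an ordinary first argument, the constructor does not need to hard-code which class to build.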

There are also static methods, which don't receive any special first
argument, plus any other sort of method you can invent, by creating
descriptors... but that's getting into fairly advanced territory. They're
generally specialised, and don't see much use.

As you can see, the terminology is not exactly the same as Java.

It would make more sense to me if this
was assumed by default, ...

Well here's the thing. Python methods are wrappers around function objects.
The method wrapper knows which instance is involved (because of the
descriptor magic which I alluded to above), but the function doesn't and
can't. Or at least not without horrible run-time hacks.
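The "descriptor magic" can be made visible by doing by hand what attribute lookup does. This is only a sketch: the class K and its method are stand-ins for illustration, and it assumes a Python version where bound methods expose __self__ and __func__ (2.6 or later).

```python
class K(object):
    def method(self, arg):
        return (self, arg)


instance = K()

# The class dictionary holds a plain function, not a method.
func = K.__dict__["method"]

# Looking the name up through the instance calls the function's
# __get__, which returns a bound method wrapping function + instance.
bound = func.__get__(instance, K)

print(type(func).__name__)            # function
print(bound.__self__ is instance)     # True
print(bound.__func__ is func)         # True
print(bound(1) == func(instance, 1))  # True
```

The function itself never learns which instance is involved; only the wrapper does, and it passes the instance in as the first argument.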

By treating "self" as an ordinary parameter which needs to be declared, you
can do cool stuff like bound and unbound methods:

f = "instance".upper # this is a bound method
g = str.upper # this is an unbound method

The bound method f already has the instance "self" filled in, so to speak.
So you can now just call it, and it will work:

f()
=> returns "INSTANCE"

The unbound method still needs the instance supplied. This makes it perfect
for code like this:

for instance in ("hello", "world"):
    print(g(instance))

especially if you don't know what g will be until run-time. (E.g. will it be
str.upper, str.lower, str.title?)

Because methods require that first argument to be given explicitly, unbound
methods are practically ordinary functions. They're so like functions that
in Python 3, they're done away with altogether, and the unwrapped function
object will be returned instead.
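A short sketch of the Python 3 behaviour described here, assuming it is run on Python 3. The class K is a stand-in; the loop shows g being chosen at run time from the three str methods mentioned above.

```python
class K:
    def method(self):
        return "called on " + type(self).__name__


instance = K()

unbound = K.method       # on Python 3, just the function stored on the class
bound = instance.method  # bound method with self already filled in

print(unbound is K.__dict__["method"])  # True on Python 3
print(unbound(instance) == bound())     # True

# Picking the function at run time and supplying the instance by hand.
results = []
for name in ("upper", "lower", "title"):
    g = getattr(str, name)
    results.append(g("hello world"))
print(results)
```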

You can also do nifty stuff like dynamic method injections:

>>> def func(a, b):
...     print(a, b)
...
>>> class K(object):
...     pass
...
>>> K.func = func
>>> instance = K()
>>> instance.func(23)
(<__main__.K object at 0xb7f0a4cc>, 23)

and it all just works. You can even inject a method onto the instance,
although it takes a bit more effort to make that work.
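One way to supply the "bit more effort" for a single instance is to bind the function by hand with types.MethodType. This is a sketch of one possible approach, not necessarily the one the poster had in mind; the names K, greet, a and b are made up for the example.

```python
import types


class K(object):
    pass


def greet(self, name):
    return "%s greets %s" % (type(self).__name__, name)


a = K()
b = K()

# Plain assignment stores the function as data: no self is passed.
a.plain = greet
# types.MethodType binds the function to this one instance by hand.
a.greet = types.MethodType(greet, a)

print(a.greet("Bob"))       # self is supplied automatically
print(a.plain(a, "Bob"))    # self must be passed explicitly
print(hasattr(b, "greet"))  # False: other instances are unaffected
```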

All this is possible without nasty hacks because self is treated as just an
ordinary parameter of functions. Otherwise, the compiler would need to know
whether the function was being called from inside a method wrapper or not,
and change the function signature appropriately, and that just gets too
ugly and messy for words.

So for the cost of having to declare self as an argument, we get:

* instant visual recognition of what's intended as a method ("the
first argument is called self") and what isn't
* a nicely consistent treatment of function signatures at all times
* clean semantics for the local variable namespace
* the same mechanism (with minor adjustments) can be used for class
and static methods
* bound and unbound methods semantics

plus as a bonus, plenty of ongoing arguments about whether or not having to
explicitly list "self" as a parameter is a good thing or not, thus keeping
people busy arguing on mailing lists instead of coding

<wink>
 
Steven D'Aprano

John Gordon said:
How would a method access instance variables without 'self'?

If Python had compile time declarations, the compiler could know whether x=1
was referring to a local variable x or an attribute x.

The reader might not, but the compiler would :)


By the way, although the Python docs are a little inconsistent, the usual
term here is "attribute" rather than "instance variable".

Attributes need not live on the instance: they can also live on the class, a
superclass, or be computed at run-time via at least three different
mechanisms I can think of (__getattribute__, __getattr__, properties).
Local variables are treated a bit differently from attributes, but broadly
speaking, if you need a dot to access something, it's an attribute, if you
don't, it's a name binding (or variable).
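To make the different homes for an attribute concrete, here is a small sketch. The Demo class and its attribute names are invented for illustration, and only two of the three run-time mechanisms (a property and __getattr__) are shown.

```python
class Demo(object):
    shared = "class attribute"  # lives on the class

    def __init__(self):
        self.own = "instance attribute"  # lives on the instance

    @property
    def computed(self):
        # Recomputed on every access by the property descriptor.
        return self.own.upper()

    def __getattr__(self, name):
        # Called only when normal lookup has already failed.
        if name.startswith("dyn_"):
            return "made up on demand: " + name
        raise AttributeError(name)


d = Demo()
print(d.shared)
print(d.own)
print(d.computed)
print(d.dyn_x)
print("shared" in d.__dict__)  # False: found on the class instead
```

All four are reached with the same dotted syntax, which is why "attribute" is the broader term.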

Python even has two different sorts of errors for "variable" lookup
failures: NameError (or UnboundLocalError) for un-dotted names, and
AttributeError for dotted names.
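The three error types can be provoked side by side. This is a minimal sketch with made-up function names; it assumes no global called missing_name exists.

```python
def undotted():
    return missing_name  # no such local or global name


def unbound_local():
    value += 1  # the assignment makes value local, but it has no value yet
    return value


class Empty(object):
    pass


def dotted():
    return Empty().missing_attr  # dotted lookup on an object


results = {}
for func in (undotted, unbound_local, dotted):
    try:
        func()
    except (NameError, AttributeError) as exc:
        results[func.__name__] = type(exc).__name__

print(results)
print(issubclass(UnboundLocalError, NameError))  # True
```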

They probably could have made 'self' a magical attribute that just
appears out of thin air instead of being passed as an argument, like
'this' in C++. But would that really provide any benefit?

Well obviously the C++ people thought so :)

The effort to type "self, " in method signatures is pretty low. I don't
think it is a problem. But other languages are free to choose differently.
Cobra, for example, is explicitly derived from Python in many ways, but it
drops the "self" (as well as other changes).

http://cobra-language.com/docs/python/
 
Grant Edwards

Well obviously the C++ people thought so :)

Well _that's_ certainly a ringing endorsement in the context of
designing a language that's easy to understand and use.


;)
 
Terry Reedy

But one of the things that bugs me is the requirement that all class
methods have 'self' as their first parameter. On a gut level, to me
this seems to be at odds with Python’s dedication to simplicity.

Actually, it is a consequence of Python's dedication to simplicity. A
method is simply a function that is an attribute of a class. (This is
even clearer in Py 3.) Hence, there is no special syntax for methods.

Consider

def double(obj): return 2*obj.value

class C:
    def __init__(self, val):
        self.value = val

c = C(3)
C.double = double
c.doub = double
# not c.double as that would mask access to C.double in c.double() below
print(double(c), C.double(c), c.double(), c.doub(c))
# 6 6 6 6
 
Chris Torek

[Comprehensive reply, noting that these are actually instance
methods, and that there are class and static methods as well]:
Python does have class methods, which receive the class, not the instance,
as the first parameter. These are usually written something like this:

class K(object):
    @classmethod
    def spam(cls, args):
        print(cls)  # always prints "class K", never the instance

Just like self, the name cls is a convention only. Class methods are usually
used for alternate constructors.

There are also static methods, which don't receive any special first
argument, plus any other sort of method you can invent, by creating
descriptors... but that's getting into fairly advanced territory. ...
[rest snipped]

I am not sure whether T. Goodchild was asking any of the above or
perhaps also one other possible question: if an instance method
is going to receive an automatic first "self" parameter, why require
the programmer to write that parameter in the "def"? For instance
we *could* have:

class K(object):
    def meth1(arg1, arg2):
        self.arg1 = arg1 # self is "magically available"
        self.arg2 = arg2

    @classmethod
    def meth2(arg):
        use(cls) # cls is "magically available"

and so on. This would work fine. It just requires a bit of implicit
sneakiness in the compiler: an instance method magically creates
a local variable named "self" that binds to the invisible first
parameter, and a class method magically creates a local variable
named "cls" that binds to the invisible first parameter, and so
on.

Instead, we have a syntax where you, the programmer, write out the
name of the local variable that binds to the first parameter. This
means the first parameter is visible. Except, it is only visible
at the function definition -- when you have the instance and call
the instance or class method:

black_knight = K()
black_knight.meth1('a', 1)
black_knight.meth2(2)

the first parameters (black_knight, and black_knight.__class__,
respectively) are magic, and invisible.
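The invisible first arguments can be spelled out by calling through the class instead. The sketch below uses real Python with explicit self and cls, not the hypothetical syntax above; the method bodies are placeholders that just return their arguments.

```python
class K(object):
    def meth1(self, arg1, arg2):
        return (self, arg1, arg2)

    @classmethod
    def meth2(cls, arg):
        return (cls, arg)


black_knight = K()

implicit = black_knight.meth1("a", 1)     # instance supplied for us
explicit = K.meth1(black_knight, "a", 1)  # instance supplied by hand
print(implicit == explicit)               # True

# For a class method the hidden argument is the class of the instance.
print(black_knight.meth2(2) == (black_knight.__class__, 2))  # True
```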

Thus, Python is using the "explicit is better than implicit" rule
in the definition, but not at the call site. I have no problem with
this. Sometimes I think implicit is better than explicit. In this
case, there is no need to distinguish, at the calls to meth1() and
meth2(), as to whether they are "class" or "instance" methods. At
the *calls* they would just be distractions.

At the *definitions*, they are not as "distraction-y" since it is
important to know, during the definition, whether you are operating
on an instance (meth1) or the class itself (meth2), or for that
matter on neither (static methods). One could determine this from
the absence or presence of "@classmethod" or "@staticmethod", but
the minor redundancy in the "def" statement seems, well, minor.

Also, as a bonus, it lets you obfuscate the code by using a name
other than "self" or "cls". :)
 
Ian Kelly

It seems to me that if I add a function to the list of class attributes it will automatically wrap with "self" but adding it to the object directly will not wrap the function as a method. Can somebody explain why? I would have thought that any function added to an object would be a method (unless decorated as a class method).

Because things stored on the class are generally viewed as part of the
class definition, whereas things stored on an instance are generally
viewed as data -- a function stored on an object instance is usually
just meant to be a function. Consider the following code:

class Sorter(object):
    def __init__(self, keyfunc):
        self.keyfunc = keyfunc
    def sort(self, item_list):
        item_list.sort(key=self.keyfunc)

sorter = Sorter(lambda x: x.id)
sorter.sort(some_list_of_items)

If adding keyfunc as an attribute to the object wrapped it up as a
method, it would break, since the function is not expecting a "self"
argument.

More technically, because descriptors are only invoked when they're
stored on the class.
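The difference is easy to observe directly. In this sketch the names C, describe, on_class and on_instance are invented for illustration.

```python
def describe(self):
    return "describe called with %r" % (self,)


class C(object):
    pass


c = C()
C.on_class = describe     # found on the class: __get__ is invoked
c.on_instance = describe  # found in c.__dict__: returned untouched

print(c.on_class.__self__ is c)   # True: a bound method
print(c.on_instance is describe)  # True: still the plain function
print(c.on_instance("anything"))  # the argument must be passed by hand
```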

Hmm, or does the decoration just tell Python not to turn an object's function into a method? I.e. Is the decorator basically just the syntactic sugar for doing the above?

If you mean the staticmethod decorator, yes, it pretty much just wraps
the function as a "staticmethod" instance to prevent it from being
wrapped into an ordinary method when it's accessed.
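A small sketch of that behaviour, using a made-up Calc class and add function: the same function stored twice on a class, once bare and once wrapped in staticmethod.

```python
def add(a, b):
    return a + b


class Calc(object):
    plain = add                 # becomes a bound method via an instance
    static = staticmethod(add)  # wrapper whose __get__ hands back add as-is


calc = Calc()
print(calc.static(2, 3))      # 5
print(calc.static is add)     # True: no method wrapper was created

try:
    calc.plain(2, 3)          # really add(calc, 2, 3)
except TypeError as exc:
    print("TypeError:", exc)
```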

Cheers,
Ian
 
Terry Reedy

Above is 3.2 code. To be exactly equivalent with 2.x, you need

class C(object):

Sorry if I get some of the following terminology wrong, I get a bit
confused on Python terms. I hope the following is still coherent. (Is
there a dictionary of Python terminology?)

Given the above example I get this
TypeError: double() takes exactly 1 argument (2 given)

Right, because c.double() translates to C.double(c), and c.double(x)
translates to C.double(c, x), which is not valid.
6

It seems to me that if I add a function to the list of class
attributes it will automatically wrap with "self"

When accessed via an instance of the class, the instance is
automagically added as the first argument to be bound to the first
parameter. The name 'self' is a convention, not a requirement.

but adding it to
the object directly will not wrap the function as a method. Can
somebody explain why?

Someone else did. Not wrapping is normal, wrapping is a special case.
 
Steven D'Aprano

Chris said:
There are also static methods, which don't receive any special first
argument, plus any other sort of method you can invent, by creating
descriptors... but that's getting into fairly advanced territory. ...
[rest snipped]

I am not sure whether T. Goodchild was asking any of the above or
perhaps also one other possible question: if an instance method
is going to receive an automatic first "self" parameter, why require
the programmer to write that parameter in the "def"?

Er, yes, just like I suggested in my opening paragraph, and as I answered
following the bit you marked as snipped :)

For instance
we *could* have:

class K(object):
    def meth1(arg1, arg2):
        self.arg1 = arg1 # self is "magically available"
        self.arg2 = arg2

    @classmethod
    def meth2(arg):
        use(cls) # cls is "magically available"

and so on. This would work fine. It just requires a bit of implicit
sneakiness in the compiler: an instance method magically creates
a local variable named "self" that binds to the invisible first
parameter, and a class method magically creates a local variable
named "cls" that binds to the invisible first parameter, and so
on.

It would need more than "a bit", because methods are just wrappers around
functions. One way would be for Python to give that up, and require methods
to be special built-in types like functions. That adds complexity to the
compiler, and (very likely) would decrease the level of dynamism possible.

Another way would be for the compiler to perform darkest black magic to
determine whether the function was being called from inside a method or
not. That would be complicated and fragile.

[...]
At the *definitions*, they are not as "distraction-y" since it is
important to know, during the definition, whether you are operating
on an instance (meth1) or the class itself (meth2), or for that
matter on neither (static methods). One could determine this from
the absence or presence of "@classmethod" or "@staticmethod"

classmethod and staticmethod are functions, not declarations. You can't
assume that @classmethod is the only way to get a class method: the
metaclass could do it, or you could inject one in from the outside. You can
dynamically change the state of a method from instance method to class
method and back again at run-time.
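Here is a minimal sketch of that kind of run-time switching. The function who and the class K are invented for the example; the function simply returns whatever it receives as its first argument.

```python
def who(first):
    # Returns whatever Python passed as the first argument.
    return first


class K(object):
    pass


k = K()

K.who = who                # ordinary function: acts as an instance method
as_instance = k.who()

K.who = classmethod(who)   # now a class method
as_class = k.who()

K.who = staticmethod(who)  # now a static method: nothing is passed
as_static = k.who("explicit")

K.who = who                # and back to an instance method
back = k.who()

print(as_instance is k, as_class is K, as_static, back is k)
```

No declaration is involved at any point: each assignment just stores a different kind of object on the class.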

Python classes have a lot of dynamism made possible by the fact that methods
are just wrappers around functions with an explicitly declared "self". That
dynamism is rarely used, but not *that* rarely, and is very useful when
used. Implicit self would likely negate all that.
 
Chris Angelico

Python classes have a lot of dynamism made possible by the fact that methods
are just wrappers around functions with an explicitly declared "self". That
dynamism is rarely used, but not *that* rarely, and is very useful when
used. Implicit self would likely negate all that.

Hmm. Got any examples sitting around? I'm curious as to what you can
do with this. I'm like a kid with a new chemistry set - "what happens
if I mix a little of everything together?"...

ChrisA
 
Eric Snow

Hmm. Got any examples sitting around? I'm curious as to what you can
do with this. I'm like a kid with a new chemistry set - "what happens
if I mix a little of everything together?"...

First thing that comes to mind is calling a base class's
implementation of a method:

class X(Y):
    def __init__(self, value):
        Y.__init__(self)
        self.value = value

-eric
 
Chris Torek

Er, yes, just like I suggested in my opening paragraph, and as I answered
following the bit you marked as snipped :)

Oops, so you did (went back and re-read it). Must have gotten
interrupted and lost track. :)

[A different hack would] require a bit of implicit
sneakiness in the compiler: an instance method magically creates
a local variable named "self" that binds to the invisible first
parameter, and a class method magically creates a local variable
named "cls" that binds to the invisible first parameter, and so
on.

It would need more than "a bit", because methods are just wrappers
around functions.

Well, depends on how the hack would be done. :) For instance,
the @decorator might turn on something that "undoes" or "replaces"
the "self" parameter. That is, with ordinary class functions and
methods:

class HackyNotQuitePythonVersion:
    def ordinary(arg):
        self.arg = arg

would compile to (approximately):

class PythonVersion:
    def __mrap(self, *args, **kwargs):
        def ordinary(arg):
            self.arg = arg
        ordinary(*args, **kwargs)
    ordinary = __mrap

(add the usual other manipulations to suit here, i.e., all the
stuff for making introspection work right, e.g., @functools.wraps).
@staticmethod would suppress the wrapper entirely, while @classmethod
would change it to one that binds the "cls" argument. (Any function
without some appropriate @whatever gets the Method Wrapper __mrap.
@staticmethod tells the class builder not to add any wrapper, and
@classmethod tells it to add the Class Wrapper __crap. [The name
tells you what I think of the above code. :) ])

(Note subtle ground for bugs here: if you then actually define a
"self" parameter, it shadows the outer-scope one from the wrapper.
So while I am not even proposing that anyone should do this in the
first place, it has more downsides than mere implementation
complexity.)

Another way would be for the compiler to perform darkest black magic to
determine whether the function was being called from inside a method or
not. That would be complicated and fragile.

Yes, even worse than my outlined implementation above, I think.

classmethod and staticmethod are functions, not declarations.

They are decorator functions, but to someone *reading the code*
they are also "declarations" of sort. This is all I meant: they
tell the (human) reader/programmer which "secret arguments" to
expect.

You can't assume that @classmethod is the only way to get a
class method: the metaclass could do it, or you could inject
one in from the outside.

Yes, but that would all still work, as in this not-quite-Python
(worsened-Python) language, whoever writes those metaclasses and
other decorators would continue to do whatever icky stuff was
required (e.g., __mrap and __crap above). It would mean yet more
things for people to know about, but then, metaclasses and decorators
*always* mean that:

@hmm
def spam():
    return magic

Is "magic" something supplied by the decorator? You have to look
at the decorator to find out, as the rather horrid example I have
attached shows.

(Note: I am doing all this in Python 2.x on the laptop. Using
global, in @hmm, is evil, but it works. I did not bother trying
to write a metaclass that inserts __mrap, etc., but I believe it
can be done.)

Python classes have a lot of dynamism made possible by the fact that methods
are just wrappers around functions with an explicitly declared "self". That
dynamism is rarely used, but not *that* rarely, and is very useful when
used. Implicit self would likely negate all that.

I do not believe it would *negate* it, just *complicate* it. But
that is not a good thing either. :)

----- horrible example / test code below
import functools

def hmm(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Supply "magic" and "rlevel" as globals for the duration of the call.
        global magic, rlevel
        try:
            save = magic, rlevel
            restore = True
            rlevel += 1
        except NameError:
            restore = False
            rlevel = 1
        magic = func.__name__ + " and eggs"
        ret = func(*args, **kwargs)
        if restore:
            magic, rlevel = save
        else:
            del magic, rlevel
        return ret
    return wrapper

@hmm
def ham():
    if rlevel < 2:
        print(spam())
    return magic

@hmm
def spam():
    return magic

print(ham())
try:
    print(magic)
except NameError:
    print('name "magic" is not available here, as desired')
try:
    print(rlevel)
except NameError:
    print('name "rlevel" is not available here, as desired')

class X(object):
    def __mrap(self, *args, **kwargs):
        def xset(arg):
            self.arg = arg
        xset(*args, **kwargs)
    xset = __mrap
    def __mrap(self, *args, **kwargs):
        def show():
            print(self.arg)
        show(*args, **kwargs)
    show = __mrap

x = X()
x.xset('value')
x.show()
 
UncleLaz

I’m new to Python, and I love it.  The philosophy of the language (and
of the community as a whole) is beautiful to me.

But one of the things that bugs me is the requirement that all class
methods have 'self' as their first parameter.  On a gut level, to me
this seems to be at odds with Python’s dedication to simplicity.

For example, consider Python’s indent-sensitive syntax.  Although
other languages didn’t use indentation to specify scope, programmers
always used indentation anyways.  Making indentation took a common
practice, made it a rule, and the result was a significantly improved
signal-to-noise ratio in the readability of Python code.

So why is 'self' necessary on class methods?  It seems to me that the
most common practice is that class methods *almost always* operate on
the instance that called them.  It would make more sense to me if this
was assumed by default, and for "static" methods (methods that are
part of a class, but never associated with a specific instance) to be
labelled instead.

Just curious about the rationale behind this part of the language.

It's required to make a distinction between objects inside the class and
outside of it. Seems pretty logical to me.
 
Michiel Overtoom

When instance variables are accessed with the 'self.varname' syntax, it is clear to the programmer that an instance variable is accessed, and not some global. Other languages have weird syntax conventions like that you have to prepend all instance attributes with an '@', and in languages like C++ where there is not necessarily such a syntactic requirement, many programmers use ad-hoc constructs like '_varname' or 'm_varname' to make the distinction clear.


Yes, you have a point there. My personal preference would be to optimize for the most common case, while exceptions to the norm are still possible, but perhaps a bit more verbose.

Greetings
 
John Roth

I’m new to Python, and I love it.  The philosophy of the language (and
of the community as a whole) is beautiful to me.

But one of the things that bugs me is the requirement that all class
methods have 'self' as their first parameter.  On a gut level, to me
this seems to be at odds with Python’s dedication to simplicity.

For example, consider Python’s indent-sensitive syntax.  Although
other languages didn’t use indentation to specify scope, programmers
always used indentation anyways.  Making indentation took a common
practice, made it a rule, and the result was a significantly improved
signal-to-noise ratio in the readability of Python code.

So why is 'self' necessary on class methods?  It seems to me that the
most common practice is that class methods *almost always* operate on
the instance that called them.  It would make more sense to me if this
was assumed by default, and for "static" methods (methods that are
part of a class, but never associated with a specific instance) to be
labelled instead.

Just curious about the rationale behind this part of the language.

I personally consider this to be a wart. Some time ago I did an
implementation analysis. The gist is that, if self and cls were made
special variables that returned the current instance and class
respectively, then the compiler could determine whether a function was
an instance or class method. If it then marked the code object
appropriately you could get rid of all of the wrappers and the
attendant run-time overhead.

I've never published the analysis because that train has already left
the shed. The earliest it could be considered would be 4.0, which
isn't even on the horizon.

John Roth
 
