Inheritance and name clashes

Rock

Hi all :)

I've really been wondering about the following lately. The question is
this: if there are no (real) private or protected members in Python,
how can you be sure, when inheriting from another class, that you
won't wind up overriding, and possibly clobbering some important data
field of the parent class, which might compromise its entire
functionality?

I mean, never mind the double underscore business, I know all about it.
But, honestly, not everybody uses that, so you can't really be sure
about what you're doing, right? Maybe the author forgot to warn about
some special member in the docs for instance, or even worse, it's a
third-party library, perhaps with no source! So how can you be sure???
The way I see it ... you can't!

Am I wrong?

Please give me a hand on this one :)

Rock
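
To make the concern concrete, here is a minimal sketch of the kind of accidental clobbering being asked about. Counter and LabelledCounter are made-up names for illustration and do not come from any real library.

```python
class Counter:
    """Stands in for a third-party base class (hypothetical)."""

    def __init__(self):
        self.count = 0          # internal state the base class relies on

    def tick(self):
        self.count += 1
        return self.count


class LabelledCounter(Counter):
    """A subclass that unknowingly reuses the name `count`."""

    def __init__(self, label):
        super().__init__()
        self.label = label
        self.count = "items"    # meant as a unit label, overwrites the int


broken = LabelledCounter("apples")
try:
    broken.tick()
    outcome = "no error"
except TypeError:
    outcome = "TypeError"       # adding an int to a str is not allowed
print(outcome)
```

Nothing warns at the point of assignment; the failure only shows up later, when the parent's method runs against the overwritten attribute.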
 
Gary Herron

Hi all :)

I've really been wondering about the following lately. The question is
this: if there are no (real) private or protected members in Python,
how can you be sure, when inheriting from another class, that you
won't wind up overriding, and possibly clobbering some important data
field of the parent class, which might compromise its entire
functionality?

I mean, never mind the double underscore business, I know all about it.
But, honestly, not everybody uses that, so you can't really be sure
about what you're doing, right? Maybe the author forgot to warn about
some special member in the docs for instance, or even worse, it's a
third-party library, perhaps with no source! So how can you be sure???
The way I see it ... you can't!

Am I wrong?

No, you are correct.

But the philosophy of Python is "We're all consenting adults here".

I won't willingly allow someone else to control what I have access to.
And I assume that if I do step on something important, I'll figure it
out during testing, long before releasing any code.

Gary Herron
 
Steve Howell

Hi all :)

I've really been wondering about the following lately. The question is
this: if there are no (real) private or protected members in Python,
how can you be sure, when inheriting from another class, that you
won't wind up overriding, and possibly clobbering some important data
field of the parent class, which might compromise its entire
functionality?

I mean, never mind the double underscore business, I know all about it.
But, honestly, not everybody uses that, so you can't really be sure
about what you're doing, right? Maybe the author forgot to warn about
some special member in the docs for instance, or even worse, it's a
third-party library, perhaps with no source! So how can you be sure???
The way I see it ... you can't!

Am I wrong?

Please give me a hand on this one :)

The nice thing about Python (and other duck-typed languages) is that
it doesn't force you to use inheritance when you simply want two
classes to have the same API. If you have two classes with similar
APIs but different internal implementations, just code separate
classes.

Inheritance can be useful when you want class B to leverage the
internal details of class A. Obviously, if you choose this strategy,
you need to have control over the internal implementation of class A.

If you are designing a class that you intend to be extended through
inheritance, one way to prevent naive subclasses from poking into your
internals is to create useful APIs within the superclass to change the
behavior. But instead of focusing on making inheritance work at any
cost, the pitfalls of inheritance can often point you to better
designs. If your class needs to be subclassed in many different ways
to be useful, it's possible that it's not carrying its own weight, and
you just need to make the superclass more flexible. Or you can have
quite the opposite problem--you have two classes in an inheritance
structure that are trying to do too much, when a simpler design might
have three or four non-coupled classes that each do one job well.

Object-oriented designs are difficult to design in any programming
language, and it helps to have some sort of concrete problem to drive
the discussion. Are you working on a particular design where you
think Python's philosophy will inhibit good design? My take on Python
is that it focuses more on enabling good designs than preventing bad
designs. I prefer this to Java, for example, which I feel inhibits me
from expressiveness at a higher cost than any supposed benefits
private/protected would give me.
 
Rock

Object-oriented designs are difficult to design in any programming
language, and it helps to have some sort of concrete problem to drive
the discussion.  Are you working on a particular design where you
think Python's philosophy will inhibit good design?  My take on Python
is that it focuses more on enabling good designs than preventing bad
designs.  I prefer this to Java, for example, which I feel inhibits me
from expressiveness at a higher cost than any supposed benefits
private/protected would give me.

Thanks for the reply. No, I was just working with a normal library
class which was supposed to be derived. So that's what I did, but in
the process I found myself needing to create an instance variable and
it dawned on me: "how do I know I'm not clobbering something
here???" ... I'd have to look at the docs, right? But I still wasn't
sure ... so, then I thought "let's look at the source", and then I
found out. But! It took me some time to make sure, and I was puzzled
as well. I mean, what if I have no source to look at? What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

That was a big disappointment with Python for sure. Somehow PHP makes
me feel a little safer, in that respect at least.
 
Arnaud Delobelle

Rock said:
Thanks for the reply. No, I was just working with a normal library
class which was supposed to be derived. So that's what I did, but in
the process I found myself needing to create an instance variable and
it dawned on me: "how do I know I'm not clobbering something
here???" ... I'd have to look at the docs, right? But I still wasn't
sure ... so, then I thought "let's look at the source", and then I
found out. But! It took me some time to make sure, and I was puzzled
as well. I mean, what if I have no source to look at? What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

That was a big disappointment with Python for sure. Somehow PHP makes
me feel a little safer, in that respect at least.

I've been reading c.l.python for years (on and off) and I can't recall
anybody saying this has been a problem in practise. There are many
queries about how to create private/protected attributes, but not as a
result of having had a problem, rather because people new to Python but
experienced in another language (C++, Java...) are used to those
concepts being central, whereas in Python they are deliberately being
pushed aside.

A much more common problem with naming is something like this: I
create a module (or simply a script) and call it "foo.py". In another
file in the same directory I create a script that imports module "bar".
However, module "bar" itself imports a module called "foo" of which I am
unaware. But because "foo.py" is in the Python path, this gets loaded
instead, causing havoc.
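
One way to diagnose this kind of shadowing is to ask the import system where it would load a given name from. The sketch below uses importlib.util.find_spec from the standard library; the helper name where_is is made up for illustration.

```python
import importlib.util


def where_is(module_name):
    """Return the origin Python reports for a top-level module
    (usually a file path), or None if the module cannot be found."""
    spec = importlib.util.find_spec(module_name)
    return None if spec is None else spec.origin


# If this prints a path inside your own project directory rather than
# the standard library, a local file is shadowing the real module.
print(where_is("json"))
```

Running this from the directory that contains the suspect script shows whether a local "foo.py" would win over the module that "bar" expects.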
 
Arnaud Delobelle

Arnaud Delobelle said:
I've been reading c.l.python for years (on and off) and I can't recall
anybody saying this has been a problem in practise.

Arghh! Practice, I meant practice!
 
Steve Howell

Thanks for the reply. No, I was just working with a normal library
class which was supposed to be derived. So that's what I did, but in
the process I found myself needing to create an instance variable and
it dawned on me: "how do I know I'm not clobbering something
here???" ... I'd have to look at the docs, right? But I still wasn't
sure ... so, then I thought "let's look at the source", and then I
found out. But! It took me some time to make sure, and I was puzzled
as well. I mean, what if I have no source to look at? What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

That was a big disappointment with Python for sure. Somehow PHP makes
me feel a little safer, in that respect at least.

One workaround is to save all your state under a single object like a
dictionary. Or maybe you can avoid saving state altogether.
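
A minimal sketch of the single-object workaround, assuming a hypothetical library class Base whose internals are unknown. All names here are illustrative; the point is only that the subclass adds exactly one attribute to the instance.

```python
class Base:
    """Hypothetical library class with internals we cannot see."""

    def __init__(self):
        self.state = "ready"
        self.buffer = []


class MyExtension(Base):
    def __init__(self):
        super().__init__()
        # One attribute, with a name unlikely to collide, holds
        # everything the subclass needs to remember.
        self._myext = {"state": "custom", "buffer": [1, 2, 3]}


ext = MyExtension()
print(ext.state, ext._myext["state"])
```

This reduces the collision surface to a single name instead of one per field, though that one name can still clash in principle.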
 
Gregory Ewing

Rock said:
What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

You can always use dir() on an instance of the class to
find out what names it's using.
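
As a rough sketch of that check, the snippet below compares the names a subclass wants to use against what dir() reports for an instance. Base, names_in_use, and the attribute names are all made up for illustration.

```python
class Base:
    """Hypothetical library class."""

    def __init__(self):
        self.timeout = 30

    def run(self):
        return "running"


def names_in_use(obj):
    """Return the non-dunder names currently visible on obj."""
    return {n for n in dir(obj)
            if not (n.startswith("__") and n.endswith("__"))}


taken = names_in_use(Base())
wanted = {"timeout", "retries"}     # names the subclass would like to add
collisions = wanted & taken
print(sorted(collisions))
```

Any name that lands in collisions is already in use by the parent and should be renamed in the subclass.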
 
Steve Howell

You can always use dir() on an instance of the class to
find out what names it's using.

Indeed but the OP should be aware that dir() only reflects the current
state of the object.
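
A small illustration of that caveat, using a made-up class Lazy that only creates an attribute when one of its methods is first called:

```python
class Lazy:
    """Hypothetical class that creates an attribute only on first use."""

    def connect(self):
        self.handle = object()   # did not exist before this call


obj = Lazy()
before = "handle" in dir(obj)    # attribute not created yet
obj.connect()
after = "handle" in dir(obj)     # now it shows up
print(before, after)
```

An inspection done right after construction would miss handle entirely, so a clean dir() result is not proof that a name is free.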
 
Paul Rubin

Arnaud Delobelle said:
I've been reading c.l.python for years (on and off) and I can't recall
anybody saying this has been a problem in practise.

It has been a problem for me at least once. I blew a good chunk of a
day debugging a problem that turned out to be due to my clobbering something
in the socket module because of a name collision. It's not that
frequent an occurrence in the scheme of things, but IMO it would be
nicer if modules and classes could be explicit about what they imported
and exported.
 
Carl Banks

No, I was just working with a normal library
class which was supposed to be derived. So that's what I did, but in
the process I found myself needing to create an instance variable and
it dawned on me: "how do I know I'm not clobbering something
here???" ... I'd have to look at the docs, right? But I still wasn't
sure ... so, then I thought "let's look at the source", and then I
found out. But! It took me some time to make sure, and I was puzzled
as well. I mean, what if I have no source to look at? What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

That was a big disappointment with Python for sure. Somehow PHP makes
me feel a little safer, in that respect at least.

Name collisions are only one of several pitfalls that can happen when
you subclass a third-party class. Another pitfall is uncertainty over
which methods implement a certain behavior, which is something
you can't determine from the interface alone, in any language I'm
aware of. If you want to override such a behavior you have to look at
the class's implementation to find out how. Point is, if Python
corrected this "defect", you still wouldn't be "safe" because there
are other dangers, which exist in other languages, too.

Now, if a class is specifically designed to be subclassed by third-
party users, and it's not using name mangling or some other way to
avoid name collisions, then I would call it defective.


Carl Banks
 
MRAB

Indeed but the OP should be aware that dir() only reflects the current
state of the object.

Could something like pyflakes tell you what private attributes there
are, basically looking for self._foo or whatever?
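
For what such a scan might look like, here is a sketch built on the standard library's ast module. This is not how pyflakes works internally; it is a stand-in for the idea, the helper name self_attributes is made up, and it needs the source text, which is exactly what may be unavailable.

```python
import ast


def self_attributes(source):
    """Collect names assigned as self.<name> anywhere in the source text."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Attribute)
                and isinstance(node.ctx, ast.Store)
                and isinstance(node.value, ast.Name)
                and node.value.id == "self"):
            found.add(node.attr)
    return found


# Hypothetical library source used as input to the scan.
sample = '''
class Base:
    def __init__(self):
        self._cache = {}
        self.timeout = 30
    def reset(self):
        self._dirty = False
        setattr(self, "hidden", 1)   # invisible to a static scan
'''
print(sorted(self_attributes(sample)))
```

The setattr call in the sample shows the limit of the approach: attributes created dynamically never appear as a plain self.name assignment.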
 
Steve Howell

Name collisions are only one of several pitfalls that can happen when
you subclass a third-party class. Another pitfall is uncertainty over
which methods implement a certain behavior, which is something
you can't determine from the interface alone, in any language I'm
aware of.  If you want to override such a behavior you have to look at
the class's implementation to find out how.  Point is, if Python
corrected this "defect", you still wouldn't be "safe" because there
are other dangers, which exist in other languages, too.

Now, if a class is specifically designed to be subclassed by third-
party users, and it's not using name mangling or some other way to
avoid name collisions, then I would call it defective.

Yep, and a well-designed library can go a long way toward avoiding
name collisions simply by providing an API that facilitates logical
extensions. If a library knows when it needs to delegate to a caller,
it can have a callback mechanism. If a library knows how subclasses
are likely to change its state, it can provide an API that lowers the
temptation to muck with internals.
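
A minimal sketch of the callback approach, using an invented Downloader class. The on_chunk parameter and everything else here are hypothetical; the point is that the caller extends behavior without ever touching, or needing to know, the library's attribute names.

```python
class Downloader:
    """Hypothetical library class that exposes a hook instead of
    expecting users to subclass it."""

    def __init__(self, on_chunk=None):
        self._received = 0
        self._on_chunk = on_chunk

    def feed(self, chunk):
        self._received += len(chunk)
        if self._on_chunk is not None:
            # Hand the caller what it needs as arguments.
            self._on_chunk(chunk, self._received)


log = []
d = Downloader(on_chunk=lambda chunk, total: log.append(total))
d.feed(b"abc")
d.feed(b"de")
print(log)
```

Because the caller's state lives in its own objects (here the list log), there is no shared instance namespace in which names could collide.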
 
Steve Howell

Could something like pyflakes tell you what private attributes there
are, basically looking for self._foo or whatever?

Using pyflakes has the same problem as dir(). It can always be
thwarted by the dynamic nature of the language.

I think it's more worthwhile to educate Python newbies on how to
choose libraries. Well-written libraries can often have tremendous
reuse without the need for subclassing in the first place, by allowing
for callbacks or rich configuration. Libraries that choose
subclassing as their primary extension mechanism should be selected by
how well they hide their own internal state and by how well they
provide a mechanism to store state in the subclasses. There are lots
of techniques for library writers to employ here--naming conventions,
delegation, readable documentation, APIs for the subclass to call back
to, etc.
 
Steven D'Aprano

Thanks for the reply. No, I was just working with a normal library class
which was supposed to be derived. So that's what I did, but in the
process I found myself needing to create an instance variable and it
dawned on me: "how do I know I'm not clobbering something here???"

Because then all your tests will start failing.

What's that you say? You have no tests? Then how do you know your program
does what you think it does?



[...]
That was a big disappointment with Python for sure. Somehow PHP makes me
feel a little safer, in that respect at least.

Oh my, that's hilarious! Nice one! PHP feels safer, *grins like a loon*
 
Chris Torek

I've really been wondering about the following lately. The question is
this: if there are no (real) private or protected members in Python,
how can you be sure, when inheriting from another class, that you
won't wind up overriding, and possibly clobbering some important data
field of the parent class, which might compromise its entire
functionality?

You are right, but this is a double-edged feature/bug as it allows
you to override stuff deliberately, including things the original
code-writer did not think you should be able to override. You
cannot get this particular benefit without a corresponding cost.

For instance (note, this may have been changed in a later version,
I am using rather old code for this particular project) I ran into
an issue with the SimpleXMLRPCServer code (really BaseHTTPServer),
which I fixed with the pipewrap() code below. I also needed a
number of extra features; see comments. To do this I had to make
use of a number of things that were not officially exported, nor
really designed to be override-able.

(I've snipped out some [I think] irrelevant code below, but left
enough to illustrate all this.)

--------

import socket, SocketServer, SimpleXMLRPCServer, xmlrpclib
import errno

def format_ipaddr(addr, do_reverse_lookup = True):
    (host, port) = addr[:2]
    fqdn = socket.getfqdn(host) if do_reverse_lookup else ''
    return '%s[%s]' % (fqdn, host)

# For exceptions to be passed back to rpc caller (other exceptions
# are caught in the MgrServer and logged), use MgrError (or xmlrpclib.Fault).
class MgrError:
    def __init__(self, exc_type = None, exc_val = None):
        self.exc_type = exc_type
        self.exc_val = exc_val
        if exc_type is None or exc_val is None:
            i = sys.exc_info()[:2]
            if exc_type is None:
                self.exc_type = i[0]
            if exc_val is None:
                self.exc_val = i[1]


# This gives us an opportunity to fix the "broken pipe" error that
# occurs when a client disconnects in mid-RPC.
# XXX I think this should really be done in BaseHTTPServer or similar.
# See also <http://trac.edgewall.org/ticket/1183>.
#
# However, we'd still want to override do_POST so that we can
# sneak the client address to the _dispatch function in self.server.
class pipe_eating_rqh(SimpleXMLRPCServer.SimpleXMLRPCRequestHandler):
    def pipewrap(self, f):
        try:
            f(self)
        except socket.error, (code, msg):
            if (code == errno.EPIPE or code == errno.ECONNRESET or
                    code == 10053): # 10053 for Windows
                # self.log_message('Lost connection to client: %s',
                # self.address_string())
                logger.info('Lost connection to client: %s',
                    format_ipaddr(self.client_address))
            else:
                raise

    def do_POST(self):
        self.server.client_address = self.client_address
        return self.pipewrap(
            SimpleXMLRPCServer.SimpleXMLRPCRequestHandler.do_POST)
    def report_404(self):
        return self.pipewrap(
            SimpleXMLRPCServer.SimpleXMLRPCRequestHandler.report_404)

class MgrServer(SocketServer.ThreadingMixIn,
                SimpleXMLRPCServer.SimpleXMLRPCServer):
    """
    The "Manager Server" adds a few things over a basic XML-RPC server:

    - Uses the threading mix-in to run each rpc request in a
      separate thread.
    - Runs an (optional) periodic handler.
    - Logs all requests with the logger.
    - Handles "admin" requests specially.
    - Logs "unexpected" exceptions, so that we catch server bugs
      in the server log.
    """

    def __init__(self, addr, periodic = None, *args, **kwargs):
        SimpleXMLRPCServer.SimpleXMLRPCServer.__init__(self, addr,
            requestHandler = pipe_eating_rqh,
            *args, **kwargs)

        # Note, can't just change self.funcs[] into a dict of
        # tuples without overriding system_methodHelp() too,
        # so we'll use a separate parallel dictionary.
        self.admin_label = {}

        self.periodic = periodic
        if periodic:
            self.socket.settimeout(periodic[0])
        # see __nonzero__ below
        self.register_function(self.__nonzero__)

    def get_request(self):
        while True:
            try:
                result = self.socket.accept()
            except socket.timeout:
                self.periodic[1](*self.periodic[2])
            else:
                return result
        # not reached

    def _dispatch(self, method, params):
        # Taken from SimpleXMLRPCServer.py but then stripped down and
        # modified.
        if method in self.admin_label:
            ... stuff snipped out here ...
        try:
            func = self.funcs[method]
        except KeyError:
            # regular SimpleXMLRPCServer checks for self.instance
            # and if so, for its _dispatch here ... we're not using
            # that so I omit it
            func = None
        if func is not None:
            logger.debug('%s: %s%s',
                format_ipaddr(self.client_address), method, str(params))
            try:
                return func(*params)
            except MgrError, e:
                # Given, e.g., MgrError(ValueError('bad value'))),
                # send the corresponding exc_type / exc_val back
                # via xmlrpclib, which transforms it into a Fault.
                raise e.exc_type, e.exc_val
            except xmlrpclib.Fault:
                # Already a Fault, pass it back unchanged.
                raise
            except TypeError, e:
                # If the parameter count did not match, we will get
                # a TypeError with the traceback ending with our own
                # call at "func(*params)". We want to pass that back,
                # rather than logging it.
                #
                # If the TypeError happened inside func() or one of
                # its sub-functions, the traceback will continue beyond
                # here, i.e., its tb_next will not be None.
                if sys.exc_info()[2].tb_next is None:
                    raise
                # else fall through to error-logging code
            except:
                pass # fall through to error-logging code

            # Any other exception is assumed to be a bug in the server.
            # Log a traceback for server debugging.
            # is logger.error exc_info thread-safe? let's assume so
            logger.error('internal failure in %s', method, exc_info = True)
            # traceback.format_exc().rstrip()
            raise xmlrpclib.Fault(2000, 'internal failure in ' + method)
        else:
            logger.info('%s: bad request: %s%s',
                format_ipaddr(self.client_address), method, str(params))
            raise Exception('method "%s" is not supported' % method)

    # Tests of the form:
    # c = new_class_object(params)
    # if c: ...
    # are turned into calls to the class's __nonzero__ method.
    # We don't do "if server:" in our own server code, but if we did
    # this would get called, and it's reasonable to just define it as
    # True. Probably the existing SimpleXMLRPCServer (or one of its
    # base classes) should have done this, but they did not.
    #
    # For whatever reason, the xml-rpc library routines also pass
    # a client's __nonzero__ (on his server proxy connection) to us,
    # which reaches our dispatcher above. By registering this in
    # our __init__, clients can do "if server:" to see if their
    # connection is up. It's a frill, I admit....
    def __nonzero__(self):
        return True

    def register_admin_function(self, f, name = None):
        ... more stuff snipped out ...

# --END-- threading XML RPC server code
 
Dennis Lee Bieber

Thanks for the reply. No, I was just working with a normal library
class which was supposed to be derived. So that's what I did, but in
the process I found myself needing to create an instance variable and
it dawned on me: "how do I know I'm not clobbering something
here???" ... I'd have to look at the docs, right? But I still wasn't
sure ... so, then I thought "let's look at the source", and then I
found out. But! It took me some time to make sure, and I was puzzled
as well. I mean, what if I have no source to look at? What if the
library I'm using doesn't release the source, or what if I just can't
get my hands on it for some reason or another?

Well... then use the double __ prefix so your instance variables
have a class specific name... That IS what the __ are designed to do --
ensure that the name, as used /in/ that class itself is unique, and not
referencing something from some unknown parent class.
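
A short sketch of how the mangling keeps two same-named attributes apart. Base and Child are made-up classes for illustration.

```python
class Base:
    def __init__(self):
        self.__token = "base"      # stored as _Base__token

    def base_token(self):
        return self.__token


class Child(Base):
    def __init__(self):
        super().__init__()
        self.__token = "child"     # stored as _Child__token

    def child_token(self):
        return self.__token


c = Child()
print(c.base_token(), c.child_token())
print(sorted(vars(c)))
```

Each class sees its own value because the compiler rewrites __token inside a class body to include that class's name.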
 
Steven D'Aprano

Well... then use the double __ prefix so your instance variables
have a class specific name... That IS what the __ are designed to do --
ensure that the name, as used /in/ that class itself is unique, and not
referencing something from some unknown parent class.


Unfortunately name mangling doesn't *quite* do that. It is still possible
to have name clashes.

Imagine you inherit from a class Ham:


from luncheon_meats import Ham

class Spam(Ham):
    def __init__(self):
        self.__x = "yummy"


You think you're safe, because the attribute __x is mangled to _Spam__x.
But little do you know, Ham itself inherits from another class Meat,
which inherits from Food, which inherits from FoodLikeProducts, which
inherits from... Spam. Which also has an __x attribute.

You now have a name clash.
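
The clash can be reproduced in a single file under one assumption: the ancestor and the new subclass are two distinct classes that merely share the name Spam (for instance, because they were defined in different modules). The sketch below simulates that by rebinding the name; LibrarySpam and library_x are hypothetical names added for the demonstration.

```python
class Spam:                          # imagine this lives deep in a library
    def __init__(self):
        self.__x = "library value"   # mangled to _Spam__x

    def library_x(self):
        return self.__x


class Ham(Spam):
    pass


LibrarySpam = Spam                   # keep a reference to the original class


class Spam(Ham):                     # our own class, same name by accident
    def __init__(self):
        super().__init__()
        self.__x = "yummy"           # also mangled to _Spam__x


s = Spam()
print(s.library_x())
```

Mangling only uses the bare class name, not the module or the class identity, so both assignments land on the same _Spam__x attribute and the library's value is overwritten.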
 
Dennis Lee Bieber

from luncheon_meats import Ham

class Spam(Ham):
    def __init__(self):
        self.__x = "yummy"


You think you're safe, because the attribute __x is mangled to _Spam__x.
But little do you know, Ham itself inherits from another class Meat,
which inherits from Food, which inherits from FoodLikeProducts, which
inherits from... Spam. Which also has an __x attribute.

You now have a name clash.

And you've just defined a circular inheritance tree -- I wouldn't
expect ANY language to handle such... Presuming it doesn't choke on the
parsing. Or a very confusing set of imports where you are
re-using/re-defining a class name itself.
 
