Why less emphasis on private data?

Paul Boddie

Paul said:
class A:
    __x = 3

class B(A):
    __x = 4 # ok

class C(B):
    __x = 5 # oops!

Consider that the above three class definitions might be in separate
files and you see how clumsy this gets.

What are you trying to show with the above? The principal benefit of
using private attributes set on either the class or the instance is to
preserve access, via self, to those attributes defined in association
with (or within) a particular class in the inheritance hierarchy, as
opposed to providing access to the "most overriding" definition of an
attribute. This is demonstrated more effectively with a method on class
A:

class A:
    __x = 3
    def f(self):
        print self.__x # should always refer to A.__x

class B(A):
    __x = 4

class C(B):
    __x = 5

Here, instances of A, B and C will always print the value of A.__x when
the f method is invoked on them. Were a non-private attribute to be
used instead, instances of A, B and C would print the overridden value
of the attribute when the f method is invoked on them.
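
A minimal sketch of that contrast, written for Python 3 (so print is not used as a statement). The non-private attribute y and the method g are additions for illustration only and are not part of the example above:

```python
class A:
    __x = 3              # stored on the class as _A__x
    y = 3                # ordinary attribute, can be overridden

    def f(self):
        return self.__x  # compiled as self._A__x, so always A's value

    def g(self):
        return self.y    # normal lookup, the most-derived definition wins


class B(A):
    __x = 4              # stored as _B__x, leaves _A__x alone
    y = 4


class C(B):
    __x = 5              # stored as _C__x
    y = 5


private_results = [obj.f() for obj in (A(), B(), C())]
public_results = [obj.g() for obj in (A(), B(), C())]
print(private_results)   # [3, 3, 3]
print(public_results)    # [3, 4, 5]
```

Each class keeps its own mangled copy of the attribute, which is why f never sees the values set in B or C.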

Paul
 
Thomas Ploch

sturlamolden said:
The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.

C does have something like this concept, in the form of 'static' declarations.
As mentioned in other replies, it is not rocket science to access a
class's private data. In C++ you can cast to void*, in Java and C# you
can use reflection. C++ is said to be an "unsafe" language because
programmers can, using a few tricks, mess with the vtables. But how
many really do that?

Exactly, if they were available, a lot more would do that. I think this
is the point. Programmers who can do that are normally sensible enough
to assume that the people who designed this or that knew what they were
doing. But there are enough people who don't have a clue and _will_
fiddle around and then flame all kinds of mailing lists with requests
for help because they did it wrong.
 
Bruno Desthuilliers

Andrea Griffini said:
My experience is different: I never suffered much from
leaking or dangling pointers in C++ programs; on
the contrary, I didn't expect that fighting object
leaks in complex Python applications would be that difficult
(I've heard of Zope applications that just gave up and
resorted to the "reboot every now and then" solution).
Zope is a special case here, since it relies on an object database...
 
Bruno Desthuilliers

Paul Rubin said:
If you want to write bug-free code, pessimism is the name of the game.

Not to pretend my own code is always totally bug-free, but I found that,
with languages like Python, I usually got better results starting with
the simplest possible implementation, and only then adding some
'defensive' boilerplate where it makes sense (that is, mostly resource
acquisition/release) - an approach that I would certainly not advocate
when it comes to coding in C...
 
Sebastian 'lunar' Wiesner

Thomas Ploch said:
sturlamolden said:
The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think
that if an object's 'internal' variables or states cannot be kept
private, programmers get an irresistible temptation to mess with them
in malicious ways. But if you are that stupid, should you be
programming in any language? The most widely used language is still
C, and there is no concept of private data in C either, nor is it
needed.

C does have something like this concept, in the form of 'static' declarations.
As mentioned in other replies, it is not rocket science to access a
class's private data. In C++ you can cast to void*, in Java and C# you
can use reflection. C++ is said to be an "unsafe" language because
programmers can, using a few tricks, mess with the vtables. But how
many really do that?

Exactly, if they were available, a lot more would do that. I think
this is the point. Programmers who can do that are normally sensible
enough to assume that the people who designed this or that knew what
they were doing. But there are enough people who don't have a clue and
_will_ fiddle around and then flame all kinds of mailing lists with
requests for help because they did it wrong.

Those people deserve to fail for being just extraordinarily stupid...
 
Thomas Ploch

Sebastian said:
Those people deserve to fail for being just extraordinarily stupid...

Yes, but there are a lot of them around...

Thomas


P.S.: I don't mean they are around here. :)
 
Dennis Lee Bieber

Trouble with this is you can have two classes with the same name,
perhaps because they were defined in different modules, and then the
name mangling fails to tell them apart.

If I encounter something like

import mod1
import mod2
class A(mod1.B, mod2.B):
    ....

I'd be quite concerned about the design environment rather than the
immediate code... Probably need something ugly like...

from mod1 import B as B1
from mod2 import B as B2
class A(B1, B2):
    ....

{note: I've not pulled up an interpreter to test the above}

--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
HTTP://www.bestiaria.com/
 
John Nagle

sturlamolden said:
The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways.

If you're not clear on encapsulation issues, you probably haven't
done extensive maintenance programming on code written by others.
Finding out who can mess with a variable when debugging the code of
others is not fun.

Because Python doesn't have explicit declarations, scope of variables is
a touchy issue. If you write "x = 1" within a function, that will
create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
previously created in the global context. But at least global variables
are local to the namespace; we don't have clashes across files. So
it's not too bad. JavaScript has the convention that newly created
variables are global by default. Big mistake.

The underscore thing makes sense. Single underscore
variables are "protected" in the C++ sense, and double underscore
variables are "private", not visible from inherited classes.
It's hard to misuse such variables by accident. I'd be tempted
to prohibit access to underscore variables other than via "self._x"
forms, so they'd be inaccessible outside the object. It's undesirable
from a maintenance standpoint to have an unenforced convention like
a lead underscore. The maintenance programmer can't trust its meaning.

As Python grows up, and larger systems are written in it, these
issues become more important.

John Nagle
Animats
 
Andrea Griffini

Bruno said:
Zope is a special case here, since it relies on an object database...

Just to clarify my post... I found out, by being punched in the
nose myself, what it means to have a complex Python
application that suffers from object leaking; it's not
something I only read about in Zope programs.

But why would Zope applications be a special case?

Andrea
 
Gabriel Genellina

Because Python doesn't have explicit declarations, scope of variables is
a touchy issue. If you write "x = 1" within a function, that will
create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
previously created in the global context. But at least global variables
are local to the namespace; we don't have clashes across files.
No, `x=1` always uses a local variable x, unless an (explicit!) global
statement was in effect in the same block. This, and the explicit self,
make very clear which x you are referring to.
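
A small sketch of that rule, assuming Python 3 and run at module level; the function names are made up for illustration:

```python
x = 0  # module-level (global) name


def rebinds_local():
    x = 1        # always creates a local x; the global x is untouched
    return x


def rebinds_global():
    global x     # explicit opt-in to rebinding the module-level name
    x = 2
    return x


local_result = rebinds_local()
after_local = x          # still 0
global_result = rebinds_global()
after_global = x         # now 2
print(local_result, after_local, global_result, after_global)
```

Whether a global x already exists makes no difference: without the global statement, the assignment inside the function binds a local name.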
 
Bruno Desthuilliers

Thomas Ploch said:
sturlamolden said: (snip)

Exactly, if they were available, a lot more would do that.

Do you have any concrete evidence ? FWIW, I've seen a *lot* of Python
code, and very very few uses of _implementation stuff - most of them
being legitimate.
I think this
is the point. Programmers who can do that are normally sensible enough
to assume that the people who designed this or that knew what they were
doing. But there are enough people who don't have a clue and _will_
fiddle around and then flame all kinds of mailing lists with requests
for help because they did it wrong.

The fact is that there's no cure for stupidity. If you want a language
explicitly designed to "protect" dummies from themselves, you know where
to find it. Why should normally intelligent people have to suffer from
this ? Are you going to forbid hammers because dummies could smash their
fingers then complain ?
 
Bruno Desthuilliers

Andrea Griffini said:
Just to clarify my post... I found out, by being punched in the
nose myself, what it means to have a complex Python
application that suffers from object leaking; it's not
something I only read about in Zope programs.

But why would Zope applications be a special case?

1/ because of how Zope and the ZODB work
2/ because Zope is an unusually complex Python application.

FWIW, I've never had any memory problem with other Python applications
and/or frameworks I've used so far (ie: in the past seven years) - most
of them being somewhat 'simpler' than if they had been implemented in C
or C++...
 
Bruno Desthuilliers

John Nagle said:
If you're not clear on encapsulation issues,

encapsulation != data hiding
you probably haven't
done extensive maintenance programming on code written by others.

I did.
Finding out who can mess with a variable when debugging the code of
others is not fun.

# before
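class Toto(object):
    def __init__(self, x):
        self._x = x

# after
class Toto(object):
    def __init__(self, x):
        self._x = x # now goes through the property setter below

    @apply # Python 2 builtin: calls _x() and binds the returned property
    def _x():
        def fget(self):
            return self._real_x
        def fset(self, value):
            import pdb; pdb.set_trace() # break on every write to _x
            self._real_x = value
        return property(**locals())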

This is of course a braindead implementation - a better one would use
either the inspect module or the sys._getframe() hack to retrieve useful
debug info (left as an exercise to the reader...)
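
One possible take on that exercise, as a rough sketch for Python 3 rather than the author's own solution. It uses inspect.currentframe() to find the caller instead of dropping into pdb; the writes list and the suspicious_function helper are hypothetical names added for the illustration:

```python
import inspect


class Toto:
    def __init__(self, x):
        self.writes = []       # hypothetical log of (function, line, value)
        self._real_x = x       # set the backing field directly, no logging

    @property
    def _x(self):
        return self._real_x

    @_x.setter
    def _x(self, value):
        # f_back is the frame of whoever executed the assignment to _x
        caller = inspect.currentframe().f_back
        self.writes.append((caller.f_code.co_name, caller.f_lineno, value))
        self._real_x = value


def suspicious_function(obj):
    obj._x = 99                # this write gets recorded


t = Toto(1)
suspicious_function(t)
print(t.writes)
```

The log then names the function that touched the attribute, which is the information the maintenance programmer was after.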

Because Python doesn't have explicit declarations, scope of
variables is
a touchy issue.
???

If you write "x = 1" within a function, that will
create a local "x" if "x" doesn't exist, or alter a global "x" if "x" was
previously created in the global context.

Err... May I suggest you read these two pages:
http://docs.python.org/ref/assignment.html
http://docs.python.org/ref/global.html#l2h-563
But at least global variables
are local to the namespace; we don't have clashes across files. So
it's not too bad. JavaScript has the convention that newly created
variables are global by default.

Unless preceded by the 'var' keyword...
Big mistake.

Mmm... which one ?
The underscore thing makes sense. Single underscore
variables are "protected" in the C++ sense, and double underscore
variables are "private", not visible from inherited classes.
It's hard to misuse such variables by accident. I'd be tempted
to prohibit access to underscore variables other than via "self._x"
forms, so they'd be inaccessible outside the object.

# foo.py
class Foo(object):
    def __init__(self, x):
        self._x = x
    def __repr__(self):
        return "<Foo %s>" % self._x

# bar.py
def bar(self):
    self.y = self._x

# baaz.py
from foo import Foo
from bar import bar
Foo.bar = bar

f = Foo([42])
f.bar()
f.y.append('gotcha')
print f

It's undesirable
from a maintenance standpoint to have an unenforced convention

If it's a convention, it doesn't have to be enforced. If it's enforced,
it's not a convention anymore.

While we're at it, I've found it very valuable to be able to mess with
implementation when doing maintenance on somewhat large (and somewhat
messy) Python systems...
like
a lead underscore. The maintenance programmer can't trust its meaning.
As Python grows up, and larger systems are written in it, these
issues become more important.

If you go that way, then you'll also want to introduce declarative
static typing and remove all possibility to dynamically modify classes
or add/replace attributes and/or methods on a per-instance basis. If you
want Java, you know where to find it.
 
Steven D'Aprano

If you want to write bug-free code, pessimism is the name of the game.

I wonder whether Paul uses snow chains all year round, even in the blazing
summer? After all, "if you want to drive safely, pessimism is the name of
the game".

In the last couple of weeks comp.lang.python has had (at least) two
practical examples of the pros and cons of private attributes.

The pro: there was discussion about replacing the optparse module's
implementation with argparse, leaving the interface the same. This was
complicated by the fact that optparse exposes its internal variables,
making the job of duplicating the interface significantly harder. However
this was surely a design choice, not an accident. Having private
attributes won't save you if you choose not to make your attributes
private.

The con: there was a fellow who (for some reason) actually needed to
access a class' private attributes. To the best of my knowledge, he was
over 18 and, while new to Python, an experienced programmer, so I believe
him when he said he had eliminated all other alternatives. (And if he
were wrong, if he was incompetent -- Not My Problem. It isn't for me to
take a hammer off him so he doesn't hit his thumb with it.) In his case,
Python's name mangling of private attributes was an inconvenience, not a
help.
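
For readers who have not seen the workaround, here is a minimal Python 3 sketch of what reaching a name-mangled attribute from outside looks like. The Widget class and its __secret attribute are invented for illustration and are not the code from that earlier discussion:

```python
class Widget:
    def __init__(self):
        self.__secret = 42     # stored on the instance as _Widget__secret


w = Widget()

try:
    w.__secret                 # no mangling outside a class body
    direct_access_worked = True
except AttributeError:
    direct_access_worked = False

mangled_value = w._Widget__secret   # the mangled name is still reachable
print(direct_access_worked, mangled_value)
```

The attribute is obscured rather than protected: anyone who knows the class name can spell out the mangled form.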

Compared to all the "what-ifs" and "maybes" and hypotheticals in this
thread, there were two practical cases that revolved around private
variables. In one, we see that they aren't a panacea: data hiding doesn't
help when the data isn't hidden. In the other, we see that one developer's
private attribute is just what another developer needs to solve a problem.
 
Paul Rubin

Paul Boddie said:
What are you trying to show with the above? The principal benefit of
using private attributes set on either the class or the instance is to
preserve access, via self, to those attributes defined in association
with (or within) a particular class in the inheritance hierarchy, as
opposed to providing access to the "most overriding" definition of an
attribute. This is demonstrated more effectively with a method on class A:

Right, the problem is if those methods start changing the "private"
variable. I should have been more explicit about that.

class A:
    def __init__(self):
        self.__x = 3
    def foo(self):
        return self.__x

class B(A): pass

class A(B):
    def bar(self):
        self.__x = 5 # clobbers private variable of earlier class named A
 
Paul Rubin

Steven D'Aprano said:
I wonder whether Paul uses snow chains all year round, even in the blazing
summer? After all, "if you want to drive safely, pessimism is the name of
the game".

No. I'm willing to accept a 10**-5 chance of hitting a freak
snowstorm in summer, since I drive in summer at most a few hundred
times a year, so it will take me 100's of years before I'm likely to
encounter such a storm. There are millions of drivers, so if they all
take a similar chance, then a few times a year we'll see in the paper
that someone got caught in a storm, which is ok. Usually there's no
real consequence beyond some inconvenience of waiting for a tow truck.

Tow truck or ambulance operators, on the other hand, should keep
chains available all year around, since they have to service the needs
of millions of users, have to be ready for freak summer storms.

As a software developer wanting to deploy code on a wide scale, I'm
more like a tow truck operator than an individual car driver.
Alternatively, as a coder I "drive" a lot more often. If some Python
misfeature introduces a bug with probability 10**-5 per line of code,
then a 100 KLoc program is likely to have such a bug somewhere. It
doesn't take 100's of years.
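
The arithmetic behind both halves of that comparison can be checked quickly. This is only a sketch: it assumes each drive and each line of code is an independent trial, and the figure of 300 drives per year is an assumed stand-in for "a few hundred":

```python
p = 1e-5                  # chance of trouble per drive, or per line of code

# Driver: assumed 300 summer drives per year
drives_per_year = 300
years_until_expected_storm = 1 / (p * drives_per_year)

# Coder: a 100 KLoc program
lines = 100_000
expected_bugs = p * lines
prob_at_least_one_bug = 1 - (1 - p) ** lines

print(round(years_until_expected_storm))      # about 333 years
print(expected_bugs)                          # about 1 bug expected
print(round(prob_at_least_one_bug, 3))        # 0.632
```

Under those assumptions the program has roughly a 63% chance of containing at least one such bug, while the driver waits centuries for one storm.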
 
Paul Rubin

Dennis Lee Bieber said:
I'd be quite concerned about the design environment rather than the
immediate code... Probably need something ugly like...

from mod1 import B as B1
from mod2 import B as B2
class A(B1, B2):
    ....

Interesting. I just tried that. mod1.py contains:

class B:
    def foo(self): self.__x = 'mod1'

mod2.py contains:

class B:
    def bar(self): self.__x = 'mod2'

And the test is:

from mod1 import B as B1
from mod2 import B as B2

class A(B1, B2): pass

a = A()
a.foo()
print a._B__x
a.bar()
print a._B__x

Sure enough, mod2 messes up mod1's private variable.
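
The same collision can be reproduced in a single file, which makes it easier to try out. This is a Python 3 sketch in which two factory functions stand in for mod1 and mod2; the factory names are made up, and the class bodies match the ones above:

```python
def make_mod1_B():
    class B:                       # plays the role of mod1.B
        def foo(self):
            self.__x = 'mod1'      # mangled to _B__x
    return B


def make_mod2_B():
    class B:                       # plays the role of mod2.B
        def bar(self):
            self.__x = 'mod2'      # also mangled to _B__x
    return B


B1 = make_mod1_B()
B2 = make_mod2_B()


class A(B1, B2):
    pass


a = A()
a.foo()
after_foo = a._B__x
a.bar()
after_bar = a._B__x
print(after_foo, after_bar)        # mod1 mod2
```

Mangling uses only the bare class name, not the module, so both classes write to the same instance attribute.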
 
Steven D'Aprano

No. I'm willing to accept a 10**-5 chance of hitting a freak
snowstorm in summer, since I drive in summer at most a few hundred
times a year, so it will take me 100's of years before I'm likely to
encounter such a storm. There are millions of drivers, so if they all
take a similar chance, then a few times a year we'll see in the paper
that someone got caught in a storm, which is ok. Usually there's no
real consequence beyond some inconvenience of waiting for a tow truck.

Tow truck or ambulance operators, on the other hand, should keep
chains available all year around, since they have to service the needs
of millions of users, have to be ready for freak summer storms.

As a software developer wanting to deploy code on a wide scale, I'm
more like a tow truck operator than an individual car driver.
Alternatively, as a coder I "drive" a lot more often. If some Python
misfeature introduces a bug with probability 10**-5 per line of code,
then a 100 KLoc program is likely to have such a bug somewhere. It
doesn't take 100's of years.

That's an irrelevant argument. We're not talking about random bugs in
random places of code, we're talking about one specific type of bug which
can only occur in a very restricted set of circumstances, e.g.
you have to inherit from two classes which not only have exactly the same
name but also use the same private attribute name.

Your argument is that Python's strategy for dealing with private
attributes is insufficiently pessimistic, because it doesn't deal with
those circumstances. Fine. I agree. Python isn't pessimistic. Does it need
to be? Just how often do you inherit from two identically-named classes
both of which use identically-named private attributes?

You suggested that coders (and by extension, Python) should behave with
equal pessimism whether they are subclassing two identically-named
classes or not. That's equivalent to the argument that one should use snow
chains whether it is snowing or not -- it only considers the benefit of
the extra protection, without considering the costs.

Python's private attribute handling balances convenience and protection,
giving more weight to convenience, trading off some protection. And
convenience gives increased productivity, easier debugging, fewer bugs
overall, and other Good Things. It would probably change the character of
Python unacceptably much to push that balance the other way.

Don't get me wrong, it is a good thing for you to alert people to the
circumstances in which Python's strategy breaks down, so that they can "use
snow chains" in those circumstances. And, hey, if somebody reads this
thread and is motivated to find a better strategy that doesn't change the
nature of the language by too much, great. (This happened once before:
multiple inheritance was broken in classic classes, and new-style classes
were added partly to fix that.)

But chances are, the majority of Pythonistas will think that having to
use snow chains once in a very great while is an acceptable trade-off to
the smooth ride Python gives the rest of the time.
 
hg

sturlamolden said:
The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.


void test(void)
{
    static int i;
}


Do you agree that i is "private" to test ?

hg
 
Paul Rubin

Steven D'Aprano said:
Just how often do you inherit from two identically-named classes
both of which use identically-named private attributes?

I have no idea how often if ever. I inherit from library classes all
the time, without trying to examine what superclasses they use. If my
subclass happens to have the same name as a superclass of some library
class (say Tkinter) this could happen. Whether it ever DOES happen, I
don't know, I could only find out by examining the implementation
details of every library class I ever use, and I could only prevent it
by remembering those details. That is an abstraction leak and is
dangerous and unnecessary. The name mangling scheme is a crock. How
often does anyone ever have a good reason for using it, except maybe
in something like a debugger that can just as easily reach inside the
actual class descriptors and get all the variables out?
 