Why less emphasis on private data?


time.swift

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking.

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Thanks for any insight.
 

Thomas Ploch

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..
It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Thanks any insight.

Python doesn't prefer public data in classes. It leaves the choice to
the programmer. You can define your own private instance variables (or
functions) by using a '__' prefix:

Example:

class Foo:
    def __init__(self, data):
        self.__data = data

    def get_data(self):
        return self.__data

>>> f = Foo('bar')
>>> f.__data
Traceback (most recent call last):
  ...
AttributeError: ...
>>> f.get_data()
'bar'
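
For readers on Python 3, here is a minimal sketch of what the '__' prefix does in practice. It reuses the Foo class from the post; the variable names f, plain_name_found and mangled_value are illustrative only. The point it shows is that the prefix renames the attribute rather than locking it away.

```python
class Foo:
    def __init__(self, data):
        self.__data = data          # stored on the instance as _Foo__data

    def get_data(self):
        return self.__data          # inside the class body the name is mangled too


f = Foo("bar")

# Outside the class, the plain name does not exist on the instance ...
try:
    f.__data
    plain_name_found = True
except AttributeError:
    plain_name_found = False

# ... but the mangled name is an ordinary attribute that anyone can read.
mangled_value = f._Foo__data
```

So the double underscore guards against accidental access, not deliberate access.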
 

Diez B. Roggisch

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Private data is a convention, not a strict enforcement, for both Java
and C++.

Depending on your C++ compiler, a simple

#define private public

will give you access to all the data you want. Besides, casting to a
void* pointer and just accessing the private parts isn't rocket
science.

The same applies to Java: for whatever reason (I presume
serialization), you can access private fields via reflection.

In Python, private members are usually declared using a single or double
underscore. And the basic idea is: "if you tamper with this, you've been
warned". Which is the way coding between consenting adults should be.

To be honest: I've stumbled over more cases of unnecessary hoops to jump
through due to private declarations than bugs caused by me exploiting
things the compiler told me not to tamper with.

Summary: not important, forget about it, enjoy Python!

Diez
 

skip

time> Coming from a C++ / C# background, the lack of emphasis on private
time> data seems weird to me.

Python doesn't try to protect you from the authors of the code you use. You
should be intelligent enough to use it wisely. On the flip side, the lack
of truly private data and methods means the original author of a piece of
code doesn't need to anticipate all future uses to which the code will be
put. Here are a couple items along the lines of "we're all adults here".

http://spyced.blogspot.com/2005/06/anders-heljsberg-doesnt-grok-python.html
http://www.mail-archive.com/[email protected]/msg17806.html

Skip
 

BJörn Lindqvist

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Google for "python for consenting adults"

Or ask yourself the opposite question. Why do C++ and C# prefer more
private data? It is given that emphasizing private data
(encapsulation) leads to more internal complexity and more lines of
code because you have to write getters and setters and stuff. With
that in mind, why do you think that data encapsulation makes code less
error prone? Can you prove it? Or do you have anecdotal evidence of
where data encapsulation saved your ass?

IMHO, "data hiding is good" is one of those ideas that has been
repeated so much that virtually everyone thinks it is true. But
Python proves that it isn't necessarily so.
 

Stefan Schwarzer

Google for "python for consenting adults"

Or ask yourself the opposite question. Why does C++ and C# prefer more
private data? It is given that emphasizing private data
(encapsulation) leads to more internal complexity and more lines of
code because you have to write getters and setters and stuff. With
that in mind, why do you think that data encapsulation makes code less
error prone? Can you prove it? Or do you have anecdotal evidence of
where data encapsulation saved your ass?

IMHO, that data hiding is good, is one of those ideas that have been
repeated so much that virtually everyone thinks it is true. But
Python proves that it isn't necessarily so.

I think attributes (callable or not) which relate to the
abstraction of the class should be "public" (special methods
or without leading underscore). Attributes that are there for a
specific implementation of the abstraction should be "private".

The internal implementation of a class is changed in incompatible
ways more often than the abstraction, so distinguishing
between a public and a private interface will probably save
you from reworking the clients of a class, provided they stick to the
public interface. It will also make the client code easier to
understand.

Admittedly, there are special cases where you want to access
private attributes, e. g. debugging; that's ok.

In summary, the distinction between public and non-public
attributes IMHO makes sense, but I don't think that the
distinction should be enforced by the language as in C++
or Java.
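
A minimal sketch of that distinction, assuming a hypothetical Stack class (the class and its method names are not from the thread). The public methods and the special method __len__ form the abstraction; the underscore-prefixed list is one possible implementation of it.

```python
class Stack:
    """Public interface: push(), pop() and len()."""

    def __init__(self):
        # Leading underscore: implementation detail, free to change later
        # (for example to a linked list) without touching client code.
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


s = Stack()
s.push(1)
s.push(2)
top = s.pop()

# Nothing stops a debugging session from peeking at the internals.
internals = s._items
```

Clients that use only push, pop and len keep working if _items is replaced; code that reads s._items has been warned by the name.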

Stefan
 

Paul Rubin

BJörn Lindqvist said:
It is given that emphasizing private data (encapsulation) leads to
more internal complexity and more lines of code because you have to
write getters and setters and stuff.

You can have public variables in Java if you choose to. Writing
private variables with public setters and getters is just a style choice.

Or do you have anecdotal evidence of where data encapsulation saved
your ass?

There are certainly applications that can't live without it, like
browser applets.

As for it saving my ass, there's no way to know, it's like asking
whether garbage collection has saved my ass. Yes I've had plenty of
pointer related bugs in C programs that don't happen in GC'd
languages, so GC in that sense saves my ass all the time. I've also
had bugs in Python programs that would have been prevented by better
use of encapsulation (including in the stdlib). Python certainly
makes you spend more of your attention worrying about possible
attribute name collisions between classes and their superclasses. And
Python's name mangling scheme is leaky and bug-prone if you ever
re-use class names. Overall, I think Python would gain from having
better support for encapsulation and C++-like casting between class
instances.
 

Dennis Lee Bieber

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Python presumes the programmer knows what they want to do...

Common convention is to use a leading _ to signify attributes that
are "internal details, touch at your own risk".

__ (two leading underscores) results in name-mangling. This /may/ be
used to specify "private" data, but is really more useful when one is
designing with multiple super classes:

class X(A, B, C):
...

If, say, A and C both have an attribute "m" (or "_m"), instances of X
would have only a single common "m"/"_m". Using "__m" in each of A and C
will result in instances of X having two attributes -- name mangled to
refer to the original parent class.

>>> class A(object):
...     def __init__(self):
...         super(A, self).__init__()
...         self.m = "I'm Come From A"
...         self.__n = "I'm from A"
...         print "Init A, m is", self.m
...
>>> class B(object):
...     def __init__(self):
...         super(B, self).__init__()
...         self.m = "I'm From B"
...         self.__n = "I come from B"
...         print "Init B, m is", self.m
...
>>> class X(A, B):
...     def __init__(self):
...         super(X, self).__init__()
...         self.__n = "And I'm in X"
...
>>> x = X()
Init B, m is I'm From B
Init A, m is I'm Come From A
>>> dir(x)
['_A__n', '_B__n', '_X__n', '__class__', '__delattr__', '__dict__',
'__doc__', '__getattribute__', '__hash__', '__init__', '__module__',
'__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__',
'__str__', '__weakref__', 'm']

Note how there is only ONE "m", but three "__n"s (_A, _B, and _X
variants).
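
The session above is Python 2. A rough Python 3 port is sketched below, assuming X inherits from A and B in that order; the print calls are dropped and instance_attrs is an illustrative name. It shows the same outcome: one shared "m" and three separately mangled "__n" attributes.

```python
class A:
    def __init__(self):
        super().__init__()
        self.m = "I'm Come From A"
        self.__n = "I'm from A"        # stored as _A__n


class B:
    def __init__(self):
        super().__init__()
        self.m = "I'm From B"
        self.__n = "I come from B"     # stored as _B__n


class X(A, B):
    def __init__(self):
        super().__init__()
        self.__n = "And I'm in X"      # stored as _X__n


x = X()

# Only the attributes set on the instance, without the inherited dunder names.
instance_attrs = sorted(vars(x))
```

Because A's initializer finishes after B's, the single shared "m" ends up holding A's value.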


--
Wulfraed Dennis Lee Bieber KD6MOG
(e-mail address removed) (e-mail address removed)
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 

Andrea Griffini

Paul said:
> Yes I've had plenty of
> pointer related bugs in C programs that don't happen in GC'd
> languages, so GC in that sense saves my ass all the time.

My experience is different. I never suffered much from
leaked or dangling pointers in C++ programs; on
the other hand, I didn't expect that fighting
object leaks in complex Python applications would be that difficult
(I've heard of Zope applications that just gave up and
resorted to the "reboot every now and then" solution).

With a GC if you just don't plan ownership and disposal
carefully and everything works as expected then you're
saving some thinking and code, but if something goes
wrong then you're totally busted.
The GC "leaky abstraction" requires you to be lucky to
work well, but unfortunately IMO as code complexity
increases one is never lucky enough.

Andrea
 

Paul Rubin

Dennis Lee Bieber said:
__ (two leading underscores) results in name-mangling. This /may/ be
used to specify "private" data, but is really more useful when one is
designing with multiple super classes:

Trouble with this is you can have two classes with the same name,
perhaps because they were defined in different modules, and then the
name mangling fails to tell them apart.
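
A minimal sketch of that failure mode, in Python 3. To keep it in one file, a factory function stands in for the second module; Widget, make_library_widget and the attribute name __state are all hypothetical.

```python
def make_library_widget():
    # Stands in for "class Widget" defined in some library module.
    class Widget:
        def __init__(self):
            self.__state = "library"        # mangled to _Widget__state

        def library_state(self):
            return self.__state

    return Widget


LibraryWidget = make_library_widget()


# Stands in for an application module that happens to reuse the name "Widget".
class Widget(LibraryWidget):
    def __init__(self):
        super().__init__()
        self.__state = "application"        # also mangled to _Widget__state

    def app_state(self):
        return self.__state


w = Widget()
```

Mangling uses only the bare class name, so both classes write to the same _Widget__state slot and the library's value is silently overwritten.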
 

Ben Artin

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Thanks any insight.

One thing that the other posters didn't mention is that if you access data
members of a class in C++ you end up with a very tight coupling with that class.
If the class later changes so that the data is no longer part of the public
interface, then every user of the class has to change the code and recompile.

In Python, on the other hand, if I have a piece of public data that I later
decide to replace with an accessor method, I can do that without changing any of
the code that uses the class.

So, insistence on private data in C++ is a good thing because it reduces the
level of coupling between a class and its clients. In Python, this is not an
issue, because the same loose coupling can be obtained with data as well as
accessor methods, and therefore public data is used when possible and private
data when necessary.
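
A minimal sketch of that point using the built-in property, assuming a hypothetical Account class that later gains validation (the class names, the deposit helper and the negative-balance rule are illustrative, not from the thread).

```python
class Account:
    # Version 1: balance is plain public data.
    def __init__(self, balance):
        self.balance = balance


class CheckedAccount:
    # Version 2: same attribute name, now backed by accessor functions.
    def __init__(self, balance):
        self.balance = balance          # goes through the setter below

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        if value < 0:
            raise ValueError("balance must not be negative")
        self._balance = value


def deposit(account, amount):
    # Client code: identical for both versions of the class.
    account.balance = account.balance + amount
    return account.balance
```

The deposit function never changes, yet with CheckedAccount every assignment is checked. In C++ the same switch would mean rewriting and recompiling every caller.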

hth

Ben
 

Felipe Almeida Lessa

Trouble with this is you can have two classes with the same name,
perhaps because they were defined in different modules, and then the
name mangling fails to tell them apart.

What is the chance of having to inherit from two classes from
different modules but with exactly the same name *and* the same
instance variable name?

Of course you're being very pessimistic or extremely unlucky.
 

Paul Rubin

Felipe Almeida Lessa said:
What is the chance of having to inherit from two classes from
different modules but with exactly the same name *and* the same
instance variable name?

Of course you're being very pessimistic or extremely unlucky.

If you want to write bug-free code, pessimism is the name of the game.
 

robert

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

What is the use of private declarations, if the names themselves are not verbose about it?

=> You'll always have to search the class definition/docs to check whether the member is under "private", or wait for compiler errors. If you still want to override, you have to declare 'friends' and all that schoolboy stuff.

=> It's not useful or efficient for programmers; it probably did more to satisfy teachers' urge to discipline their students when those languages were invented.

Moreover, in those languages there is more or less a clash of namespaces: all globals, module globals, members, local variables and possibly 'with'-variables. This confusion, mixed with private declarations, soon produces a situation where one loses track of exactly which variable was meant.

The syntax in Python, with its _'s and 'self.', true modularization and minimal magic namespace behavior, but with explicit, self-similar access to objects, modules, functions and everything else, is overall the clearest and most effective. I don't know of another language that behaves so well in this regard.

Even Ruby (one small positive: it doesn't even have the 'global' variable declaration) fares much worse on balance, in that modules, classes, methods/functions etc. are not objects but namespaces, messages and so on. Self-similarity is so broken that this will actually limit the power and scalability of the language.


Robert
 

bearophileHUGS

Paul Rubin:
Python certainly makes you spend more of your attention worrying
about possible attribute name collisions between classes and their
superclasses. And Python's name mangling scheme is leaky and
bug-prone if you ever re-use class names.

Trouble with this is you can have two classes with the same name,
perhaps because they were defined in different modules, and then the
name mangling fails to tell them apart.

Without changing Python syntax at all I think this situation may be
improved. Instead of Python applying name mangling to names with __
before them, it can manage them as private, a higher level kind of
management. And then if it's useful a new built-in function may be
invented to access such private attributes anyway. I think this may
solve your problem. (This is for Py3.0). Maybe a metaclass can be
invented to simulate such behavior to test and try it before modifying
the language itself.

Bye,
bearophile
 

Jorgen Grahn

You can have public variables in Java if you choose to. Writing
private variables with public setters and getters is just a style choice.

Privates with getters/setters are (as I think someone else hinted) pretty
pointless. The interesting stuff is the private data that *is* private, i.e.
not meant for users at all.

But yes, I don't mind not having 'private:' in Python. I don't have
compile-time type checking anyway. In fact, I don't always know what the
attributes of my objects /are/ until runtime.

And besides, this is pretty close to a compile-time check:

find -name \*.py | \
xargs egrep '\._[_a-z]' | \
fgrep -v self._
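
For anyone without those Unix tools at hand, roughly the same check can be sketched in Python 3. The function names suspicious_lines and scan_tree are made up for this sketch; the regular expression is the one from the egrep call.

```python
import re
from pathlib import Path

# Same pattern as the egrep call: a dot followed by an underscore-prefixed name.
PRIVATE_ACCESS = re.compile(r"\._[_a-z]")


def suspicious_lines(source):
    """Return (line_number, text) pairs that touch someone else's underscore names."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        # Like fgrep -v: skip any line that mentions self._ at all.
        if PRIVATE_ACCESS.search(line) and "self._" not in line:
            hits.append((number, line.strip()))
    return hits


def scan_tree(root="."):
    """Apply suspicious_lines to every .py file under root."""
    report = {}
    for path in Path(root).rglob("*.py"):
        hits = suspicious_lines(path.read_text(errors="replace"))
        if hits:
            report[str(path)] = hits
    return report
```

Like the shell pipeline, this is a line-based heuristic: it also flags harmless accesses such as obj.__dict__, and it misses a foreign access that shares a line with self._something.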

/Jorgen
 

Thomas Ploch

Jorgen said:
Privates with getters/setters are (as I think someone else hinted) pretty
pointless. The interesting stuff is the private data that *is* private, i.e.
not meant for users at all.

Not really pointless, since you can hide your data structures that you
don't want to be fiddled around with (which for me is almost the only
point to use it).

But yes, I don't mind not having 'private:' in Python. I don't have
compile-time type checking anyway. In fact, I don't always know what the
attributes of my objects /are/ until runtime.

Me neither, although I have to say that the '__' prefix comes pretty
close to being 'private' already. It depends on the definition of
private. For me, private means 'not accessible from outside the
module/class'.

Thomas
 

Paul Rubin

Thomas Ploch said:
Me neither, although I have to say that the '__' prefix comes pretty
close to being 'private' already. It depends on the definition of
private. For me, private means 'not accessible from outside the
module/class'.

class A:
    __x = 3

class B(A):
    __x = 4 # ok

class C(B):
    __x = 5 # oops!

Consider that the above three class definitions might be in separate
files and you see how clumsy this gets.
 

Thomas Ploch

class A:
    __x = 3

class B(A):
    __x = 4 # ok

class C(B):
    __x = 5 # oops!

Consider that the above three class definitions might be in separate
files and you see how clumsy this gets.


I don't understand why this should be oops, even if they are in
different files.
>>> a = A()
>>> print a._A__x
3
>>> b = B()
>>> print b._B__x
4
>>> c = C()
>>> print c._C__x
5
>>> dir(c)
['_A__x', '_B__x', '_C__x', '__doc__', '__module__']
>>> print c._A__x
3
>>> print c._B__x
4
 

sturlamolden

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking..

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

The designers of Java, C++, C#, Ada95, Delphi, etc. seem to think that
if an object's 'internal' variables or states cannot be kept private,
programmers get an irresistible temptation to mess with them in
malicious ways. But if you are that stupid, should you be programming
in any language? The most widely used language is still C, and there is
no concept of private data in C either, nor is it needed.

As mentioned in other replies, it is not rocket science to access a
class's private data. In C++ you can cast to void*, in Java and C# you
can use reflection. C++ is said to be an "unsafe" language because
programmers can, using a few tricks, mess with the vtables. But how
many really do that?

In Python variables are kept in strict namespaces. You can ask the
compiler to name-mangle a variable by prepending two underscores. The
variable then becomes just as 'private' as a C++ private variable,
because as previously mentioned, 'private' variables in C++ can be
accessed through a cast to void*.
 
