Why less emphasis on private data?

sturlamolden · Jan 9, 2007

I let the user change the internal state of the engine, I have no
assurances that my product (the engine) is doing its job...

How would you proceed to protect these inner states? In C++, private
members can be accessed through a cast to a void pointer. In Java it
can be done through reflection. In C# it can be done through
reflection or by casting to a void pointer in an 'unsafe' block. There
is no way you can protect the inner state of an object; it is just an
illusion you may have.

Python has properties as well. Properties have nothing to do with
hiding attributes.
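
A minimal sketch of that last point, assuming a hypothetical Engine class
with an rpm attribute (neither name comes from the thread). The property
validates writes that go through it, but the underlying attribute stays
reachable:

```python
class Engine:
    """Hypothetical example class; the names are illustrative only."""

    def __init__(self):
        self._rpm = 0

    @property
    def rpm(self):
        # Reads go through the property.
        return self._rpm

    @rpm.setter
    def rpm(self, value):
        # Writes through the property are validated.
        if value < 0:
            raise ValueError("rpm must be non-negative")
        self._rpm = value


engine = Engine()
engine.rpm = 3000    # validated write
engine._rpm = -1     # direct write: the property does not prevent this
print(engine.rpm)
```

The property gives a place to put checks; it does not make _rpm
inaccessible.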

Steven D'Aprano · Jan 9, 2007

Really?

Strong words.

If you don't understand you need merely ask, so let me elucidate:

If there is some small chance of something occurring at run time that can
cause code to fail - a "low probability" in all the accepted senses of the
word - and a programmer declaims - "There is such a low probability of
that occurring and it's so difficult to cater for that I won't bother"
- then am I supposed to congratulate him on his wisdom and outstanding
common sense?

Hardly. - If anything can go wrong, it will. - to paraphrase Murphy's law.

To illustrate:
If there is one place in any piece of code that is critical and not protected,
even if it's in a relatively rarely called routine, then because of the high
speed of operations, and the fact that time is essentially infinite,

Time is essentially infinite? Do you really expect your code will still be
in use fifty years from now, let alone a billion years?

I know flowcharts have fallen out of favour in IT, and rightly so -- they
don't model modern programming techniques very well, simply because modern
programming techniques would lead to a chart far too big to be practical.
But for the sake of the exercise, imagine a simplified flowchart of some
program, one with a mere five components, such that one could take any of
the following paths through the program:

START -> A -> B -> C -> D -> E
START -> A -> C -> B -> D -> E
START -> A -> C -> D -> B -> E
....
START -> E -> D -> C -> B -> A

There are 5! (five factorial) = 120 possible paths through the program.

Now imagine one where there are just fifty components, still quite a
small program, giving 50! = 3e64 possible paths. Now suppose that there is
a bug that results from following just one of those paths. That would
match your description of "lowest probability" -- any lower and it would
be zero.

If all of the paths are equally likely to be taken, and the program takes
a billion different paths each millisecond, on average it would take about
1.5e55 milliseconds to hit the bug -- or about 5e44 YEARS of continual
usage. If every person on Earth did nothing but run this program 24/7, it
would still take on average almost sixty million billion billion billion
years to discover the bug.
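
The arithmetic above can be checked with a short sketch. It assumes, as
the paragraph does, that every path is equally likely and that no path is
tried twice, so on average half of the paths are tried before the buggy
one is hit. The variable names are made up for illustration.

```python
import math

paths_5 = math.factorial(5)      # orderings of five components
paths_50 = math.factorial(50)    # orderings of fifty components

PATHS_PER_MS = 10**9             # a billion different paths each millisecond
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25

# Trying distinct paths in random order finds one particular path
# after about half of them, on average.
mean_ms = paths_50 / PATHS_PER_MS / 2
mean_years = mean_ms / MS_PER_YEAR

print(paths_5)
print(f"{paths_50:.2e} paths")
print(f"{mean_ms:.2e} ms, or {mean_years:.2e} years")
```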

But of course in reality some paths are more likely than others. If the
bug happens to exist in a path that is executed often, or if it exists
in many paths, then the bug will be found quickly. On the other hand, if
the bug is in a path that is rarely executed, your buggy program may be
more reliable than the hardware you run it on. (Cynics may say that isn't
hard.)

You're project manager for the development team. Your lead developer tells
you that he knows this bug exists (never mind how, he's very clever) and
that the probability of reaching that bug in use is about 3e-64.

If it were easy to fix, the developer wouldn't even have mentioned it.
This is a really hard bug to fix, it's going to require some major
changes to the program, maybe even a complete re-think of the program.
Removing this bug could even introduce dozens, hundreds of new bugs.

So okay Mister Project Manager. What do you do? Do you sack the developer,
like you said? How many dozens or hundreds of man-hours are you prepared
to put into this? If the money is coming out of your pocket, how much are
you willing to spend to fix this bug?

[snip]

How is this a misunderstanding of probability? - probability applies to
any one trial, so in a series of trials, when the number of trials is
large enough - in the order of the inverse of the probability - then
one's expectation must be that the rare occurrence should occur...

"Even the lowest probability is a certainty" is mathematically nonsense:
it just isn't true -- no matter how many iterations, the probability is
always a little less than one. And you paper over a hole in your argument
with "when the number of trials is large enough" -- if the probability is
small enough, "large enough" could be unimaginably huge indeed.
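
To put numbers on both claims, here is a sketch under the assumption of
independent trials that each have the same probability p. The chance of
seeing the event at least once in n trials is 1 - (1 - p)**n. The function
name is hypothetical.

```python
import math


def prob_at_least_once(p, n):
    """Chance of at least one occurrence in n independent trials.

    Computes 1 - (1 - p)**n via log1p/expm1 so that very small p
    is not lost to floating-point rounding.
    """
    return -math.expm1(n * math.log1p(-p))


p = 1e-6
for n in (10**5, 10**6, 10**7):
    print(n, prob_at_least_once(p, n))
```

With n equal to 1/p the result is close to 1 - 1/e, roughly 63%: likely,
but well short of certain. It approaches one as n grows and never reaches
it.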

Or, to put it another way, while anything with a non-zero probability
_might_ happen (you might drop a can of soft drink on your computer,
shorting it out and _just by chance_ causing it to fire off a perfectly
formatted email containing a poem about penguins) we are justified in
writing off small enough probabilities as negligible. It's not that they
can't happen, but the chances of doing so are so small that we can rightly
expect to never see them happen.

You might like to read up on Borel's "Law" (not really a law at all,
really just a heuristic for judging when probabilities are negligible).
Avoid the nonsense written about Borel and his guideline by Young Earth
Creationists, they have given him an undeserved bad name.

http://www.talkorigins.org/faqs/abioprob/borelfaq.html

There is a very low probability that any one gas molecule will collide
with any other one in a container

Not so. There is a very low probability that one gas molecule will collide
with a _specific_ other molecule -- but the probability of colliding with
_any_ other molecule is very high.

- and "Surprise! Surprise! " there
is nevertheless something like the mean free path...

Yes. And that mean free path increases without limit as the volume of the
gas increases. Take your molecule into the space between stars, and the
mean free path might be dozens of lightyears -- even though there is
actually more gas in total than in the entire Earth.

Now how does all this show a shocking lack of common sense?

You pay no attention to the economics of programming. Programming doesn't
come for free. It is always a trade-off for the best result with the least
effort. Any time people start making absolute claims about fixing every
possible bug, no matter how obscure or unlikely or how much work it will
take, I know that they aren't paying for the work to be done.

Hendrik van Rooyen · Jan 10, 2007

Steven D'Aprano said:
Time is essentially infinite? Do you really expect your code will still be
in use fifty years from now, let alone a billion years?

My code does not suffer from bit rot, so it should outlast the hardware...

But seriously - for the sort of mistakes we make as programmers - it does
not actually need infinite time for the lightning to strike - most things that
will actually run overnight are probably stable - and if it takes say a week
of running for the bug to raise its head - it is normally a very difficult
problem to find and fix. A case in point - One of my first postings to
this newsgroup concerned an intermittent failure on a serial port - It was
never resolved in a satisfactory manner - eventually I followed my gut
feel, made some changes, and it seems to have gone away - but I expect
it to bite me anytime - I don't actually *know* that it's fixed, and there is
not, as a corollary to your sum below here, any real way to know for
certain.

I know flowcharts have fallen out of favour in IT, and rightly so -- they
don't model modern programming techniques very well, simply because modern
programming techniques would lead to a chart far too big to be practical.

I actually like drawing data flow diagrams, even if they are sketchy, primitive
ones, to try to model the inter process communications (where a "process"
may be just a python thread) - I find it useful to keep an overall perspective.

But for the sake of the exercise, imagine a simplified flowchart of some
program, one with a mere five components, such that one could take any of
the following paths through the program:

START -> A -> B -> C -> D -> E
START -> A -> C -> B -> D -> E
START -> A -> C -> D -> B -> E
...
START -> E -> D -> C -> B -> A

There are 5! (five factorial) = 120 possible paths through the program.

Now imagine one where there are just fifty components, still quite a
small program, giving 50! = 3e64 possible paths. Now suppose that there is
a bug that results from following just one of those paths. That would
match your description of "lowest probability" -- any lower and it would
be zero.

If all of the paths are equally likely to be taken, and the program takes
a billion different paths each millisecond, on average it would take about
1.5e55 milliseconds to hit the bug -- or about 5e44 YEARS of continual
usage. If every person on Earth did nothing but run this program 24/7, it
would still take on average almost sixty million billion billion billion
years to discover the bug.

In something with just 50 components it is, I believe, better to try to
inspect the quality in, than to hope that random testing will show up
errors - But I suppose this is all about design, and about avoiding
doing known no-nos.

But of course in reality some paths are more likely than others. If the
bug happens to exist in a path that is executed often, or if it exists
in many paths, then the bug will be found quickly. On the other hand, if
the bug is in a path that is rarely executed, your buggy program may be
more reliable than the hardware you run it on. (Cynics may say that isn't
hard.)

Oh I am of the opposite conviction - Like the fellow of the Circuit Cellar
I forget his name ( Steve Circia (?) ) who said: "My favourite Programming
Language is Solder"... I find that when I start blaming the hardware
for something that is going wrong, I am seldom right...

And this is true also for hardware that we make ourselves, that one would
expect to be buggy, because it is new and untested. It is almost as if the
tools used in hardware design are somehow less buggy than a programmer's
fumbling attempts at producing something logical.

You're project manager for the development team. Your lead developer tells
you that he knows this bug exists (never mind how, he's very clever) and
that the probability of reaching that bug in use is about 3e-64.

This is too convenient - This lead developer is about as likely as
my infinite time...

If it were easy to fix, the developer wouldn't even have mentioned it.
This is a really hard bug to fix, it's going to require some major
changes to the program, maybe even a complete re-think of the program.
Removing this bug could even introduce dozens, hundreds of new bugs.

So okay Mister Project Manager. What do you do? Do you sack the developer,
like you said? How many dozens or hundreds of man-hours are you prepared
to put into this? If the money is coming out of your pocket, how much are
you willing to spend to fix this bug?

Do a design review, put in a man with some experience,
and hope for the best - in reality what else can you do, short
of trying to do it all yourself?

[snip]

How is this a misunderstanding of probability? - probability applies to
any one trial, so in a series of trials, when the number of trials is
large enough - in the order of the inverse of the probability - then
one's expectation must be that the rare occurrence should occur...

"Even the lowest probability is a certainty" is mathematically nonsense:
it just isn't true -- no matter how many iterations, the probability is
always a little less than one. And you paper over a hole in your argument
with "when the number of trials is large enough" -- if the probability is
small enough, "large enough" could be unimaginably huge indeed.

*grin* sure - this is not the maths tripos...

But I am willing to lay a bet, that over an evening's play at roulette, the
red will come up at least once. I would expect to win too.

Or, to put it another way, while anything with a non-zero probability
_might_ happen (you might drop a can of soft drink on your computer,
shorting it out and _just by chance_ causing it to fire off a perfectly
formatted email containing a poem about penguins) we are justified in
writing off small enough probabilities as negligible. It's not that they
can't happen, but the chances of doing so are so small that we can rightly
expect to never see them happen.

I promise I won't hold my breath...

<joke>
Man inspecting the work of a bunch of monkeys with Keyboards:

"Hey Harry - I think we might have something here - check this:

To be, or not to be, that is the iuuiihiuweriopuqewt"

You might like to read up on Borel's "Law" (not really a law at all,
really just a heuristic for judging when probabilities are negligible).
Avoid the nonsense written about Borel and his guideline by Young Earth
Creationists, they have given him an undeserved bad name.

http://www.talkorigins.org/faqs/abioprob/borelfaq.html

ok will have a look later

8<--------------

You pay no attention to the economics of programming. Programming doesn't
come for free. It is always a trade-off for the best result with the least
effort. Any time people start making absolute claims about fixing every
possible bug, no matter how obscure or unlikely or how much work it will
take, I know that they aren't paying for the work to be done.

Too much assumption from too little data. Have actually been the part owner
of a small company for the last two decades or so - I am paying, all right,
I am paying, and paying...

Which maybe is why I want perfection...

- Hendrik

Gabriel Genellina · Jan 10, 2007

Hendrik van Rooyen said:
Oh I am of the opposite conviction - Like the fellow of the Circuit Cellar
I forget his name ( Steve Circia (?) ) who said: "My favourite Programming
Language is Solder"..

Almost right: Steve Ciarcia.

--
Gabriel Genellina
Softlab SRL


Steve Holden · Feb 5, 2007

Paul said:
Has this ever been reported as a bug in Python? I could imagine more
sophisticated "name mangling": something to do with the identity of the
class might be sufficient, although that would make the tolerated
"subversive" access to private attributes rather difficult.

Paul

It would also force the mangling to take place at run-time, which would
probably affect efficiency pretty adversely (thinks: should really
check that mangling is a static mechanism before posting this).
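
On the parenthetical: a quick way to check in CPython is to look at the
names stored in a method's compiled code object. The sketch below uses
hypothetical classes (Engine, Widget) that are not from the thread. The
second half is an assumed example of how mangling by class name alone can
clash when a subclass shares its base class's name; the thread does not
spell out which problem Paul had in mind.

```python
class Engine:
    def __init__(self):
        self.__state = "idle"        # stored as _Engine__state

    def state(self):
        return self.__state


# The mangled name is already present in the compiled code object,
# so no renaming is left to do when the method runs.
mangled_names = Engine.state.__code__.co_names
print(mangled_names)

engine = Engine()
engine._Engine__state = "tampered"   # the tolerated "subversive" access
print(engine.state())


def make_base():
    # A base class that happens to be called Widget as well.
    class Widget:
        def __init__(self):
            self.__tag = "base"

        def base_tag(self):
            return self.__tag

    return Widget


BaseWidget = make_base()


class Widget(BaseWidget):
    def __init__(self):
        super().__init__()
        self.__tag = "derived"       # also mangles to _Widget__tag


print(Widget().base_tag())
```

Both classes mangle __tag to _Widget__tag, so the subclass silently
overwrites the attribute the base class thought was private to it.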

regards
Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
See you at PyCon? http://us.pycon.org/TX2007

Bart Ogryczak · Feb 5, 2007

Coming from a C++ / C# background, the lack of emphasis on private data
seems weird to me. I've often found wrapping private data useful to
prevent bugs and enforce error checking.

It appears to me (perhaps wrongly) that Python prefers to leave class
data public. What is the logic behind that choice?

Often it's a question of efficiency. Function calls in Python are
bloody slow. There is no "inline" directive, since it's interpreted,
not compiled. E.g., consider code like this:

class MyWhatever:
    ...
    def getSomeAttr(self):
        return self._someAttr
    def getSomeOtherAttr(self):
        return self._someOtherAttr

[x.getSomeAttr() for x in listOfMyWhatevers
 if x.getSomeOtherAttr() == 'whatever']

You'd get it running hundreds of times faster doing it the "wrong" way:

[x._someAttr for x in listOfMyWhatevers
 if x._someOtherAttr == 'whatever']
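
The size of the gap is easy to measure rather than guess. Below is a
rough timing sketch using the standard timeit module. The constructor,
the sample data and the two helper functions are assumptions added to
make the snippet runnable; only the getters and the two list
comprehensions come from the post.

```python
import timeit


class MyWhatever:
    def __init__(self, some_attr, some_other_attr):
        # Constructor added for the sketch; not part of the original post.
        self._someAttr = some_attr
        self._someOtherAttr = some_other_attr

    def getSomeAttr(self):
        return self._someAttr

    def getSomeOtherAttr(self):
        return self._someOtherAttr


# Made-up sample data: every odd-numbered item matches the filter.
listOfMyWhatevers = [
    MyWhatever(i, 'whatever' if i % 2 else 'other') for i in range(10_000)
]


def via_getters():
    return [x.getSomeAttr() for x in listOfMyWhatevers
            if x.getSomeOtherAttr() == 'whatever']


def via_attributes():
    return [x._someAttr for x in listOfMyWhatevers
            if x._someOtherAttr == 'whatever']


t_getters = timeit.timeit(via_getters, number=20)
t_attributes = timeit.timeit(via_attributes, number=20)
print(f"getters:    {t_getters:.4f} s")
print(f"attributes: {t_attributes:.4f} s")
print(f"ratio:      {t_getters / t_attributes:.1f}x")
```

The ratio depends on the interpreter and its version. On CPython it is
often a small multiple rather than hundreds, so it is worth running the
measurement before choosing a design on speed grounds.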

Paul Rubin · Feb 5, 2007

Steve Holden said:
It would also force the mangling to take place at run-time, which
would probably affect efficiency pretty adversely (thinks: should
really check that mangling is a static mechanism before posting this).

I think it could still be done statically. For example, the mangling
could include a random number created at compile time when the class
definition is compiled, that would also get stored in the class object.

I guess there are other ways to create classes than class statements
and those would have to be addressed too.
