Pedantic pickling error after reload?

Robert

After an (intended/controlled) reload or similar action on a
module/class, pickle/cPickle.dump raises errors like

pickle.PicklingError: Can't pickle <class 'somemodule.SomeClass'>:
it's not the same object as somemodule.SomeClass


The cause in pickle.py (and cPickle) is the line
"if klass is not obj:"

Shouldn't it be enough to have "if klass.__name__ !=
obj.__name__:" there?
=> a bug report/feature request?

Classes can change face anyway while the data sits pickled, so why should
an over-pedantic reaction break things here at runtime?
(As it stands I'd need to walk the whole object tree myself, guarding
against infinite loops the way pickle itself does, and re-class things ..)
 
Diez B. Roggisch

On 25.02.10 18:08, Robert wrote:
> After an (intended/controlled) reload or similar action on a module/class,
> pickle/cPickle.dump raises errors like
>
> pickle.PicklingError: Can't pickle <class 'somemodule.SomeClass'>: it's
> not the same object as somemodule.SomeClass
>
> The cause in pickle.py (and cPickle) is the line
> "if klass is not obj:"
>
> Shouldn't it be enough to have "if klass.__name__ != obj.__name__:" there?

No. This would alias classes of the same name that live in different modules.

So at least you need to compare the modules, too. I'm not sure if there aren't
even more corner-cases. Python's import-mechanism can sometimes be
rather foot-shoot-prone.

> => a bug report/feature request?
>
> Classes can change face anyway while the data sits pickled, so why should
> an over-pedantic reaction break things here at runtime?
> (As it stands I'd need to walk the whole object tree myself, guarding
> against infinite loops the way pickle itself does, and re-class things ..)

If anything it's a feature - and I doubt it's really needed, because
reloading pickles with intermittent reload-module operations is too rare
a case to be really supported IMHO.

Do yourself a favor, write a unit-test that tests the desired behavior
that makes you alter your code & then reload. This makes the problem go
away, and you have a more stable development through having more tests :)

Diez
 
Robert

Diez said:
> On 25.02.10 18:08, Robert wrote:
>
> No. This would alias classes of the same name that live in different modules.
>
> So at least you need to compare the modules, too. I'm not sure if there aren't

At that point of the comparison the module is already identical
("klass = getattr(mod, name)").

> even more corner-cases. Python's import-mechanism can sometimes be
> rather foot-shoot-prone.

I still don't see a real reason against a mere module+name
comparison. The same issues arise during pickle.load. Only the class
object is renewed (intentionally).
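
To make the two positions concrete, here is a small sketch of a name-only check versus a module+name check. The helper same_module_and_name and the module names pkg_a/pkg_b are hypothetical, and this is not the code in pickle.py, only an illustration of what each comparison would accept.

```python
def same_module_and_name(a, b):
    """Hypothetical relaxed check: compare where two classes claim to live."""
    return (a.__module__, a.__name__) == (b.__module__, b.__name__)


# Two unrelated classes that merely share a name (the aliasing case).
class_a = type("SomeClass", (object,), {"__module__": "pkg_a"})
class_b = type("SomeClass", (object,), {"__module__": "pkg_b"})

# A "reloaded" version of class_a: same module, same name, new object.
class_a_reloaded = type("SomeClass", (object,), {"__module__": "pkg_a"})

# Name alone cannot tell class_a and class_b apart.
name_only_match = class_a.__name__ == class_b.__name__

# Module plus name separates them, but accepts the reloaded class.
cross_module_match = same_module_and_name(class_a, class_b)
reload_match = same_module_and_name(class_a, class_a_reloaded)

# The identity test rejects the reloaded class as well.
identity_match = class_a is class_a_reloaded

print(name_only_match, cross_module_match, reload_match, identity_match)
```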

If there are issues with nested classes etc., the programmer will
have to rethink things on a different level: those are design errors,
a subject for pychecker/pylint - not a reason to break pickle.dump ... ?

> If anything it's a feature - and I doubt it's really needed, because
> reloading pickles with intermittent reload-module operations is too rare
> a case to be really supported IMHO.

Well, reloading is the thing I do most in coding practice :)
For me it's a basic thing, like cell proliferation in biology.

In my projects, particularly with GUIs or with Python-based HTTP
serving, I typically support good live module reloadability
actively, with a little extra "reload support code" (which fixes
up the .__class__ etc. of the living window tree, main objects,
servers ... plus an 'xisinstance' in very few locations) - at least
I do this for the frequently changing core modules/classes.
This way the edit-run cycle feels more than 2x faster when the project
is getting bigger and bigger, or when developing things
interactively. Code is exchanged frequently while living objects
stay around for long ... it works well in practice.
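
Such "reload support code" might look roughly like the following. This is a minimal sketch under assumptions, not the poster's actual code: the helper name reclass, the module name reload_demo and the Window class are invented, and rebinding the class stands in for a real reload. Assigning __class__ only works when old and new class have compatible layouts (plain Python classes without differing __slots__).

```python
import pickle
import sys
import types

# Hypothetical stand-in for a frequently edited module.
mod = types.ModuleType("reload_demo")
sys.modules["reload_demo"] = mod


def make_class():
    # A fresh class object per call, as after re-executing the module.
    return type("Window", (object,), {"__module__": "reload_demo"})


def reclass(obj):
    """Point obj at the class currently bound to its module and name."""
    old = type(obj)
    module = sys.modules.get(old.__module__)
    current = getattr(module, old.__name__, None)
    if isinstance(current, type) and current is not old:
        obj.__class__ = current  # living object keeps its state, gets new code
    return obj


mod.Window = make_class()
window = mod.Window()
window.title = "main"

mod.Window = make_class()  # simulate the reload
reclass(window)            # fix up the living object

restored = pickle.loads(pickle.dumps(window))  # identity check passes again
print(restored.title)
```

A real application would have to walk its own object anchors (window tree, servers, ...) and call something like reclass on each of them.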

Re-entering the same (complex) app state for evolving those
thousands of small things (where full parallel test coverage
doesn't work out) is a major consumer of dev time in bigger
projects - in C and Java projects and even with other dynamic languages.
Dynamic classes are a main reason why I use Python (adopted from
Lisp a long time ago; is this kind of reload possible with Ruby too?)

I typically need just 1 full app reboot per 20..50 edit-run cycles,
I guess. And just a few unit test runs per release. Even for
Cython/pyximport things I added support for this reload
edit-run cycle, because I cannot imagine developing without it.

Only the standard pickle issues stood in the way. And this patch (and
a failover from cPickle to pickle) has done well so far.

> Do yourself a favor, write a unit-test that tests the desired behavior
> that makes you alter your code & then reload. This makes the problem go
> away, and you have a more stable development through having more tests :)

This is a comfortable, quasi-religious theory, raised often and
easily here and there - however, it is impracticable and very slow at
that fine-grained level of code evolution. An interesting issue.

I do unit tests for getting stability on a much higher level,
where/when things and functionality are fairly well wired up.
Generally, after having compared, I cannot confirm that "always write
tests before development" ideologies pay off in practice.
"Reload > pychecker/pylint > tests" works most effectively with
Python in my opinion.
And for GUI development the difference is largest
(smallest for math algorithms which are well away from data structures/OO).


Another issue regarding tests, IMHO, is that one should not waste
the "validation power" of unit tests too easily on permanent low-level
evolution purposes, because it's a little like bacteria
becoming resistant to antibiotics: code becoming 'fit'
against artificial tests, but not against the real world.
For example, in late-stage NASA tests of rockets and the like, there is
a validation rule that when those tests do not come up green,
there is not just a fix of the (one) cause - screwing around until it
works. The whole thing is at stake.
And the whole test scheme has to be rethought too. Ideally,
whenever such a late test breaks, it requires that a completely
new higher-level test be invented (in addition) ... until there
is a minimal set of "fresh green lights" which were red only
during their own tests, but never red in the real test run.

A rule that unit tests are used only near a release or a milestone
is healthy in that sense, I think.
(And a quick edit-(real)run-interact cycle is good for speed)


Robert
 
Diez B. Roggisch

> At that point of the comparison the module is already identical ("klass =
> getattr(mod, name)").

Ah, didn't know that context.

> I still don't see a real reason against a mere module+name comparison.
> The same issues arise during pickle.load. Only the class object is renewed
> (intentionally).
>
> If there are issues with nested classes etc., the programmer will have to
> rethink things on a different level: those are design errors, a subject for
> pychecker/pylint - not a reason to break pickle.dump ... ?

I don't say it necessarily breaks anything. I simply don't know enough
about it. It might just be that back then, identity was deemed enough to
check, but you can well argue your case on the python-dev list,
providing a patch + tests that ensure there is no regression.

> Well, reloading is the thing I do most in coding practice :)
> For me it's a basic thing, like cell proliferation in biology.

I simply never do it. It has subtle issues, one of them you found,
others you say you work around by introducing actual frameworks. But you
might well forget some corner-cases & suddenly chase a chimera you deem
a bug, that in fact is just an unwanted side-effect of reloading.

And all this extra complexity is only good for the process of actually
changing the code. It doesn't help you maintain code quality.

> Re-entering the same (complex) app state for evolving those
> thousands of small things (where full parallel test coverage doesn't
> work out) is a major consumer of dev time in bigger projects - in
> C and Java projects and even with other dynamic languages.
> Dynamic classes are a main reason why I use Python (adopted from Lisp
> a long time ago; is this kind of reload possible with Ruby too?)

So what? If this kind of complex state, evolved through rather lengthy
interactions, is the thing you need to work within, that's reason enough
for me to think about how to automate setting this very state up. That's
what programming is about - telling a computer to do things it can do,
which usually means it does them *much* faster & *much* more reliably
than humans do.

Frankly, I can't be bothered with clicking through layers of GUIs to
finally reach the destination I'm actually interested in. Let the
computer do that. And once I have taught it how to do so, I just integrate
that into my test-suite.

> I typically need just 1 full app reboot per 20..50 edit-run cycles, I
> guess. And just a few unit test runs per release. Even for
> Cython/pyximport things I added support for this reload edit-run cycle,
> because I cannot imagine developing without it.

Let me assure you - it works :)

For example, yesterday I created a full CRUD-interface for a web-app
(which is the thing I work on mostly these days) without *once* taking a
look at the browser. I wrote actions, forms, HTML, and tests alongside,
developed the thing until it was ready, asserted certain constraints and
error-cases, and once finished, fired up the browser - and he saw, it worked!

Yes, I could have written that code on the fly, hitting F5 every few
seconds/minutes to see if things work out (instead of just running the
specific tests through nose) - and once finished, I wouldn't have had
anything permanent that ensured the functionality over time.

> This is a comfortable, quasi-religious theory, raised often and easily
> here and there - however, it is impracticable and very slow at that
> fine-grained level of code evolution. An interesting issue.

To me, that's just as much a religious statement, often heard from people
who aren't (really) into test-driven development. By which I personally
don't mean the variant where one writes tests first, and then code. I
always develop both in lock-step, sometimes introducing a new feature
first in my test as e.g. new arguments, or new calls, and then
implementing them, but as often the other way round.

The argument is always a variation of "my problem is too complicated, the
code-base too intertwined to make it possible to test this".

I call this a bluff. You might work with a code-base that makes it
harder than needed to write tests for new functionality. But then, most
of the time this is a sign of lack of design. Writing with testability
in mind makes you think twice about how to properly componentize your
application, clearly separate logic from presentation, validates
API-design because using the API is immediately done when writing the
tests you need, and so forth.

> I do unit tests for getting stability on a much higher level, where/when
> things and functionality are fairly well wired up.
> Generally, after having compared, I cannot confirm that "always write
> tests before development" ideologies pay off in practice.
> "Reload > pychecker/pylint > tests" works most effectively with Python
> in my opinion.
> And for GUI development the difference is largest
> (smallest for math algorithms which are well away from data structures/OO).

As I said, I mainly do web these days, which can be considered GUIs as
well. Testing the HTTP-interface is obviously easier & possible, and is
what I described earlier.

But we also use selenium to test JS-driven interfaces, as now the
complexity of the interface rises, with all the bells & whistles of
ajaxiness and whatnot.

> Another issue regarding tests, IMHO, is that one should not waste the
> "validation power" of unit tests too easily on permanent low-level
> evolution purposes, because it's a little like bacteria becoming resistant
> to antibiotics: code becoming 'fit' against artificial tests, but
> not against the real world.

That's why I pull in the real world as well. I don't write unit-tests
only (in fact, I don't particularly like that term, because of its
narrow-minded-ness), I write tests for whatever condition I envision
*or* encounter.

If anything that makes my systems fail is reproducible, it becomes a new
test - and ensures this thing never happens again.

Granted, though: there are things you can't really test, especially in
cases where you interact with other agents that might behave
(to you) erratically.

I've done robot development as well, and of course testing e.g. an
acceleration ramp that depends on ground conditions isn't something a
simple unit-test can reproduce.

But then... I've written some, and made sure the robot was in a
controlled environment when executing them :)

All in all, this argument is *much* too often used as an excuse to simply
not go to any possible length to make your system testable as far as it
possibly can be. And in my experience, that's further than most people
think. As a consequence, quality & stability as well as design of
the application suffer.

> A rule that unit tests are used only near a release or a milestone is
> healthy in that sense, I think.
> (And a quick edit-(real)run-interact cycle is good for speed)

Nope, not in my opinion. Making tests an afterthought may well lead to
them being written carelessly, not capturing corner-cases you
encountered while actually developing, and I don't even buy the speed
argument, as I already said - most of the time, the computer is faster
than you at setting up the environment for testing, however complex that
may be.

Testing is no silver bullet. But it's a rather mighty sword.. :)

Diez
 
Robert

>> Well, reloading is the thing I do most in coding practice :)
> I simply never do it. It has subtle issues, one of them you found,
> others you say you work around by introducing actual frameworks. But you
> might well forget some corner-cases & suddenly chase a chimera you deem
> a bug, that in fact is just an unwanted side-effect of reloading.

Well, at dev time there is a different rule: the more bugs, the better
(they can reveal certain design and coding weaknesses).

> And all this extra complexity is only good for the process of actually
> changing the code. It doesn't help you maintain code quality.

Neither does an improved editor or interactive/debugging/reload
scheme replace tests, nor do tests replace those the other way
around. They are different dimensions. The benefits on all levels
just radiate into each other, of course ...
e.g. with a good reload scheme one can even work out the tests
better (and debug more efficiently when the tests fail).

That little reload support code is a rather constant small stub
compared to the app size (except maybe for trivial 1-day apps).
Most 'utility' modules don't need extra care at all.
Add maybe 1 .. 5 extra lines for each of the few frequently changed GUI
classes (when well organized) and some 5..10 lines for
preserving/re-fixing the few application data anchors. That's all.
There is no need to get it fully consistent, as it serves its purpose when
editing during runtime is possible in 'most cases'. And most edits
during debug sessions are typically just small fixes and
touch-ups of function code. One fix revealing the next follow-up bug
... beautifying things .. as it goes.
Critical data structure changes are very rare occasions.

A general reload scheme ("edit at runtime") most effectively zeroes out
a core time consumer while exploring, iterating,
debugging, smoothing ..
On the time scale of these tasks, this effect can, in my opinion, by
far not be matched by setup code of whatever kind in
non-trivial apps. (I did/do this before, and with less dynamic
programming languages.)

> Let me assure you - it works :)
>
> For example, yesterday I created a full CRUD-interface for a web-app
> (which is the thing I work on mostly these days) without *once* taking a
> look at the browser. I wrote actions, forms, HTML, and tests alongside,
> developed the thing until it was ready, asserted certain constraints and
> error-cases, and once finished, fired up the browser - and he saw, it worked!
>
> Yes, I could have written that code on the fly, hitting F5 every few
> seconds/minutes to see if things work out (instead of just running the
> specific tests through nose) - and once finished, I wouldn't have had
> anything permanent that ensured the functionality over time.

Well, in this 1-day example you coded a thing which you had obviously
already done similarly several times. Still, I guess you had
some debug sessions too, some exploration of new things and new
aspects, profiting e.g. particularly from the Python interactive
shell / interactive debugger, post mortem etc. ..
unless you type as perfectly from scratch as that guy in Genesis 1 :)

> To me, that's just as much a religious statement, often heard from people
> who aren't (really) into test-driven development. By which I personally
> don't mean the variant where one writes tests first, and then code. I
> always develop both in lock-step, sometimes introducing a new feature
> first in my test as e.g. new arguments, or new calls, and then
> implementing them, but as often the other way round.
>
> The argument is always a variation of "my problem is too complicated, the
> code-base too intertwined to make it possible to test this".

Well, nothing against preaching about tests ;-) , unless it's too
much.
As with every extreme, there is also a threshold beyond which adding
more tests no longer gains you anything on the bottom line. There are
costs too, and other bottlenecks ...

This is not against writing tests for testing/validating/stabilizing
and other indirect high-level benefits. There are simply other
dimensions too, which are worth a thought or two. The question
of a good reload scheme is more oriented towards the
debugging/interactive/exploration/editing level: things where
Python in particular opens new dimensions through its superior dynamic
and self-introspective nature.

> I call this a bluff. You might work with a code-base that makes it
> harder than needed to write tests for new functionality. But then, most
> of the time this is a sign of lack of design. Writing with testability
> in mind makes you think twice about how to properly componentize your
> application, clearly separate logic from presentation, validates
> API-design because using the API is immediately done when writing the
> tests you need, and so forth.

Yes, writing tests can also induce better code modularization.
A good editor, a good debugging/reload scheme etc. also radiate...
The test runner can be connected to the post mortem
debugger/interactive shell and so on.

> As I said, I mainly do web these days, which can be considered GUIs as
> well. Testing the HTTP-interface is obviously easier & possible, and is
> what I described earlier.
>
> But we also use selenium to test JS-driven interfaces, as now the
> complexity of the interface rises, with all the bells & whistles of
> ajaxiness and whatnot.

(The CRUD approach on the 'form handling I/O level' is typically
simpler with regard to writing tests than GUI programming - because
of the atomic operations and the straightforward interface. It is similar
to algorithm and I/O testing.
Whereas writing tests for a flashy JS/CSS-heavy multi-language
multi-state web GUI (with subtle user interactions) is perhaps
about as complex as doing it for a desktop GUI app, I think.
)

> Nope, not in my opinion. Making tests an afterthought may well lead to
> them being written carelessly, not capturing corner-cases you

Anyway, one can formulate/write tests for each error/problem/design
question which one thinks is worth a test.

An interesting question, however, may be whether the tests (both unit
tests and automatic code checks) should be _run_ permanently - in order
to have, let's say, a 'zero test dump status' overall every few
minutes, on the time scale of editing/exploring/debugging?

I think that one doesn't lose the safety-net effect, but one
gains overall validation power from the tests, when one _uses_ the
(new and old) tests not too often: more rarely/later/before the
'release'. Because after many things have been rewired in the code
(incl. test code) for a milestone/release step, each failure which
arises fresh lets you think about and cure the network effects and
deeper effects of errors in common, similar contexts etc.

Whereas when one makes the code fit the 'few artificial
tests' (which are always very/too few!) too fast, on the wrong time
scale, it's like in that example of hasty antibiotics
application/abuse: the cures for the failures then tend to be too
short-sighted. Symptom curing, while the clever bugs arise in the
background ...

Having freshly written tests sit unused for some time is no problem,
because they will induce their own debug session sooner or later ..

> Testing is no silver bullet. But it's a rather mighty sword.. :)

I'd say testing has its place amongst other things and dimensions
like (incomplete):

Editor: type, browse
Language: formulate
Interactive: inspect, try
Debug: inspect, fix
Reload: fix, iterate, explore, clean
Design: organize
Code checks: stabilize
Unit tests: stabilize
Feedback: realize

Each of these can be improved, with an effect on overall speed. He saw:

If you have no good editor, dev speed is some 1.5 .. 2x
lower. If you have no Python (Ruby, Groovy...), dev speed is some 1.5
.. 2x lower. If you have no good interactive/debugging scheme,
dev speed is some 1.5 .. 2x lower. If you have no improved
reload scheme, dev speed is another 1.5 .. 2x lower. If you
have no good design scheme, dev speed is another 1.5 .. 2x
lower. If you have no good code checks, dev speed is another 1.5 .. 2x
lower. If you have no good test scheme, dev speed is another
1.5 .. 2x lower. If you have no good bug report scheme,
dev speed is another 1.5 .. 2x lower. ...

An improved reload scheme may even speed things up at the center of the
development wheel: iteration. I guess I underrated...


Robert
 
