Best practise implementation for equal by value objects

Slaunger · Aug 6, 2008

Hi,

I am new here and relatively new to Python, so be gentle:

Is there a recommended generic implementation of __repr__ for objects
equal by value to ensure that eval(repr(x)) == x independent of which
module the call is made from?

Example:

class Age:

    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, obj):
        return self.age == obj.age

    def __repr__(self):
        return self.__class__.__name__ + \
               "(%r)" % self.age

age_ten = Age(10)
print repr(age_ten)
print eval(repr(age_ten))
print eval(repr(age_ten)).age

Running this gives

Age(10)
Age(10)
10

Exactly as I want to.

The problem arises when the Age class is imported into another module
in another package, as then there is a package prefix and the above
implementation of __repr__ does not work.

I have then experimented with doing something like

    def __repr__(self):
        return self.__module__ + '.' + self.__class__.__name__ + \
               "(%r)" % self.age

This seems to work when called from the outside, but not from the
inside of the module. That is, if I rerun the script above with the
module name prefixed to the representation, I get the following error

Traceback (most recent call last):
  File "valuetest.py", line 15, in <module>
    print eval(repr(age_ten))
__main__.Age(10)
  File "<string>", line 1, in <module>
NameError: name '__main__' is not defined

This is pretty annoying.

My question is: Is there a robust generic type of implementation of
__repr__ which I can use instead?

This is something I plan to reuse for many different Value classes, so
I would like to get it robust.

Thanks,
Slaunger

Terry Reedy · Aug 6, 2008

Slaunger said:
Hi,

I am new here and relatively new to Python, so be gentle:

Is there a recommended generic implementation of __repr__ for objects
equal by value to ensure that eval(repr(x)) == x independent of which
module the call is made from?

The CPython implementation gives up on that goal and simply prints
<modname.classname object at address> for at least two reasons ;-).

1. In general, it requires fairly sophisticated analysis of __init__ to
decide what representation of what attributes to include and decide if
the goal is even possible. If an attribute is an instance of a user
class, then *its* __init__ needs to be analyzed. If an attribute is a
module, class, or function, there is no generic evaluable representation.

2. Whether eval(repr(x)) even works (returns an answer) depends on
whether the name bindings in the globals and locals passed to eval
(which by default are the globals and locals of the context of the eval
call) match the names used in the repr. You discovered that to a first
approximation, this depends on whether the call to repr comes from
within or without the module containing the class definition. But the
situation is far worse. Consider 'import somemod as m'. Even if you
were able to introspect the call and determine that it did not come from
somemod**, prepending 'somemod.' to the repr *still* would not work.
Or, the call to repr could come from one context, the result saved and
passed to another context with different name bindings, and the eval
call made there. So a repr that can be eval'ed in any context is hopeless.
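
A minimal sketch of point 2, written in Python 3 syntax (the thread's own code is Python 2). The Age class follows the example from the original post; the added isinstance check and __hash__ are assumptions for illustration, as are the dictionaries passed to eval, which stand in for the caller's globals.

```python
class Age(object):
    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, other):
        return isinstance(other, Age) and self.age == other.age

    def __hash__(self):
        return hash(self.age)

    def __repr__(self):
        return "%s(%r)" % (self.__class__.__name__, self.age)


text = repr(Age(10))  # the string "Age(10)"

# The name "Age" is bound in the namespace handed to eval: this works.
bound = eval(text, {"Age": Age}) == Age(10)

# Same class, but reachable only under a different name (comparable to
# "import somemod as m"): the string can no longer be evaluated.
try:
    eval(text, {"A": Age})
    unbound_ok = True
except NameError:
    unbound_ok = False
```

The repr string is identical in both cases; only the name bindings visible to eval differ.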

If this is a practical rather than theoretical question, then use your
first repr version that uses the class's definition name and only eval
the result in a context that has that name bound to the class object.

from mymod import Age
#or
import mymod
Age = mymod.Age

#in either case
eval(repr(Age(10))) == Age(10)

**
While such introspection is not part of the language, I believe one
could do it in CPython, but I forgot the details. There have been
threads like 'How do I determine the caller function' with answers to
that question, and I presume the module of the caller is available also.

Terry Jan Reedy

John Krukoff · Aug 6, 2008

Are you really sure this is what you want to do, and that a less tricky
serialization format such as that provided by the pickle module wouldn't
work for you?

Steven D'Aprano · Aug 7, 2008

Hi,

I am new here and relatively new to Python, so be gentle:

Is there a recommended generic implementation of __repr__ for objects
equal by value to ensure that eval(repr(x)) == x independent of which
module the call is made from?

In general, no.

....

My question is: Is there a robust generic type of implementation of
__repr__ which I can use instead?

This is something I plan to reuse for many different Value classes, so I
would like to get it robust.

I doubt you could get it that robust, nor is it intended to be.

eval(repr(obj)) giving obj is meant as a guideline, not an invariant --
there are many things that can break it. For example, here's a module
with a simple class:

# Parrot module
class Parrot(object):
    def __repr__(self):
        return "parrot.Parrot()"
    def __eq__(self, other):
        # all parrots are equal
        return isinstance(other, Parrot)

Now let's use it:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'parrot' is not defined
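
One way a NameError like this can arise, sketched in Python 3. This is an illustrative assumption, not the session from the post: the in-memory parrot module built with types.ModuleType and the namespace dictionary are hypothetical stand-ins for a real parrot.py and an interactive session's globals.

```python
import types

# Hypothetical stand-in for a real "parrot" module on disk.
parrot = types.ModuleType("parrot")


class Parrot(object):
    def __repr__(self):
        return "parrot.Parrot()"

    def __eq__(self, other):
        # all parrots are equal
        return isinstance(other, Parrot)


parrot.Parrot = Parrot

p = parrot.Parrot()
s = repr(p)

# While the name "parrot" is bound, the round trip succeeds.
namespace = {"parrot": parrot}
works = eval(s, namespace) == p

# Once that binding is gone, the very same string fails to evaluate.
del namespace["parrot"]
try:
    eval(s, namespace)
    failed = False
except NameError:
    failed = True
```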

If you look at classes in the standard library, they often have reprs
like this:
'<timeit.Timer instance at 0xb7f14bcc>'

Certainly you can't expect to successfully eval that!

I believe the recommendation for eval(repr(obj)) to give obj again is
meant as a convenience for the interactive interpreter, and even there
only for simple types like int or list. If you can do it, great, but if
it doesn't work, so be it. You're not supposed to rely on it, and it's
not meant as a general way to serialize classes.

Slaunger · Aug 7, 2008

The CPython implementation gives up on that goal and simply prints
<modname.classname object at address> for at least two reasons ;-).

1. In general, it requires fairly sophisticated analysis of __init__ to
decide what representation of what attributes to include and decide if
the goal is even possible. If an attribute is an instance of a user
class, then *its* __init__ needs to be analyzed. If an attribute is a
module, class, or function, there is no generic evaluable representation.

OK, so the situation is more complicated than that. In the case here,
though, the attributes would always be simple built-in types, where
eval(repr(x)) == x, or user-defined equal-by-value classes that I have
control over.

The classes I am making are struct-type classes with some added
functionality for human-readable string representation, and for packing
into a stream or unpacking from a stream using a "private" class Struct.

I come from a Java and JUnit world, where I am used to always
overriding the default reference-based implementations of the equals(),
toString(), and hashCode() methods for "equals-by-value" objects, such
that they work well and efficiently in, e.g., hash maps.
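
For comparison, a hedged sketch of the rough Python counterparts of equals(), hashCode() and toString(), namely __eq__, __hash__ and __repr__, in Python 3 syntax. The Age class follows the original post; the isinstance check, __ne__ and the choice of hash are assumptions for illustration.

```python
class Age(object):
    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, other):
        # Value equality; defer to the other operand for foreign types.
        if not isinstance(other, Age):
            return NotImplemented
        return self.age == other.age

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result

    def __hash__(self):
        # Equal objects must hash equally to behave well as dict keys.
        return hash((Age, self.age))

    def __repr__(self):
        return "%s(%r)" % (self.__class__.__name__, self.age)


# Two distinct but equal instances address the same dictionary entry.
labels = {Age(10): "ten"}
lookup = labels[Age(10)]
unique = len({Age(1), Age(1), Age(2)})
```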

With my switch-over to Python, I looked for equivalent features and
stumbled over the eval(repr(x)) == x recommendation. It is not that I
actually (yet) need the repr implementations, but mostly that I find the
condition very useful in PyUnit to check in a test that I have remembered
to initialize all instance fields in __init__ and that I have remembered
to include all relevant attributes in the __eq__ implementation.

Whereas this worked fine in a unit test module dedicated to testing only
the specific module, the test failed when called from other test package
modules that wrap the unit tests from several unit test modules.
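
A hedged sketch of such a test in Python 3 syntax, written so that it does not depend on how the module under test was imported. Handing eval an explicit namespace is an assumption about one workable convention, not something established in the thread at this point; the test class and method names are hypothetical.

```python
import unittest


class Age(object):
    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, other):
        return isinstance(other, Age) and self.age == other.age

    def __hash__(self):
        return hash(self.age)

    def __repr__(self):
        return "%s(%r)" % (self.__class__.__name__, self.age)


class AgeRoundTripTest(unittest.TestCase):
    def test_round_trip_with_bound_name(self):
        x = Age(10)
        # The test supplies the binding itself, so the importing style
        # of the surrounding test package no longer matters.
        self.assertEqual(eval(repr(x), {"Age": Age}), x)

    def test_unbound_name_raises(self):
        # Without the binding, evaluating the repr fails.
        self.assertRaises(NameError, eval, repr(Age(10)), {})


suite = unittest.defaultTestLoader.loadTestsFromTestCase(AgeRoundTripTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```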

2. Whether eval(repr(x)) even works (returns an answer) depends on
whether the name bindings in the globals and locals passed to eval
(which by default are the globals and locals of the context of the eval
call) match the names used in the repr. You discovered that to a first
approximation, this depends on whether the call to repr comes from
within or without the module containing the class definition. But the
situation is far worse. Consider 'import somemod as m'. Even if you
were able to introspect the call and determine that it did not come from
somemod**, prepending 'somemod.' to the repr *still* would not work.
Or, the call to repr could come from one context, the result saved and
passed to another context with different name bindings, and the eval
call made there. So a repr that can be eval'ed in any context is hopeless.

Ok, nasty stuff

If this is a practical rather than theoretical question, then use your
first repr version that uses the class's definition name and only eval
the result in a context that has that name bound to the class object.

from mymod import Age
#or
import mymod
Age = mymod.Age

#in either case
eval(repr(Age(10))) == Age(10)

Yes, it is mostly from a practical point of view, although I was surprised
that I could not find more material on it in the Python documentation
or mailing groups. I might just do what you suggest in the unit
test modules to at least make it robust in that context.

Hmm... a bit of a disappointment for me that this cannot be done
more cleanly.

**
While such introspection is not part of the language, I believe one
could do it in CPython, but I forgot the details. There have been
threads like 'How do I determine the caller function' with answers to
that question, and I presume the module of the caller is available also.

OK, I think CPython is, for the moment, too much new stuff to dig into.
Just grasping some of the possibilities in the API, and how to do
things the right way, is giving me enough challenges for now...

Terry Jan Reedy

Again, thank you for your thorough answer,

Slaunger

Slaunger · Aug 7, 2008

Are you really sure this is what you want to do, and that a less tricky
serialization format such as that provided by the pickle module wouldn't
work for you?

Well, it is not so much for serialization yet (although I have not yet
fully understood the implications); it is more because I think
eval(repr(x)) == x is a nice unit test to make sure my constructor and
equals method are implemented correctly (that I have remembered all
attributes in their implementations).

As mentioned above, I may go for a more pragmatic approach, where I
only use repr if the class is imported in the "standard" way.

Cheers,
Slaunger

Slaunger · Aug 7, 2008

In general, no.

...
OK.

I doubt you could get it that robust, nor is it intended to be.

eval(repr(obj)) giving obj is meant as a guideline, not an invariant --
there are many things that can break it. For example, here's a module
with a simple class:

OK, I had not fully understood the implications of *not* implementing
__repr__ such that eval(repr(x)) == x, so I just tried to make it work to
make sure life would be easy for me and my objects as I went further into
the Python jungle.

As mentioned above, I also find the eval(repr(x)) == x condition
convenient from a unit test point of view.

# Parrot module
class Parrot(object):
    def __repr__(self):
        return "parrot.Parrot()"
    def __eq__(self, other):
        # all parrots are equal
        return isinstance(other, Parrot)

Now let's use it:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'parrot' is not defined

OK, I see, but this isn't exactly eval(repr(x)) == x but

    s = repr(x)
    eval(s) == x

so, of course, if s is deleted in between it won't work.

In my implementation I only expect this to work as a one-liner.

If you look at classes in the standard library, they often have reprs
like this:

'<timeit.Timer instance at 0xb7f14bcc>'

Yes, I noticed that. But the example here is also an object, which is
equal by reference, not value. And for these
it does not make so much sense to evaluate the representation.

Certainly you can't expect to successfully eval that!

I believe the recommendation for eval(repr(obj)) to give obj again is
meant as a convenience for the interactive interpreter, and even there
only for simple types like int or list. If you can do it, great, but if
it doesn't work, so be it. You're not supposed to rely on it, and it's
not meant as a general way to serialize classes.

OK, I will put less emphasis on it in the future.

Thank you for taking your time to answer.

Slaunger

Terry Reedy · Aug 7, 2008

Slaunger said:
On 6 Aug., 21:36, Terry Reedy wrote:

OK, so the situation is more complicated than that. In the case here,
though, the attributes would always be simple built-in types, where
eval(repr(x)) == x, or user-defined equal-by-value classes that I have
control over.

I think most would agree that a more accurate and informative
representation is better than a general representation like Python's
default. For instance,

>>> a=range(2,10,2) # 3.0
>>> a
range(2, 10, 2)

is nicer than <class 'range' object at ######>.

So when the initializers for instances are all 'nice' (as for range), go
for it (as in 'Age(10)'). And test it as you are by eval'ing the repr.
Just accept that the eval will only work in contexts with the class name
bound to the class. For built-ins like range, it always is, by default
-- unless masked by another assignment!

Terry Jan Reedy

Paul Rubin · Aug 7, 2008

Terry Reedy said:
So when the initializers for instances are all 'nice' (as for range),
go for it (as in 'Age(10)'). And test it as you are by eval'ing the
repr. Just accept that the eval will only work in contexts with the
class name bound to the class. For built-ins like range, it always is,
by default -- unless masked by another assignment!

Eval is extremely dangerous. Think of data from untrusted sources,
then ask yourself how well you really know where ALL your data came
from. It's preferable to avoid using it that way. There have been a
few "safe eval" recipes posted here and at ASPN. It would be good if
one of them made it into the standard library. Note that pickle
(which would otherwise be an obvious choice for this) has the same
problems, though not as severely as flat-out evalling something.
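
A hedged sketch of a more restrictive alternative, in Python 3 syntax: ast.literal_eval, which was added to the standard library in Python 2.6, accepts only literal expressions (numbers, strings, tuples, lists, dicts and the like) and refuses calls. It therefore cannot evaluate "Age(10)" directly; the state() and from_state() helpers below are hypothetical names for one way to work around that.

```python
import ast


class Age(object):
    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, other):
        return isinstance(other, Age) and self.age == other.age

    def __hash__(self):
        return hash(self.age)

    def state(self):
        # Hypothetical helper: constructor arguments as a literal tuple.
        return repr((self.age,))

    @classmethod
    def from_state(cls, text):
        # literal_eval parses literals only; no names are looked up
        # and no functions are called.
        return cls(*ast.literal_eval(text))


restored = Age.from_state(Age(10).state())

# A string that plain eval would happily execute is rejected here.
try:
    ast.literal_eval("__import__('os').getcwd()")
    rejected = False
except (ValueError, SyntaxError):
    rejected = True
```

This works only when every constructor argument is itself a plain literal.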

Slaunger · Aug 8, 2008

Eval is extremely dangerous. Think of data from untrusted sources,
then ask yourself how well you really know where ALL your data came
from. It's preferable to avoid using it that way. There have been a
few "safe eval" recipes posted here and at ASPN. It would be good if
one of them made it into the standard library. Note that pickle
(which would otherwise be an obvious choice for this) has the same
problems, though not as severely as flat-out evalling something.

Thank you for pointing out the dangers of eval. I think you are right
to caution about it. In my particular case it is a closed-loop system, so
there is no danger there, but that certainly could have been an issue.

That caution should perhaps be mentioned in
http://docs.python.org/lib/built-in-funcs.html

Slaunger · Aug 8, 2008

I think most would agree that a more accurate and informative
representation is better than a general representation like Python's
default. For instance,
>>> a=range(2,10,2) # 3.0
>>> a
range(2, 10, 2)

is nicer than <class 'range' object at ######>.

So when the initializers for instances are all 'nice' (as for range), go
for it (as in 'Age(10)'). And test it as you are by eval'ing the repr.
Just accept that the eval will only work in contexts with the class name
bound to the class. For built-ins like range, it always is, by default
-- unless masked by another assignment!

OK, I am encouraged to carry on my quest with eval(repr(x)) for my
'nice' classes. I just revisited the documentation for eval and noticed
that there are optional globals and locals namespace arguments that one
can specify:

http://docs.python.org/lib/built-in-funcs.html

Quite frankly, I do not understand how to make use of these parameters,
but it is my feeling that if I enforce a convention of always specifying
the globals/locals parameters in a specific manner,

    assert eval(repr(x), globals, locals) == x

would work independent of how I have imported the module under test.

Now, I just need to figure out if this is right and how to specify the
globals and locals if that is not too cumbersome...
or maybe I am just over-engineering...
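
A hedged sketch of what such a convention could look like, in Python 3 syntax. The helper round_trips and the Person class are hypothetical names introduced for illustration. The second argument to eval is an ordinary dictionary mapping names to objects, so the check can supply exactly the bindings that the repr string mentions.

```python
class Age(object):
    def __init__(self, an_age):
        self.age = an_age

    def __eq__(self, other):
        return isinstance(other, Age) and self.age == other.age

    def __hash__(self):
        return hash(self.age)

    def __repr__(self):
        return "Age(%r)" % (self.age,)


class Person(object):
    # A value class whose attribute is another value class.
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def __eq__(self, other):
        return (isinstance(other, Person)
                and (self.name, self.age) == (other.name, other.age))

    def __hash__(self):
        return hash((self.name, self.age))

    def __repr__(self):
        return "Person(%r, %r)" % (self.name, self.age)


def round_trips(obj, extra_names=None):
    """Evaluate repr(obj) with obj's class bound under its own name."""
    namespace = {type(obj).__name__: type(obj)}
    if extra_names:
        namespace.update(extra_names)
    return eval(repr(obj), namespace) == obj


simple_ok = round_trips(Age(10))
# The nested Age appears in Person's repr, so it must be bound as well.
nested_ok = round_trips(Person("Ann", Age(10)), {"Age": Age})
```

Every class named anywhere in the repr has to be supplied, which is the price of not relying on the caller's own globals.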

Steven D'Aprano · Aug 8, 2008

OK, I am encouraged to carry on my quest with eval(repr(x)) for my
'nice' classes. I just revisited the documentation for eval and noticed
that there are optional globals and locals namespace arguments that one
can specify:

http://docs.python.org/lib/built-in-funcs.html

Quite frankly, I do not understand how to make use of these parameters,
but it is my feeling that if I enforce a convention of always specifying
the globals/locals parameters in a specific manner,

    assert eval(repr(x), globals, locals) == x

would work independent of how I have imported the module under test.

Now, I just need to figure out if this is right and how to specify the
globals and locals if that is not too cumbersome... or maybe I am just
over-engineering...

I think it is infeasible for the repr() of an object to know where it was
imported from. Including the globals and locals in the call to eval()
won't help you, because they can have changed between the time you
created the instance and the time you call repr(). Consider:

True

So far so good! But now watch this, starting in a fresh session:

What should repr(x1) and repr(x2) be, for your invariant eval(repr(x))==x
to hold?

It gets better (or worse):

alist = [None, t, None]
del t, timedate
x3 = alist[1](20, 21, 22)
assert x1 == x2 == x3

What should the repr() of x1, x2, x3 be now?

Bringing this back to the unittests... as I see it, there's an easy way
to solve your problem. In the unittest, just do something like the
following:

# repr(x) looks like "module.class(arg)",
# but we actually import it as package.module.submodule.class
module = package.module.submodule
assert eval(repr(x)) == x

I think that is all you need to do.
