How to validate the __init__ parameters


Denis Doria

Hi;

I'm looking for the best way to validate attributes inside a class. Of
course I could use a property to check them, but I really want to do it
inside __init__:

class A:
    def __init__(self, foo, bar):
        self.foo = foo  # check if foo is correct
        self.bar = bar

All the examples of property that I saw didn't show a way to do it in
__init__. Just to clarify: I don't want to check whether the parameter
is an int or something like that; I want to check that the parameter does
not use more than X chars, and I want to do it when I'm 'creating' the
instance, not after creation:

a = A('foo', 'bar')

not

class A:
    def __init__(self, foo=None, bar=None):
        self._foo = foo
        self._bar = bar
    def set_foo(self, foo):
        if len(foo) > 5:
            raise <something>
        self._foo = foo
    foo = property(fset=set_foo)

a = A()
a.foo = 'foo'


I thought of something like:

class A:
    def __init__(self, foo=None, bar=None):
        self.set_foo(foo)
        self._bar = bar
    def set_foo(self, foo):
        if len(foo) > 5:
            raise <something>
        self._foo = foo
    foo = property(fset=set_foo)

But it looks too much like Java.
 

Jean-Michel Pichavant

Denis said:
(snip)
One possible way, straight and simple:

class A:
    def __init__(self, foo=None, bar=None):
        if len(foo) > 5:
            raise ValueError('foo cannot exceed 5 characters')
        self._foo = foo
        self._bar = bar


JM
 

Alf P. Steinbach

* Denis Doria:
(snip)

But looks too much like java

Yeah.

If member _foo has this constraint regardless, then it's logically part of that
member's type, so make it a type:

class _Foo:
    def __init__( self, seq ):
        if seq is None:
            self.items = []
        elif len( seq ) > 5:
            raise <something>
        else:
            self.items = seq

class A: # Your example
    def __init__( self, foo = None, bar = None ):
        self._foo = _Foo( foo )
        self._bar = bar
    def set_foo( self, foo ):
        self._foo = _Foo( foo )
    foo = property( fset = set_foo )


Cheers & hth.,

- Alf
 

r0g

Denis said:
(snip)



I use assertions myself e.g.

>>> foo = 'too long'
>>> assert len(foo) <= 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError


Dunno if this would be considered good or bad programming practice by
those more experienced than I (comments always welcome!), but it works for
me :)


Roger.
 

Steven D'Aprano

Hi;

I'm checking the best way to validate attributes inside a class.

There is no "best way", since it depends on personal taste.

Of
course I can use property to check it, but I really want to do it inside
the __init__:

If you "really want to do it inside the __init__", then copy the code
that you would put in the property's setter into the __init__ method. But
why do you care that the check is inside the __init__ method? All that is
really important is that the __init__ method *calls* the check, not where
the check lives.


class A:
def __init__(self, foo, bar):
self.foo = foo #check if foo is correct
self.bar = bar

Here are some ways of doing this, more or less in order of complexity and
difficulty.


(1) Don't validate at all. Just document that foo must be no more than
five characters long, and if the caller pays no attention and passes a
too-long string, any explosions that happen are their fault, not yours.

class A:
    """Class A does blah blah.

    If foo is longer than five characters, behaviour is undefined.
    """
    def __init__(self, foo=None, bar=None):
        self.foo = foo
        self.bar = bar

(This may seem silly, but for more difficult constraints which are hard
to test, it may be your only choice.)


(2) Validate once only, at initialisation time:

class A:
    def __init__(self, foo=None, bar=None):
        if len(foo) > 5:
            raise ValueError("foo is too big")
        self.foo = foo
        self.bar = bar


Note that further assignments to instance.foo will *not* be validated.


(3) Move the validation into a method. This is particularly useful if the
validation is complex, or if you expect to want to over-ride it in a sub-
class.

class A:
    def __init__(self, foo=None, bar=None):
        self.validate(foo)
        self.foo = foo
        self.bar = bar
    def validate(self, foo):
        if len(foo) > 5:
            raise ValueError("foo is too big")


Further assignments to instance.foo are still not validated.


(4) Validate on every assignment to foo by using a property. Note that
for this to work, you MUST use a new-style class. In Python 3, you don't
need to do anything special, but in Python 2.x you must inherit from
object:

class A(object):
    def __init__(self, foo=None, bar=None):
        self.foo = foo  # calls the property setter
        self.bar = bar
    def _setter(self, foo):
        if len(foo) > 5:
            raise ValueError("foo is too big")
        self._foo = foo
    def _getter(self):
        return self._foo
    foo = property(_getter, _setter, None, "optional doc string for foo")


If you think this looks "too much like Java", well, sorry, but that's
what getters and setters look like. But note that you never need to call
the getter or setter explicitly, you just access instance.foo as normal.


(5) Use explicit Java-style getter and setter methods. This avoids using
property, so it doesn't need to be a new-style class:

class A:
    def __init__(self, foo=None, bar=None):
        self.setfoo(foo)
        self.bar = bar
    def setfoo(self, foo):
        if len(foo) > 5:
            raise ValueError("foo is too big")
        self._foo = foo
    def getfoo(self):
        return self._foo

If you want to write Java using Python syntax, this may be the solution
for you. But be aware that Python programmers will laugh at you.


(6) If the constraint on foo is significant enough, perhaps it should be
made part of the type of foo.

class FooType(str):  # or inherit from list, or whatever...
    def __init__(self, x):
        if len(x) > 5:
            raise ValueError("initial value is too big, invalid FooType")


class A:
    def __init__(self, foo=None, bar=None):
        self.foo = FooType(foo)
        self.bar = bar


Other, more complex solutions include using decorators or even metaclass
programming. Don't worry about these at this point, I'm just showing off
*wink*
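For the curious, here is roughly what the decorator approach can look like. This is my own illustration, not code from anyone in this thread; the decorator name `validated` and its keyword-only calling convention are invented to keep the sketch short:

```python
def validated(**checks):
    """Wrap __init__ so each named keyword argument is run through its
    check function before the real initializer sees it.

    Keyword arguments only, to keep the sketch simple.
    """
    def decorate(init):
        def wrapper(self, **kwargs):
            for name, check in checks.items():
                if name in kwargs and not check(kwargs[name]):
                    raise ValueError("invalid value for %r: %r"
                                     % (name, kwargs[name]))
            return init(self, **kwargs)
        return wrapper
    return decorate

class A(object):
    @validated(foo=lambda s: len(s) <= 5)
    def __init__(self, foo=None, bar=None):
        self.foo = foo
        self.bar = bar
```

With this, A(foo='spam') constructs normally, while A(foo='too long') raises ValueError before any attribute is assigned.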


Hope this helps,
 

Lie Ryan

Hi;

I'm checking the best way to validate attributes inside a class. Of
course I can use property to check it, but I really want to do it
inside the __init__:

class A:
    def __init__(self, foo, bar):
        self.foo = foo  # check if foo is correct
        self.bar = bar

A nice sugar to do that:

import functools

class CheckError(Exception): pass

def func_check(*argcheckers):
    def _checked(func):
        def _function(*args):
            res = [(check(arg), check, arg)
                   for check, arg in zip(argcheckers, args)]
            if all(r[0] for r in res):
                return func(*args)
            else:
                raise CheckError([r for r in res if not r[0]])
        return _function
    return _checked

# Prepend a checker that always passes, so the 'self' argument is skipped.
method_check = functools.partial(func_check, lambda a: True)

##########################################################
def check_foo(arg):
    return 5 <= arg <= 10

def check_bar(arg):
    return 10 <= arg < 20

class A(object):
    @method_check(check_foo, check_bar)
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
 

Steven D'Aprano

I use assertions myself e.g.
(snip)

Dunno if this would be considered good or bad programming practice by
those more experienced than I (comments always welcome!), but it works for
me :)


Bad practice.

Assertions are ignored when you run Python with the -O (optimization)
command line switch. Your code will behave differently if you use
assertions. So you should never use assertions for error-checking, except
perhaps for quick-and-dirty throw-away scripts.

Assertions are useful for testing invariants and "this will never happen"
conditions. A trivial example:

result = math.sin(2*math.pi*x)
assert -1 <= result <= 1

Assertions are also useful in testing, although be careful: since the
assert statement is ignored when running with -O, that means you can't
test your application when running with -O either! But do not use them
for data validation unless you're happy to run your application with no
validation at all.
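The difference comes down to the built-in `__debug__` flag: under `python -O` it is False and every `assert` statement is compiled away entirely, while an explicit test-and-raise always runs. A minimal illustration of the two styles, using the thread's five-character example (my own sketch):

```python
def validate_with_assert(foo):
    # Compiled away completely under python -O: no check happens at all.
    assert len(foo) <= 5, "foo too long"

def validate_always(foo):
    # An explicit test-and-raise survives -O.
    if len(foo) > 5:
        raise ValueError("foo too long")
```

Run normally, both reject a six-character string; run with `-O`, only validate_always does.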
 

r0g

Steven said:
Bad practice.
(snip)


Yikes, glad to be set straight on that one! Thanks :) It's a pity
though, I really like the way it reads. Is there anything similar which
ISN'T disabled when optimizations are turned on?

I guess I could do the following but it seems a bit ugly in comparison :(

if len(foo) > 5: raise Exception



Roger.
 

Steve Holden

r0g said:
Yikes, glad to be set straight on that one! Thanks :) It's a pity
though, I really like the way it reads. Is there anything similar which
ISN'T disabled when optimizations are turned on?

I guess I could do the following but it seems a bit ugly in comparison :(

if len(foo) > 5: raise Exception

OK, so here's a revolutionary thought. Since you have now discovered that

a) you can't rely on an AssertionError being raised by an
assert that asserts an invariant;

b) the alternative of explicitly raising an exception is at least
somewhat uglier; and

c) you have to do *something*. Er, actually, do you?

What's the exact reason for requiring that a creator argument be of a
specific type? So operations on the instances don't go wrong? Well, why
not just admit that we don't have control over everything, and just *let
things go wrong* when the wrong type is passed?

What will then happen? Why, an exception will be raised!

You might argue that the user, seeing the traceback from the exception,
won't know what to make of it. In response to which I would assert (pun
intended) that neither would the traceback produced by the assertion
failure.

Given that somebody who knows what they are talking about has to cast
their eye over a traceback to understand what's wrong with the program,
it's not much extra effort for that person realise that a method has
been called with an invalid argument type. I accept that there are
legitimate arguments in opposition to this point of view, but it has a
certain engineering efficiency (as long as we are prepared to work
inside frameworks like Django that will encapsulate any error tracebacks
so that you see all the information that you need, and the user gets
some other inexplicable error message like "A programmer probably
screwed up").

In other words, "be prepared to fix your program when it goes wrong" is
a valid alternative to trying to anticipate every single last thing that
might go wrong. This philosophy might not be appropriate for
extra-terrestrial exploration, but most Python programmers aren't doing
that.

regards
Steve
 
 

Steven D'Aprano

Yikes, glad to be set straight on that one! Thanks :) It's a pity
though, I really like the way it reads. Is there anything similar which
ISN'T disabled when optimizations are turned on?

Yes: an explicit test-and-raise.

if not condition:
    raise Exception


I guess I could do the following but it seems a bit ugly in comparison
:(

if len(foo) > 5: raise Exception

Write a helper function:

def fail_unless(condition, exception=Exception, msg=""):
    if not condition:
        raise exception(msg)


fail_unless(len(foo) <= 5, ValueError, 'string too long')
 

Bruno Desthuilliers

Denis Doria wrote:
Hi;

I'm checking the best way to validate attributes inside a class. Of
course I can use property to check it, but I really want to do it
inside the __init__:

If you use a property, you'll have the validation in the initializer AND
everywhere else too. If you care about validating the attribute in the
initializer then you probably want to make sure the attribute is
_always_ valid, don't you ???
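With the decorator syntax for properties (available since Python 2.6), this need not look very Java-like. A minimal sketch, my own illustration rather than Denis's code, of an attribute that is validated both in __init__ and on every later assignment:

```python
class A(object):
    def __init__(self, foo=None, bar=None):
        self.foo = foo          # goes through the property setter below
        self.bar = bar

    @property
    def foo(self):
        return self._foo

    @foo.setter
    def foo(self, value):
        # Allow None (the default) but reject over-long strings.
        if value is not None and len(value) > 5:
            raise ValueError("foo cannot exceed 5 characters")
        self._foo = value
```

Both A('toolong!') and a later a.foo = 'toolong!' now raise the same ValueError, so the attribute stays valid for the whole life of the instance.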
 

Bruno Desthuilliers

Steve Holden wrote:
(snip)
What's the exact reason for requiring that a creator argument be of a
specific type? So operations on the instances don't go wrong? Well, why
not just admit that we don't have control over everything, and just *let
things go wrong* when the wrong type is passed?

validation isn't only about types, but that's not the point...
What will then happen? Why, an exception will be raised!

Not necessarily.

def nocheck(stuffs):
    "'stuffs' is supposed to be a sequence of strings"
    for s in stuffs:
        do_something_with(s)


# correct call
good_stuffs = ("role1", "role2", "role3")
nocheck(good_stuffs)

# incorrect call, but the error passes silently
bad_stuffs = "role1"
nocheck(bad_stuffs)


If nocheck happens to be in the core of a huge lib or framework and
stuffs is defined somewhere in a huge client app's code you didn't even
write, then you might spend hours tracing the bug - been there, done
that, bought the tshirt :(

IOW: while I agree that not doing anything and letting exceptions
propagate is quite often the best thing to do, there are times when
validation makes sense - especially when passing in a wrong value might
NOT raise an exception - just produce unexpected results.

My 2 cents...
 

r0g

Bruno said:
(snip)

Yes, that's what I'm more concerned about. As Steve says, the easy ones
will cause an exception somewhere anyway but the really tricky bugs
might not trouble the interpreter at all and go on to cause errors and
corruption elsewhere that may not even be noticed immediately.

I'm sad there's no particularly elegant alternative (an 'ensure' keyword
maybe ;) as I do like how they read.

I think I'll carry on using them liberally anyway, armed with the
knowledge I can't rely on them in any code I distribute, and
deliberately raising errors when execution NEEDS to be stopped.

Thanks for the insight all.

Roger.
 

Lie Ryan

Bruno said:
(snip)

Let me give a more concrete example I've just stumbled upon recently:

This is a Tkinter Text widget's insert signature:

insert(self, index, chars, *args)
Insert CHARS before the characters at INDEX. An additional
tag can be given in ARGS. Additional CHARS and tags can follow in ARGS.

Let's simplify the signature to focus on one particular case (inserting
just one pair of text and tag):

insert(self, index, chars, tags)

I wanted to write a wrapper/subclass of the Tkinter Text widget; then I
read something that essentially means: tags must be a tuple of strings,
cannot be a list of strings, and if you pass the string "hello" it is
interpreted as the tags ('h', 'e', 'l', 'l', 'o').

This is a bug-prone signature, I could have written (and indeed just
minutes *after* reading the warning, though I caught it before any real
harm was done, I wrote):

insert(INSERT, chars="some text", tags="the_tag")

since I was intending to subclass Text anyway, I decided to throw an
"assert isinstance(tags, tuple)" in here rather than risk a bug caused
by mindlessly using a string. The only time the assert would fail is when
I'm writing a bug, and there is no reason *I* would want a
character-wise tag and write the tags as a string instead of a tuple.

It could have been more subtle, I might have written
insert(INSERT, chars="some text", tags="b")
and the code worked flawlessly, until I decided to copy'n'paste it to:
insert(INSERT, chars="some text", tags="em")
and wondered why I can't find the "em" tag.
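A cheap guard against that trap, independent of Tkinter itself (the helper name checked_tags is my own invention): refuse a bare string up front and normalise any other sequence to the tuple that Text.insert expects.

```python
def checked_tags(tags):
    """Return tags as a tuple of strings, refusing a bare string.

    Passing tags="em" straight to Text.insert would silently become the
    two single-character tags ('e', 'm'); catch that mistake up front.
    """
    if isinstance(tags, str):
        raise TypeError("tags must be a sequence of strings, "
                        "not a bare string")
    return tuple(tags)
```

A subclass's insert can then call text.insert(index, chars, checked_tags(tags)) and the "em" vs ('e', 'm') bug fails loudly instead of silently.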
 

Chris Rebert

On Mon, Jan 4, 2010 at 10:17 AM, Albert van der Horst wrote:
This triggers a question: I can see the traceback, but it
would be much more valuable, if I could see the arguments
passed to the functions. Is there a tool?

print(locals())  # this actually gives the bindings for the entire local
                 # namespace, but is less work than writing something more
                 # precise by hand.

It is amazing how handy the humble print() function/statement is when debugging.

Cheers,
Chris
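For a frame-by-frame view, the interpreter already keeps everything needed on the traceback object. A small hand-rolled sketch (the helper name frame_report is my own) that walks sys.exc_info() and collects each frame's function name and local variables:

```python
import sys

def frame_report():
    """Return (function name, locals dict) for every frame in the
    active exception's traceback, outermost first."""
    tb = sys.exc_info()[2]
    frames = []
    while tb is not None:
        frame = tb.tb_frame
        frames.append((frame.f_code.co_name, dict(frame.f_locals)))
        tb = tb.tb_next
    return frames

def divide(x, y):
    return x / y

try:
    divide(1, 0)
except ZeroDivisionError:
    report = frame_report()
    for name, local_vars in report:
        print(name, local_vars)
```

The innermost entry here shows divide was called with x=1 and y=0, which is the information missing from a plain traceback.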
 

Albert van der Horst

Steve Holden said:
(snip)

I can agree to a large extent with this reasoning.
However, let us consider an example.
I just solved Project Euler problem 269, something with cutting a
piece of paper at exact boundaries. At some point I reasoned
that at least one of 4 arguments must be greater than one,
so that an infinite recursion is prevented.
Now I greatly prefer to have an assert there and then,
instead of the infinite recursion, if my reasoning is wrong.
You know, it is not the only point where I made a similar
assumption, and the problem was complicated enough as it is
(for me).

This triggers a question: I can see the traceback, but it
would be much more valuable, if I could see the arguments
passed to the functions. Is there a tool?
In other words, "be prepared to fix your program when it goes wrong" is
a valid alternative to trying to anticipate every single last thing that
might go wrong. This philosophy might not be appropriate for
extra-terrestrial exploration, but most Python programmers aren't doing
that.

Funny that you make this distinction. There is less of a distinction
than you may think.
Even in embedded programming, where we anticipate as much as possible,
there is also the watch dog timer, that resets the whole system for
things unforeseen, hardware failures included. You bet space crafts
have those on board.

Groetjes Albert
 

Phlip

Steve Holden said:
What's the exact reason for requiring that a creator argument be of a
specific type? So operations on the instances don't go wrong? Well, why
not just admit that we don't have control over everything, and just *let
things go wrong* when the wrong type is passed?

Because some interfaces are "public", meaning visible within their own little
arena of modules, and some interfaces are "published", meaning visible to other
packages. Encapsulation is hierarchical - this is just a "small things go inside
big things" situation. The higher an interface's level, the more "published" it is.

Published interfaces should fail as early and clearly as possible if they are
going to fail. The obvious example here is an interface that fails and prints a
message telling a newbie how to call the package the right way. But that newbie
could be you!

(And don't forget the wall-to-wall unit tests, too;)
 

Gabriel Genellina

On Mon, 04 Jan 2010 15:17:04 -0300, Albert van der Horst wrote:
This triggers a question: I can see the traceback, but it
would be much more valuable, if I could see the arguments
passed to the functions. Is there a tool?

Yes, the cgitb module [1]. Despite its name it's a general purpose module.
This code:

<code>
import cgitb
cgitb.enable(format="text")

spam = []

def a(x, y):
    """This is function a"""
    z = x+y
    return b(z)


def b(z, n=3):
    """This is function b"""
    return c(foo=z*n)


def c(foo=0, bar=1):
    """This is function c"""
    baz = foo+bar
    spam.somenamethatdoesnotexist(foo+bar)


a(10, 20)
</code>

generates this error message:

A problem occurred in a Python script. Here is the sequence of
function calls leading up to the error, in the order they occurred.

d:\temp\test_traceback.py in <module>()
   19 baz = foo+bar
   20 spam.somenamethatdoesnotexist(foo+bar)
   21
   22
   23 a(10, 20)
a = <function a at 0x00BFEBB0>

d:\temp\test_traceback.py in a(x=10, y=20)
    7 """This is function a"""
    8 z = x+y
    9 return b(z)
   10
   11
global b = <function b at 0x00BFEBF0>
z = 30

d:\temp\test_traceback.py in b(z=30, n=3)
   12 def b(z, n=3):
   13 """This is function b"""
   14 return c(foo=z*n)
   15
   16
global c = <function c at 0x00BFEC30>
foo undefined
z = 30
n = 3

d:\temp\test_traceback.py in c(foo=90, bar=1)
   18 """This is function c"""
   19 baz = foo+bar
   20 spam.somenamethatdoesnotexist(foo+bar)
   21
   22
global spam = []
spam.somenamethatdoesnotexist undefined
foo = 90
bar = 1
<type 'exceptions.AttributeError'>: 'list' object has no attribute
'somenamethatdoesnotexist'
    __class__ = <type 'exceptions.AttributeError'>
    __dict__ = {}
    __doc__ = 'Attribute not found.'
    ... more exception attributes ...

The above is a description of an error in a Python program. Here is
the original traceback:

Traceback (most recent call last):
  File "d:\temp\test_traceback.py", line 23, in <module>
    a(10, 20)
  File "d:\temp\test_traceback.py", line 9, in a
    return b(z)
  File "d:\temp\test_traceback.py", line 14, in b
    return c(foo=z*n)
  File "d:\temp\test_traceback.py", line 20, in c
    spam.somenamethatdoesnotexist(foo+bar)
AttributeError: 'list' object has no attribute 'somenamethatdoesnotexist'

[1] http://docs.python.org/library/cgitb.html
 

Albert van der Horst

Chris Rebert said:


print(locals()) #this actually gives the bindings for the entire local namespace
#but is less work than writing something more precise by hand.

It is amazing how handy the humble print() function/statement is
when debugging.

It is also amazing how much output is generated in this way in a
typical Euler program (one that requires seconds, if not minutes, of
solid computer time on gigahertz machines). I have to routinely clean out
my 30 Gbyte home directory. Not so long ago generating so much debug
output wasn't an option!

What I hoped for is something along the line of what you can do
with a coredump in a classical debugger. Especially if recursion
runs dozens of levels deep, it is difficult to keep track.

But locals() is a nice function, thanks.

Groetjes Albert
 
