Checking length of each argument - seems like I'm fighting Python


Brendan

There must be an easy way to do this:

For classes that contain very simple data tables, I like to do
something like this:

class Things(object):
    def __init__(self, x, y, z):
        #assert that x, y, and z have the same length

But I can't figure out a _simple_ way to check the arguments have the
same length, since len(scalar) throws an exception. The only ways
around this I've found so far are

a) Cast each to a numeric array, and check its dimension and shape.
This seems like far too many dependencies for a simple task:

def sLen(x):
    """determines the number of items in x.
    Returns 1 if x is a scalar. Returns 0 if x is None
    """
    xt = numeric.array(x)
    if xt == None:
        return 0
    elif xt.rank == 0:
        return 1
    else:
        return xt.shape[0]

b) use a separate 'Thing' object, and make the 'Things' initializer
work only with Thing objects. This seems like way too much structure
to me.

c) Don't bother checking the initializer, and wait until the problem
shows up later. Maybe this is the 'dynamic' way, but it seems a little
fragile.

Is there a simpler way to check that either all arguments are scalars,
or all are lists of the same length? Is this a poor way to structure
things? Your advice is appreciated.
Brendan
 

Mike Erickson

* Brendan wrote:
[...]
Is there a simpler way to check that either all arguments are scalars,
or all are lists of the same length? Is this a poor way to structure
things? Your advice is appreciated

Disclaimer: I am new to python, so this may be a bad solution.

import types

def __init__(self, x, y, z):
    isOK = False
    # all three are ints
    if type(x) == types.IntType and type(y) == types.IntType and type(z) == types.IntType:
        isOK = True
    # all three are lists of the same length
    if type(x) == types.ListType and type(y) == types.ListType and type(z) == types.ListType:
        if len(x) == len(y) and len(x) == len(z):
            isOK = True

HTH,

mike
 

Alex Martelli

Brendan said:
def sLen(x):
    """determines the number of items in x.
    Returns 1 if x is a scalar. Returns 0 if x is None
    """
    xt = numeric.array(x)
    if xt == None:
        return 0
    elif xt.rank == 0:
        return 1
    else:
        return xt.shape[0]

Simpler:

def sLen(x):
    if x is None: return 0
    try: return len(x)
    except TypeError: return 1

Depending on how you define a "scalar", of course; in most applications,
a string is to be treated as a scalar, yet it responds to len(...), so
you'd have to special-case it.
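
As a minimal sketch of that special case, assuming Python 3 (where `str` and `bytes` take the place of `basestring`), the string check can run before the `len()` attempt. The name `scalar_len` is made up for this illustration and does not appear in the thread:

```python
def scalar_len(x):
    """Return 0 for None, 1 for scalars (including strings), else len(x)."""
    if x is None:
        return 0
    # Assumption: strings count as scalars even though they support len().
    if isinstance(x, (str, bytes)):
        return 1
    try:
        return len(x)
    except TypeError:
        # No __len__, so treat the object as a single item.
        return 1
```

Whether strings belong on the scalar side is an application decision; dropping the `isinstance` test gives back the shorter version above.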


Alex
 

Sam Pointon

It depends what you mean by 'scalar'. If you mean in the Perlish sense
(ie, numbers, strings and references), then that's really the only way
to do it in Python - there's no such thing as 'scalar context' or
anything - a list is an object just as much as a number is.

So, assuming you want a Things object to break if either a) all three
arguments aren't sequences of the same length, or b) all three
arguments aren't a number (or string, or whatever), this should work:

#Not tested.
class Things(object):
    def __init__(self, x, y, z):
        try:
            if not (len(x) == len(y) and len(y) == len(z)):
                raise ValueError("Things don't match up.")
        except TypeError:
            #one of x, y or z doesn't do __len__
            #check if they're all ints, then.
            if not (isinstance(x, int) and isinstance(y, int) and
                    isinstance(z, int)):
                raise ValueError("Things don't match up")
        #do stuff.

Personally, I find nothing wrong with having a separate Thing object
that can do validation for itself, and I think it's a pleasantly
object-oriented solution.

I'm also wondering why something like this could accept something other
than sequences if it depends on their length. If you just want to treat
non-sequences as lists with one value, then something like this is more
appropriate, and might lead to less redundancy in other places:

def listify(obj):
    try:
        return list(obj)
    except TypeError:
        return [obj]

class Things(object):
    def __init__(self, *args):
        x, y, z = [listify(arg) for arg in args]
        if not (len(x) == len(y) and len(y) == len(z)):
            raise ValueError("Things don't match up")
 

Alex Martelli

Sam Pointon said:
So, assuming you want a Things object to break if either a) all three
arguments aren't sequences of the same length, or b) all three
arguments aren't a number (or string, or whatever), this should work:

#Not tested.
class Things(object):
    def __init__(self, x, y, z):
        try:
            if not (len(x) == len(y) and len(y) == len(z)):
                raise ValueError("Things don't match up.")

Careful though: this does NOT treat a string as a scalar, as per your
parenthetical note, but rather as a sequence, since it does respond
correctly to len(...). You may need to special-case with checks on
something like isinstance(x, basestring) if you want to treat strings as
scalars.


Alex
 

Michael Spencer

Brendan wrote:
....
class Things(object):
    def __init__(self, x, y, z):
        #assert that x, y, and z have the same length

But I can't figure out a _simple_ way to check the arguments have the
same length, since len(scalar) throws an exception. The only ways
around this I've found so far are
....

b) use a separate 'Thing' object, and make the 'Things' initializer
work only with Thing objects. This seems like way too much structure
to me.

Yes, but depending on what you want to do with Things, it might indeed make
sense to convert its arguments to a common sequence type, say a list. safelist
is barely more complex than sLen, and may simplify downstream steps.

def safelist(obj):
    """Construct a list from any object."""
    if obj is None:
        return []
    if isinstance(obj, (basestring, int)):
        return [obj]
    if isinstance(obj, list):
        return obj
    try:
        return list(obj)
    except TypeError:
        return [obj]

class Things(object):
    def __init__(self, *args):
        self.args = map(safelist, args)
        assert len(set(len(obj) for obj in self.args)) == 1
    def __repr__(self):
        return "Things%s" % self.args

>>> Things(0,1,2)
Things[[0], [1], [2]]
>>> Things(range(2),xrange(2),(0,1))
Things[[0, 1], [0, 1], [0, 1]]
>>> Things(None, 0,1)
Traceback (most recent call last):
  File "<input>", line 1, in ?
  File "C:\Documents and Settings\Michael\My Documents\PyDev\Junk\safelist.py", line 32, in __init__
    assert len(set(len(obj) for obj in self.args)) == 1
AssertionError


Michael
 

Mike Meyer

Brendan said:
There must be an easy way to do this:

Not necessarily.

For classes that contain very simple data tables, I like to do
something like this:

class Things(object):
    def __init__(self, x, y, z):
        #assert that x, y, and z have the same length

But I can't figure out a _simple_ way to check the arguments have the
same length, since len(scalar) throws an exception. The only ways
around this I've found so far are

You've gotten a number of suggestions about how to type check
things. To me, the need to type check things indicates that your API
has problems. In particular, when you're doing it on arguments, it's
indicative of trying to do the C++ thing of dispatching based on the
types of the arguments. Given duck typing, this is nearly impossible
to do correctly.

Is there a simpler way to check that either all arguments are scalars,
or all are lists of the same length? Is this a poor way to structure
things? Your advice is appreciated

I'd say you're better off providing two different interfaces to
Thing. The hard-core OO way would look like:

class Thing(object):
    def use_lists(self, x, y, z):
        assert len(x) == len(y) == len(z)
        self.x, self.y, self.z = x, y, z
    def use_scalars(self, x, y, z):
        self.x, self.y, self.z = x, y, z

This allows things that type-checking can't do. For instance, the user
can treat strings as either a scalar or sequence, their choice. If you
do the type-checking, you'll have to choose one case and stick with
it.

Also, I suspect that you can actually change use_scalars to something like

    self.x, self.y, self.z = [x], [y], [z]

and then the rest of your code will get simpler because it only has to
deal with lists.
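
One way to sketch the two-interface idea in a single class is with alternate constructors. The `from_lists` and `from_scalars` names below are hypothetical, and wrapping the scalars in one-element lists follows the suggestion above so that downstream code only ever sees lists:

```python
class Thing(object):
    """Holds three equal-length columns: x, y and z."""

    def __init__(self, x, y, z):
        # Explicit check rather than assert, so a mismatch always raises.
        if not (len(x) == len(y) == len(z)):
            raise ValueError("x, y and z must have the same length")
        self.x, self.y, self.z = list(x), list(y), list(z)

    @classmethod
    def from_lists(cls, x, y, z):
        # Caller says these are sequences; a string is split into characters.
        return cls(x, y, z)

    @classmethod
    def from_scalars(cls, x, y, z):
        # Caller says these are single values; a string stays whole.
        return cls([x], [y], [z])
```

With this shape, `Thing.from_scalars("ab", 1, 2.0)` keeps `"ab"` as one value, while `Thing.from_lists("ab", [1, 2], (3, 4))` treats it as a two-item sequence. The caller makes the choice, not a type check.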

You might also make use of optional arguments, and check that you get passed
a proper subset:

class Thing(object):
    def __init__(self, x = None, y = None, z = None, scalars = None):
        if scalars:
            if x is not None or y is not None or z is not None:
                raise ValueError("If present, scalars must be the only argument.")
            self.x, self.y, self.z = scalars
        elif not (x is not None and y is not None and z is not None):
            raise ValueError("You must specify all of x, y and z")
        else:
            assert len(x) == len(y) == len(z)
            self.x, self.y, self.z = x, y, z

etc.

<mike
 

Bengt Richter

There must be an easy way to do this:

For classes that contain very simple data tables, I like to do
something like this:

class Things(object):
    def __init__(self, x, y, z):
        #assert that x, y, and z have the same length

But I can't figure out a _simple_ way to check the arguments have the
same length, since len(scalar) throws an exception. The only ways
around this I've found so far are

a) Cast each to a numeric array, and check its dimension and shape.
This seems like far too many dependencies for a simple task:

def sLen(x):
    """determines the number of items in x.
    Returns 1 if x is a scalar. Returns 0 if x is None
    """
    xt = numeric.array(x)
    if xt == None:
        return 0
    elif xt.rank == 0:
        return 1
    else:
        return xt.shape[0]

b) use a separate 'Thing' object, and make the 'Things' initializer
work only with Thing objects. This seems like way too much structure
to me.

c) Don't bother checking the initializer, and wait until the problem
shows up later. Maybe this is the 'dynamic' way, but it seems a little
fragile.

Is there a simpler way to check that either all arguments are scalars,
or all are lists of the same length? Is this a poor way to structure
things? Your advice is appreciated

I'd go with c) unless you think errors that might result could be too mysterious
to diagnose or some dangerous action could result, but usually the errors won't be
very mysterious. If some dangerous action could result, you might well want to
impose a number of constraints, starting with checking input args, but you will
also want thorough unit tests.

Note that an assert statement gets eliminated from the generated code when optimization
is requested with the -O command line option, so you might want to write out the test and
exception raising explicitly to make sure it remains part of the code, if that is
what you want.
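
A minimal sketch of the explicit version, assuming a helper named `require_same_length` (a hypothetical name, not from the thread). Unlike `assert`, the `if`/`raise` pair still runs when Python is started with -O:

```python
def require_same_length(*seqs):
    """Raise ValueError unless every argument has the same len()."""
    lengths = [len(s) for s in seqs]
    # An explicit raise is kept under -O, whereas an assert would be removed.
    if len(set(lengths)) > 1:
        raise ValueError("length mismatch: %r" % (lengths,))
```

Calling it first thing in `__init__` keeps the check in one place; note that it does not handle scalars, which raise `TypeError` from `len()`.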

You could also define an external function to check that args conform:

def __init__(self, x, y, z):
    argcheck(x, y, z) # raise exception if non-conforming

where

>>> def argcheck(*args):
...     assert len(set([isinstance(x, scalartype) and 'S' or
...                     hasattr(x, '__len__') and len(x) for x in args]))==1
...
>>> argcheck(1, 2L, 3.0)
>>> argcheck([1,2], [3,4], [4,5])
>>> argcheck([1,2], [3,4], [4,5], 6)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in argcheck
AssertionError
>>> argcheck([1,2], [3,4], [4,5,6])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "<stdin>", line 2, in argcheck
AssertionError

You might want to add a check on the elements of arrays, e.g. to make sure
they are all scalartypes or all complex or whatever (and so as not to
accept str elements ;-).

Note that an assert statement disappears if optimization is called for on the python
command line with -O, so you might want to code your own test of the condition and
raise an exception explicitly if appropriate.

If the assert is enough, you can of course put it directly in the __init__, e.g.,

def __init__(self, x, y, z):
    assert len(set([isinstance(arg, scalartype) and 'S' or
                    hasattr(arg, '__len__') and len(arg) for arg in (x,y,z)]))==1
    ...

BTW, I'm using len(set(list_of_stuff_that_should_all_be_equal))==1 to test that they are equal,
since if not, there would be more than one element in the set. So there should either be
a bunch of 'S's in the list or a bunch of lengths that are all equal, so a mix wouldn't give
one element either. But this is a lot of calling for a simple check that could be written to
short-cut lots faster for the specific case of x,y,z args, e.g., (untested)

ii = isinstance; st = scalartype; ha = hasattr; L = '__len__' # save me typing ;-)
def __init__(self, x, y, z):
    assert ii(x,st) and ii(y,st) and ii(z,st) or ha(x,L) and ha(y,L) and ha(z,L) and len(x)==len(y)==len(z)
    ...
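
For reference, here is a sketch of the same set-based test in Python 3, with two assumptions stated up front: `SCALAR_TYPES` is a guess at what `scalartype` held (the thread never shows its definition), and the function raises `ValueError` instead of asserting. The `and`/`or` chain is spelled out as a hypothetical helper, `_shape_tag`:

```python
# Assumption: the thread's scalartype is not shown; int and float are a guess.
SCALAR_TYPES = (int, float)

def _shape_tag(arg):
    """Return 'S' for scalars, ('len', n) for sized objects, else None."""
    if isinstance(arg, SCALAR_TYPES):
        return "S"
    if hasattr(arg, "__len__"):
        return ("len", len(arg))
    return None

def argcheck(*args):
    """Raise ValueError unless args are all scalars or all the same length."""
    tags = set(_shape_tag(a) for a in args)
    # One distinct tag means the arguments agree; None means unsupported input.
    if len(tags) != 1 or None in tags:
        raise ValueError("arguments do not conform: %r" % (args,))
```

Tagging lengths as `('len', n)` rather than a bare number keeps an empty list (length 0) from comparing equal to `False`, which the one-line `and`/`or` form would allow.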

If you wanted to check that all the elements of a passed vector x were scalars, you could
write (untested)

    assert sum(isinstance(_, scalartype) for _ in x)==len(x)

since True==1 as a subtype of integer.

>>> sum(isinstance(_, scalartype) for _ in [1, 2.0, 3L])
3
>>> sum(isinstance(_, scalartype) for _ in [1, 2.0, 3j])
2
>>> sum(isinstance(_, scalartype) for _ in [1, 2.0, []])
2
so the sum compared == len(thething) should work.
Of course, your next step in using vectors might give you the check for free,
so no need for redundant checking. It's generally faster to let your args try to quack
like the ducks you need, if possible and safe.
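
The element check above can also be written with the built-in `all()`, which stops at the first non-scalar instead of summing every result. A small sketch, again assuming `SCALAR_TYPES` stands in for the thread's undefined `scalartype`, with `all_scalars` as a made-up name:

```python
# Assumption: stand-in for the thread's scalartype, whose definition is not shown.
SCALAR_TYPES = (int, float)

def all_scalars(vector):
    """Return True if every element of vector is an instance of SCALAR_TYPES."""
    return all(isinstance(item, SCALAR_TYPES) for item in vector)
```

An empty vector passes this check, since `all()` of nothing is true; whether that is acceptable depends on the application.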


Regards,
Bengt Richter
 

Brendan

Thank you all for your help. Alex's listify does the job well. I will
reconsider using an atomic "Thing" class with Michael's safelist.
Bengt wins the prize for reducing sLen to one line!

I still feel like I'm working against the grain somewhat (Mike's
right, I am coming at this with a "C++ mindset"), but at least I have
tools to do it efficiently :)

Brendan
 
