Thoughts on using isinstance

abcd

In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things? Is it a "heavy"
operation? For example, if I use this in each function to validate input,
will it slow things down a lot?

Just curious how you might handle this type of situation (other than
not validating at all).

Thanks
 
Maxim Sloyko

In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things? is it a "heavy"
operation? for example, if I use this in each function validate input
will it slow things down a lot?

just curious how you might handle this type of situation (other than
not validating at all).

thanks

My opinion is that validation is generally good. However, it should
not be too strict.
For example, instead of

    print "Hello " + name

you could have written

    print "Hello " + str(name)

In this case requiring isinstance() would be too strict. The only
thing you have to check is that hasattr(name, "__str__") and
callable(name.__str__).

That way you can have validation while still enjoying the full
flexibility of dynamic typing.
 
Steve Holden

abcd said:
In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things? is it a "heavy"
operation? for example, if I use this in each function validate input
will it slow things down a lot?

just curious how you might handle this type of situation (other than
not validating at all).

thanks
The "Python way" is to validate by performing the operations you need to
perform and catching any exceptions that result. In the case of your
example, you seem to be saying that you'd rather raise your own
exception (which, by the way, should really be a subclass of Exception,
but we will overlook that) than relying on the interpreter to raise a
ValueError or a TypeError. Is there really any advantage to this? You
increase your code size and add *something* to execution time with
little real purpose.

People coming to Python after C++ or some similar language that allows
or requires parameter type declarations often don't feel comfortable
taking this direction to start with, but it works well for most of us.

regards
Steve
 
Duncan Booth

abcd said:
In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things? is it a "heavy"
operation? for example, if I use this in each function validate input
will it slow things down a lot?

just curious how you might handle this type of situation (other than
not validating at all).
For a start, don't raise strings as exceptions: only use instances of
Exception.

Now consider what your first test does: it throws an error if you pass in
an empty string. Perhaps you do want to check for that, in which case you
will need to test for it and throw an appropriate exception.

The first test also catches values such as None, 0 or []. Do you really
want to throw a different exception for sayHello(0) and sayHello(1)? It
seems a bit pointless, so the first test should just check against an empty
string and not against other false objects which would get caught by the
second test.

Now for the second test. It would probably be useful to say in the
exception which type was involved, not just that it wasn't a string.
An appropriate exception for these would be something like:

TypeError: cannot concatenate 'str' and 'int' objects

since that tells you both the types and the operation that failed. Delete
that second test altogether and you'll get an appropriate exception instead
of a string which hides all the information.
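
As a rough illustration of that point, here is a small sketch in Python 3 syntax (the thread itself uses Python 2, and the exact wording of the message differs between versions). The name say_hello is only chosen for the example. With no type check at all, the interpreter still reports which type was involved:

```python
def say_hello(name):
    # No isinstance() check: the concatenation itself fails for non-strings.
    return "Hello " + name


try:
    say_hello(42)
except TypeError as exc:
    # The message names the offending type; its exact wording varies by version.
    error_message = str(exc)
    print("TypeError:", error_message)
```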

A good rule is if you want to hide exception information from the user do
it when displaying the exception not when raising it. That way you can get
at all the exception information available by changing one place in the
code instead of having to do it everywhere.
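
A minimal sketch of that rule, in Python 3 syntax and under the assumption that there is a single top-level caller: the function raises ordinary exceptions with full information, and only the hypothetical run() wrapper decides how much of it to show. Both run() and its debug flag are invented for this example.

```python
import traceback


def say_hello(name):
    if name == "":
        raise ValueError("name can't be blank")
    return "Hello " + name


def run(name, debug=False):
    """Call say_hello and decide here, in one place, how much detail to show."""
    try:
        return say_hello(name)
    except (TypeError, ValueError) as exc:
        if debug:
            # Full traceback text for developers.
            return "".join(
                traceback.format_exception(type(exc), exc, exc.__traceback__)
            )
        # Short, information-hiding message for end users.
        return "Sorry, that name could not be used."
```

Switching between the terse and the detailed output then means changing one place (here, the debug flag) rather than every raise statement.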

So your modified function should look like:

    def sayHello(name):
        if name == "":
            raise ValueError("name can't be blank")
        print "Hello " + name

(this is slightly different than your original in a few other ways: it will
accept unicode strings so long as they can be encoded in ascii, and it's a
function as there isn't much point having a method which doesn't use self.)
 
abcd

The "Python way" is to validate by performing the operations you need to
perform and catching any exceptions that result. In the case of your
example, you seem to be saying that you'd rather raise your own
exception (which, by the way, should really be a subclass of Exception,
but we will overlook that) than relying on the interpreter to raise a
ValueError or a TypeError. Is there really any advantage to this? You
increase your code size and add *something* to execution time with
little real purpose.

People coming to Python after C++ or some similar language that allows
or requires parameter type declarations often don't feel comfortable
taking this direction to start with, but it works well for most of us.

regards
Steve


So instead of validating input I should just try to use the input as
if it were correct, and let Python throw the errors?
 
Neil Cerutti

In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things? is it a
"heavy" operation? for example, if I use this in each function
validate input will it slow things down a lot?

just curious how you might handle this type of situation (other
than not validating at all).

Validation of parameters is an excellent idea, but *not*
validation of datatypes. The problem is that sayHello can
function properly with many more objects than just strings, if
you write it differently. The following version accepts any
iterable over strings.

    def sayHello(self, name):
        it = iter(name)
        print "Hello", ''.join(it)

It still lacks validation. But to validate a name you will need
to conceive a set of regular strings that contains every name
you'd like to accept. Names probably aren't worth validating,
although you might reasonably reject a few things, like the empty
string.
 
Bruno Desthuilliers

abcd wrote:
In my code I am debating whether or not to validate the types of data
being passed to my functions. For example

    def sayHello(self, name):
        if not name:
            rasie "name can't be null"
        if not isinstance(name, str):
            raise "name must be a string"
        print "Hello " + name

Is the use of isinstance a "bad" way of doing things?

Mostly, yes. Python is dynamically typed (well, it's dynamic all the
way...), and fighting against the language is a bad idea.

Also, since the use of an object of non-compatible type would usually
raise an exception (and while we're at it, avoid using strings as
exceptions - better to use some Exception class), you don't actually
gain anything.
just curious how you might handle this type of situation (other than
not validating at all).

There are mostly two cases:
1/ You're getting data from the outside world. Here, you have to be
*very* careful, and you usually need more than simple validation. The good
news is that we have modules like FormEncode designed to handle this case.

2/ You're getting 'data' from within your Python program. If you
correctly applied 1/, whatever comes in should be OK - that is, unless
you have a programmer error !-). But then, you'll usually have a nice
exception and traceback (or better, unit test failures), so you can fix
the problem immediately.

Now there are *a few* corner cases where it makes sense to check what has
been passed to a function - either because there are very strict and
stable requirements here, or because the function can accept different
kinds of objects, but needs to handle them in distinct ways.

MVHO is that the less code the better. As a matter of fact, trying to
'protect' your function, you introduced a syntax error that would not
have been there if you had just written the simplest thing:

    def say_hello(who):
        print "Hello", who

My 2 cents...
 
Steve Holden

abcd said:
So instead of validating input I should just try and use the input as
if it was correct, and let python throw the errors?
Yes. This approach is often referred to as EAFP (easier to ask
forgiveness than permission), as opposed to the LBYL (look before you
leap) approach that your original email inquired about.
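
To make the contrast concrete, here is a small sketch of the same lookup written both ways, in Python 3 syntax. The function names and the default parameter are made up for illustration; nothing here comes from the original poster's code.

```python
def lookup_lbyl(mapping, key, default=None):
    # Look before you leap: test first, then act.
    if key in mapping:
        return mapping[key]
    return default


def lookup_eafp(mapping, key, default=None):
    # Ask forgiveness: just try the operation and handle the failure.
    try:
        return mapping[key]
    except KeyError:
        return default
```

Both return the same results for an ordinary dict. The second version makes no assumptions about the object beyond "indexing it either works or raises KeyError".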

regards
Steve
 
abcd

Well my example function was simply taking a string and printing, but
most of my cases would be expecting a list, dictionary or some other
custom object. Still propose not to validate the type of data being
passed in?

Thanks.
 
Marc 'BlackJack' Rintsch

Well my example function was simply taking a string and printing, but
most of my cases would be expecting a list, dictionary or some other
custom object. Still propose not to validate the type of data being
passed in?

Yes because usually you don't expect a list or dictionary but some object
that *acts* like a list or dictionary. Or you even expect just some
aspects of the type's behavior. For example that it is something you can
iterate over.
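
For instance, a function like the hypothetical total_length below (Python 3 syntax, name invented for this sketch) never asks for a list. It only needs something it can iterate over whose elements support len():

```python
def total_length(items):
    # Only requires that `items` is iterable and each element has a length.
    return sum(len(item) for item in items)


print(total_length(["ab", "c"]))        # a list
print(total_length(("ab", "c")))        # a tuple
print(total_length({"ab": 1, "c": 2}))  # a dict (iterates over its keys)
```

An isinstance(items, list) check would reject the tuple, the dict and any generator, even though the function body works with all of them.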

Ciao,
Marc 'BlackJack' Rintsch
 
abcd

Yes because usually you don't expect a list or dictionary but some object
that *acts* like a list or dictionary. Or you even expect just some
aspects of the type's behavior. For example that it is something you can
iterate over.

Ciao,
Marc 'BlackJack' Rintsch

Good point. Is there a place that documents what methods/attrs I should
check for on an object? For example, if it's a list that I expect,
should I verify that the object passed in has a ??? function? etc.

thanks.
 
Diez B. Roggisch

abcd said:
good point. is there place that documents what methods/attrs I should
check for on an object? for example, if its a list that I expect I
should verify the object that is passed in has a ??? function? etc.

Don't check, try. Catch a possible exception, and continue with another type
assumption. The only thing one often checks for is basestring, as
basestring supports iteration but, more often than not, isn't supposed to be
iterated over.

Small example to gather all strings out of a tree of objects (untested):

    def foo(arg):
        # string case
        if isinstance(arg, basestring):
            return [arg]
        # dict-like
        try:
            res = []
            for value in arg.itervalues():
                res.extend(foo(value))
            return res
        except AttributeError:
            pass
        # generally iterables
        res = []
        for value in arg:
            res.extend(foo(value))
        return res
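
For readers on Python 3, where basestring and itervalues() no longer exist, a possible adaptation of the same idea might look like the sketch below. The name collect_strings is invented here, and the try block is deliberately narrowed to the attribute lookup so that an AttributeError raised deeper in the tree is not silently swallowed; that is a design choice of this sketch, not part of the original.

```python
def collect_strings(arg):
    # String case: strings are iterable, but here they are treated as leaves.
    if isinstance(arg, str):
        return [arg]
    # Dict-like: try to use values(); otherwise treat arg as a plain iterable.
    try:
        values = arg.values()
    except AttributeError:
        values = arg
    result = []
    for value in values:
        result.extend(collect_strings(value))
    return result


tree = {"a": ["x", ("y", "z")], "b": "w"}
print(collect_strings(tree))
```

Passing something that is neither a string, dict-like nor iterable (an int, say) raises TypeError from the for loop, which is the "try, don't check" behaviour described above.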

Diez
 
Gabriel Genellina

good point. is there place that documents what methods/attrs I should
check for on an object? for example, if its a list that I expect I
should verify the object that is passed in has a ??? function? etc.

Don't insist on checking! :)
Just try to use the object - you'll get an exception at first invalid usage.

For example, a lot of functions take a file parameter to write
output to. Usually the *only* method called is write(). So any
object with a write() method (taking a single string argument) would
be fine; StringIO objects are an example. Checking whether the argument
is an instance of the file type would make that impossible.
Anyway, sometimes it's OK to check in advance - but please consider
checking the *behavior* you expect, not the exact instance type. In
the example above, you can validate that fileobject has a write
attribute: getattr(fileobject, "write"). But I'd only do that if I
have a good reason (perhaps if the file is used after some lengthy
calculation, and I want to be sure that I will be able to store the result).
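
A short sketch of the write() case, in Python 3 syntax. The function write_report is hypothetical; io.StringIO stands in for any object that happens to provide write():

```python
import io


def write_report(out, lines):
    # Only out.write() is used, so any object providing it will do.
    for line in lines:
        out.write(line + "\n")


buffer = io.StringIO()
write_report(buffer, ["first", "second"])
print(buffer.getvalue())
```

The same function works unchanged with a real file opened for writing, or with sys.stdout.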


--
Gabriel Genellina
Softlab SRL
 
Matthew Woodcraft

abcd said:
Well my example function was simply taking a string and printing, but
most of my cases would be expecting a list, dictionary or some other
custom object. Still propose not to validate the type of data being
passed in?


There are many people here who will indeed suggest that you're still
best off not validating.

There are various points to consider:

- Not adding the validation code saves a certain amount of effort.

- Not adding the validation code avoids one source of possible bugs.

- Not adding the validation code can make your code more readable, in
that there's that much less uninteresting code for your readers to
skip before they get to the meat.

- Adding the validation code can make your code more readable, in that
it can be clearer to the readers what kind of values are being
handled.

- If you validate, you can raise an exception from the start of your
function with a fairly explicit message. If you don't validate,
you're likely to end up with an exception whose message is something
like 'iteration over non-sequence', and it might be raised from some
function nested several levels deeper in.

The latter can be harder for the user of your function to debug (in
particular, it may not be easy to see that the problem was an invalid
parameter to your function rather than a bug in your function itself,
or corrupt data elsewhere in the system).

- If you don't validate, your function will accept anything that
behaves sufficiently like a list/dictionary/custom-object for its
purposes.

You may consider this an advantage or a disadvantage. To some extent
it depends on the circumstances in which the function is used: if
someone passes a not-quite-a-file (say) to a function expecting a
file, is it more likely that this is because of a subtle bug that
they'll be pleased to learn about early, or that they wanted the
function to 'do the obvious thing' with it?

- In particular, suppose your function expects a list and someone
passes a string when they should have passed a list containing only
that string. If you don't validate, the function is likely to process
the string the same way as it would process a list containing a
number of single-character strings.

This might well lead to your program apparently completing
successfully but giving the wrong result (which is usually the kind
of error you most want to avoid).

- If the function is going to be maintained together with its callers
(rather than being part of the public interface to a library, say),
then validation code is less likely to get in the way, because it
should be easy to relax the checks if that turns out to be
convenient.
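
The string-versus-list point above is easy to reproduce. The sketch below (Python 3 syntax, with function names invented for this example) shows the unvalidated version quietly returning the wrong result, and one narrowly targeted check that turns the mistake into an early error:

```python
def shout_all(names):
    # No validation: anything iterable is accepted.
    return [name.upper() for name in names]


def shout_all_checked(names):
    # One targeted check for the common mistake of passing a bare string.
    if isinstance(names, str):
        raise TypeError("expected a list of names, got a single string")
    return [name.upper() for name in names]


print(shout_all(["ann"]))  # the intended call
print(shout_all("ann"))    # runs without error, but treats each letter as a name
```

Note that the checked version still accepts tuples, generators and other iterables; it only rejects the one input that is known to be a likely mistake.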


-M-
 
Duncan Booth

Gabriel Genellina said:
In
the example above, you can validate that fileobject has a write
attribute: getattr(fileobject, "write"). But I'd only do that if I
have a good reason (perhaps if the file is used after some lengthy
calculation,and I want to be sure that I will be able to store the
result)

Or even just:

write = fileobject.write
data = ... lengthy calculation here ...
write(data)

There is no point using getattr when you know the name of the attribute in
advance.
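
Spelled out as a runnable sketch (Python 3 syntax; the function name and the sum of squares standing in for the lengthy calculation are assumptions made for this example):

```python
import io


def compute_and_store(fileobject, numbers):
    # Looking up the attribute first raises AttributeError immediately
    # if the object has no write method, before any work is done.
    write = fileobject.write
    data = str(sum(n * n for n in numbers))  # stand-in for a lengthy calculation
    write(data)


buffer = io.StringIO()
compute_and_store(buffer, [1, 2, 3])
print(buffer.getvalue())
```

Passing an object without a write attribute fails on the first line of the function, so no time is spent on a result that could not be stored.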
 
Bruno Desthuilliers

Matthew Woodcraft wrote:
There are many people here who will indeed suggest that you're still
best off not validating.

There are various points to consider:

- Not adding the validation code saves a certain amount of effort.
Yes

- Not adding the validation code avoids one source of possible bugs.
Yes

- Not adding the validation code can make your code more readable, in
that there's that much less uninteresting code for your readers to
skip before they get to the meat.
Yes

- Adding the validation code can make your code more readable, in that
it can be clearer to the readers what kind of values are being
handled.

This is better expressed in the docstring. And if it's in the docstring,
you can't be blamed for misuse.
- If you validate, you can raise an exception from the start of your
function with a fairly explicit message. If you don't validate,
you're likely to end up with an exception whose message is something
like 'iteration over non-sequence', and it might be raised from some
function nested several levels deeper in.

And what is the stack backtrace for, actually ?
The latter can be harder for the user of your function to debug (in
particular, it may not be easy to see that the problem was an invalid
parameter to your function rather than a bug in your function itself,
or corrupt data elsewhere in the system).

docstrings and unit-tests should make it clear.
- If you don't validate, your function will accept anything that
behaves sufficiently like a list/dictionary/custom-object for its
purposes.
Yes

You may consider this an advantage or a disadvantage. To some extent
it depends on the circumstances in which the function is used: if
someone passes a not-quite-a-file (say) to a function expecting a
file, is it more likely that this is because of a subtle bug that
they'll be pleased to learn about early, or that they wanted the
function to 'do the obvious thing' with it?

Python's POV on this is quite clear IMHO. Now if one wants to have to
declare everything three times and write layers and layers of adapters
and wrappers, well, one knows where to find Java !-)
- In particular, suppose your function expects a list and someone
passes a string when they should have passed a list containing only
that string. If you don't validate, the function is likely to process
the string the same way as it would process a list containing a
number of single-character strings.

Yes. This is a very common Python gotcha. And one that is usually quite
easy to spot and fix, even manually (let's not talk about unit-tests).
This might well lead to your program apparently completing
successfully but giving the wrong result (which is usually the kind
of error you most want to avoid).

Compared to what C or C++ can do to your system, this is still a pretty
minor bug - and probably one of the most likely to be detected very
early (did I talk about unit tests ?).
 
Bruno Desthuilliers

abcd wrote:
Well my example function was simply taking a string and printing, but
most of my cases would be expecting a list, dictionary or some other
custom object. Still propose not to validate the type of data being
passed in?

Yes - unless you have a *very* compelling reason to do otherwise.
 
Matthew Woodcraft

Bruno Desthuilliers said:
Matthew Woodcraft wrote:
This is better expressed in the docstring. And if it's in the
docstring, you can't be blamed for misuse.

I certainly agree that the description of the function's requirements on
its parameters is best placed in the docstring.

This is another place where the "don't validate, just try running the
code anyway" approach can cause problems: what should you put in the
docstring?

I don't think anyone would like to be fully explicit about the
requirements: you'd end up having to write things like "A string, or at
least anything that's iterable and hashable and whose elements are
single character strings, or at least objects which have an upper()
method which ...".

So in practice you end up writing "a string", and leave the rest of the
'contract' implicit. But that can lead to difficulties if people working
on the code have different ideas of what that implicit contract is -- is
it "a string, or anything else which works with the current
implementation", or perhaps "you may pass something other than a string
so long as you take responsibility for making it support all the
necessary operations, even if the implementation changes", or is there
some project-wide convention about how much like a string such things
have to be?

I think this kind of vagueness can work well within a lump of code which
is maintained as a piece, but it's good to divide up programs into
components with more carefully documented interfaces. And it's at that
level that I think doing explicit parameter validation can be helpful.

And what is the stack backtrace for, actually ?

I'm not sure that you intended that as a serious question, but I'll
answer it anyway.

In an ideal world, the stack backtrace is there to help me work with
code that I'm maintaining. It isn't there to help me grub around in the
source of someone else's code which is giving me an unhelpful error
message. Just as, in an ideal world, I should be able to determine how
to correctly use someone else's code by reading its documentation rather
than its source.

I think this is a 'quality of implementation' issue. When you start
using Python you pretty rapidly pick up the idea that a message like
'len() of unsized object' from (say) a standard library function
probably just means that you didn't pass the value you intended to; but
that doesn't mean it's a good error message. These things do add up to
make the daily business of programming less efficient.

docstrings and unit-tests should make it clear.

I don't see that either of those things remove the issues I described.

Now if one want to have to declare everything three times and write
layers and layers of adapters and wrappers, well, he knows where to
find Java !-)

Right. But using Python there is a position between 'writing layers and
layers of adapters and wrappers' and 'never validate anything': put
explicit checks in particular functions where they're likely to do most
good.

For example, it's often helpful to explicitly validate if you're going
to store the parameters away and do the actual work with them later on.
Consider what happens if you pass garbage to urllib2.install_opener():
you'll get an obscure error message later on from a urlopen() call,
which will be rather less convenient to investigate than an error from
install_opener() would have been.
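
A sketch of the store-now, use-later situation, in Python 3 syntax. The Dispatcher class is entirely hypothetical and is not how urllib2 is implemented; it only shows where an early check pays off:

```python
class Dispatcher:
    def __init__(self):
        self._handlers = []

    def register(self, handler):
        # Check now, because the handler is only called much later,
        # far from the code that passed in the bad value.
        if not callable(handler):
            raise TypeError(
                "handler must be callable, got %s" % type(handler).__name__
            )
        self._handlers.append(handler)

    def dispatch(self, event):
        return [handler(event) for handler in self._handlers]
```

Without the check in register(), a bad handler would only surface as "'str' object is not callable" inside dispatch(), with a traceback that no longer points at the faulty register() call.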

Compared to what C or C++ can do to your system, this is still a
pretty minor bug - and probably one of the most likely to be detected
very early

I disagree. What C or C++ will do, very often, is produce a segmentation
fault. That may well turn out to be hard to debug, but it's considerably
more likely to be detected early than a successful exit status with
incorrect output.

-M-
 
