Returning different types based on input parameters


George Sakkis

That's more of a general API design question, but I'd like to get an
idea of if and how things are different in a Python context. AFAIK it's
generally considered bad form (or worse) for functions/methods to
return values of different "type" depending on the number, type and/or
values of the passed parameters. I'm using "type" loosely in a
duck-typing sense, not necessarily as a concrete class and its
descendants, although I'm not sure whether even duck typing is endorsed
for return values (as opposed to input parameters).

For example, it is common for a function f(x) to expect x to be simply
iterable, without caring about its exact type. Is it OK, though, for f
to return a list for some types/values of x, a tuple for others, and a
generator for everything else (assuming it's documented), or should it
always return the most general type (an iterator, in this example)?

To take it further, what if f wants to return different types,
differing even in a duck-type sense? That's easier to illustrate in an
API-extension scenario. Say there is an existing function `solve(x)`
that returns `Result` instances. Later someone wants to extend `solve`
by allowing an extra optional parameter `foo`, making the signature
`solve(x, foo=None)`. As long as the return value remains backward
compatible, everything's fine. However, what if in the extended case
solve() has to return some *additional* information apart from
`Result`, say the confidence that the result is correct? In short,
the extended API would be:

    def solve(x, foo=None):
        '''
        @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
        '''

Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable, so that it should be replaced by a new function
`solve2(x, foo=None)` that always returns (`Result`, confidence)
tuples, or might it be a justifiable cost? Are there any other
API-extension approaches applicable to such situations?
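A minimal toy sketch of the extended signature under discussion makes the ambiguity concrete; the `Result` class and the body of `solve` here are hypothetical stand-ins:

```python
# Toy sketch: the return type of the extended signature depends on
# whether `foo` was passed. `Result` and the computation are stand-ins.

class Result:
    def __init__(self, value):
        self.value = value

def solve(x, foo=None):
    """Return a `Result`, or a (`Result`, confidence) pair if `foo` is given."""
    result = Result(x * 2)            # stand-in for the real computation
    if foo is None:
        return result
    return result, 0.9                # stand-in confidence score

r = solve(21)                         # a Result instance
pair = solve(21, foo="anything")      # a (Result, confidence) tuple
```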

George
 

MRAB

George said:
[...]
Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable, so that it should be replaced by a new function
`solve2(x, foo=None)` that always returns (`Result`, confidence)
tuples, or might it be a justifiable cost? Are there any other
API-extension approaches applicable to such situations?

I don't like the sound of this. :)

In your example I would possibly suggest returning a 'Result' object
and then later subclassing it to give a 'ConfidenceResult' which has
the additional 'confidence' attribute.
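That suggestion could be sketched like this, with hypothetical stand-in classes and bodies; the point is that every return value is still a Result, so old callers keep working:

```python
# Sketch of the subclassing approach: the extended call path returns a
# subclass, so the return type stays substitutable. Names are hypothetical.

class Result:
    def __init__(self, value):
        self.value = value

class ConfidenceResult(Result):
    def __init__(self, value, confidence):
        super().__init__(value)
        self.confidence = confidence

def solve(x, foo=None):
    if foo is None:
        return Result(x * 2)                      # stand-in computation
    return ConfidenceResult(x * 2, confidence=0.9)

r = solve(1, foo=True)
# isinstance(r, Result) still holds, and r.confidence is available
```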

I think the only time it's OK to return instances of different classes
is when one of them is None, for example in the re module, where
match() returns either a MatchObject (if successful) or None (if
unsuccessful). Apart from that, a function should always return an
instance of the same class (or perhaps a subclass) or, if a
collection, the same type of collection (e.g. always a list, and never
sometimes a list, sometimes a tuple).
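The re example looks like this in practice: callers branch explicitly on the None case.

```python
# The stdlib pattern mentioned above: re.match returns either a match
# object or None, and the caller checks which one it got.
import re

m = re.match(r"(\d+)", "42 apples")
if m is not None:
    number = m.group(1)   # the matched digits
else:
    number = None
```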
 

Steven D'Aprano

[...]

For example, it is common for a function f(x) to expect x to be simply
iterable, without caring about its exact type. Is it OK, though, for f
to return a list for some types/values of x, a tuple for others, and a
generator for everything else (assuming it's documented), or should it
always return the most general type (an iterator, in this example)?

Arguably, if the only promise you make is that f() returns an iterable,
then you could return any of list, tuple, etc. and still meet that
promise. I'd consider that acceptable but eccentric. However, I'd
consider it bad form to *not* warn that the actual type returned is an
implementation detail that may vary.

Alternatively, I'm very fond of what the built-in filter function does:
it tries to match the return type to the input type, so that if you pass
a string as input, it returns a string, and if you pass it a tuple, it
returns a tuple.
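That's Python 2's filter() behaviour (Python 3's filter() always returns an iterator). The input-type-matching idea can be sketched as:

```python
# Rough sketch of Python 2 filter()'s type-matching behaviour: mirror
# the input type for str and tuple, fall back to a list otherwise.

def filter_like(pred, seq):
    kept = [x for x in seq if pred(x)]
    if isinstance(seq, str):
        return "".join(kept)
    if isinstance(seq, tuple):
        return tuple(kept)
    return kept

filter_like(str.isdigit, "a1b2")   # "12"
filter_like(bool, (0, 1, 2))       # (1, 2)
filter_like(bool, [0, 1, 2])       # [1, 2]
```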

[...]

    def solve(x, foo=None):
        '''
        @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
        '''

Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable, so that it should be replaced by a new function
`solve2(x, foo=None)` that always returns (`Result`, confidence)
tuples, or might it be a justifiable cost? Are there any other
API-extension approaches applicable to such situations?

I dislike that, although I've been tempted to write functions like that
myself. Better, I think, to create a second function, xsolve(), which
takes a second argument, and to refactor the common parts of
solve/xsolve out into a third, private function so you avoid code
duplication.
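That refactoring could be sketched like this; the names and the computation are hypothetical stand-ins, and the point is that each public function has one stable return type:

```python
# Sketch of the two-function refactoring: one private helper, two public
# entry points with fixed return shapes. Bodies are stand-ins.

def _solve_impl(x, foo):
    result = x * 2            # stand-in for the real computation
    confidence = 0.9          # stand-in confidence score
    return result, confidence

def solve(x):
    """Always returns just the result."""
    result, _ = _solve_impl(x, None)
    return result

def xsolve(x, foo=None):
    """Always returns a (result, confidence) pair."""
    return _solve_impl(x, foo)
```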
 

George Sakkis

MRAB said:
In your example I would possibly suggest returning a 'Result' object and
then later subclassing to give 'ConfidenceResult' which has the
additional 'confidence' attribute.

That's indeed one option, but it's not very appealing if `Result`
happens to be a builtin (e.g. float or list). Technically you can
subclass builtins, but I think that, in this case at least, the cure
is worse than the disease.
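For illustration, subclassing float is technically possible but fragile, since ordinary arithmetic silently drops back to the base type (a hypothetical sketch):

```python
# Subclassing a builtin to carry extra data: it works, but any
# arithmetic returns a plain float, losing the extra attribute.

class ConfidenceFloat(float):
    def __new__(cls, value, confidence):
        self = super().__new__(cls, value)
        self.confidence = confidence
        return self

r = ConfidenceFloat(3.14, confidence=0.9)
r + 1   # a plain float again -- the confidence attribute is silently lost
```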

George
 

George Sakkis

andrew said:
George said:
[...]

You probably want to look up substitutability:
http://www.google.cl/search?q=substitutability+principle

Actually, this is better: http://www.google.cl/search?q=substitution+principle

The idea being that if the "contract" for your function is that it
returns a certain type, then any subclass should also be OK
(alternatively, that subclasses should be written so that they can be
returned when a caller was expecting the superclass).

I'm not sure Liskov substitution addresses the same problem. The
question here is: what's the scope of the contract? Does it apply to
the original signature only, or to any future extended version of it?
In the former case, the contract is still valid: whoever calls
solve(x) gets the promised type. The original contract didn't specify
what the result should be when the function is called as solve(x, y)
(since the function didn't support a second argument originally). Only
if one interprets the contract as applying to the current plus all
future extensions does Liskov substitution come into play.

Perhaps that's more obvious in statically typed languages that allow
overloading. IIRC, the presence of a method with the signature

    float foo(float x);

does not preclude its overloading (in the same or a descendant class)
with a method

    char* foo(float x, int y);

The two methods just happen to share the same name, but other than
that they are separate; their return values don't have to be
substitutable. Is this considered bad practice?
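Python has no counterpart to this kind of overloading: a second `def` with the same name simply rebinds it, as this sketch shows:

```python
# Python does not overload by signature: the second definition of foo
# replaces the first entirely.

def foo(x):
    return float(x)

def foo(x, y):            # this *rebinds* foo; the one-argument version is gone
    return str(x) * y

# foo(1.0) would now raise TypeError, since only the two-argument foo exists.
```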

George
 

Adam Olsen

For example, it is common for a function f(x) to expect x to be simply
iterable, without caring about its exact type. Is it OK, though, for f
to return a list for some types/values of x, a tuple for others, and a
generator for everything else (assuming it's documented), or should it
always return the most general type (an iterator, in this example)?

For list/tuple/iterable the correlation with the argument's type is
purely superficial, *because* they're so compatible. Why should only
tuples and lists get special behaviour? Why shouldn't every other
argument type return a list as well?

A counter example is python 3.0's str/bytes functions. They're
mutually incompatible and there's no default.

[...]
    def solve(x, foo=None):
        '''
        @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
        '''

Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable, so that it should be replaced by a new function
`solve2(x, foo=None)` that always returns (`Result`, confidence)
tuples, or might it be a justifiable cost? Are there any other
API-extension approaches applicable to such situations?

At a minimum it's highly undesirable. You lose a lot of
readability/maintainability. solve2/solve_ex is a little ugly, but it
costs less overall, so it's the better option.

If your tuple gets to 3 or more elements, I'd start wondering whether
you should return a single instance, with the return values as
attributes. If Result is already such a thing, I'd look even with a
tuple of 2 to see if that's appropriate.
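That could look like the following sketch (field names hypothetical): folding the extra value into the result object keeps the return type uniform.

```python
# Sketch: fold the confidence into the result object instead of growing
# a tuple. Callers that don't care about confidence simply ignore it.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Result:
    value: float
    confidence: Optional[float] = None   # None when foo wasn't given

def solve(x, foo=None):
    if foo is None:
        return Result(value=x * 2)                   # stand-in computation
    return Result(value=x * 2, confidence=0.9)       # stand-in confidence

# The return type is now always Result, regardless of how solve is called.
```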
 

George Sakkis

Adam said:
For list/tuple/iterable the correlation with the argument's type is
purely superficial, *because* they're so compatible. Why should only
tuples and lists get special behaviour? Why shouldn't every other
argument type return a list as well?

That's easy: because the result might be infinite. In which case you
may ask "why shouldn't every argument type return an iterator, then?",
and the answer is usually performance: if you already need to store
the whole result sequence (e.g. sorted()), why return just an iterator
to it and force the client to copy it into another list if they need
anything more than iterating once over it?

A counter example is python 3.0's str/bytes functions. They're
mutually incompatible and there's no default.

As already mentioned, another example is filter(), which tries to
match the input sequence type and falls back to list if it fails.

At a minimum it's highly undesirable. You lose a lot of
readability/maintainability. solve2/solve_ex is a little ugly, but it
costs less overall, so it's the better option.

That's my feeling too, at least in a dynamic language. For a static
language that allows overloading, it should be a smaller (or perhaps
no) issue.

George
 

Adam Olsen

That's easy: because the result might be infinite. In which case you
may ask "why shouldn't every argument type return an iterator, then?",
and the answer is usually performance: if you already need to store
the whole result sequence (e.g. sorted()), why return just an iterator
to it and force the client to copy it into another list if they need
anything more than iterating once over it?

You've got two different use cases here. sorted() clearly cannot be
infinite, so it might as well always return a list. Other functions
that can be infinite should always return an iterator.
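For example, a function whose output can be unbounded can only sensibly be exposed as an iterator; callers take what they need lazily:

```python
# An unbounded stream: materializing it as a list is impossible, so
# returning an iterator is the only reasonable contract.
import itertools

def naturals():
    """Yield 0, 1, 2, ... forever."""
    n = 0
    while True:
        yield n
        n += 1

first_five = list(itertools.islice(naturals(), 5))   # [0, 1, 2, 3, 4]
```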

As already mentioned, another example is filter() that tries to match
the input sequence type and falls back to list if it fails.

That's fixed in 3.0. It's always an iterator now.

That's my feeling too, at least in a dynamic language. For a static
language that allows overloading, that should be a smaller (or perhaps
no) issue.

Standard practices may encourage it in a static language, but it's
still fairly confusing. Personally, I consider Python's switch to a
different operator for floor division (//) to be a major step forward
over C-like languages.
 

Aahz

[...]
    def solve(x, foo=None):
        '''
        @rtype: `Result` if foo is None; (`Result`, confidence) otherwise.
        '''

Strictly speaking, the extension is backwards compatible; previous
code that used `solve(x)` will still get back `Result`s. The problem
is that in new code you can't tell what `solve(x, y)` returns unless
you know something about `y`. My question is: is this totally
unacceptable, so that it should be replaced by a new function
`solve2(x, foo=None)` that always returns (`Result`, confidence)
tuples, or might it be a justifiable cost? Are there any other
API-extension approaches applicable to such situations?

For this particular trick, I would always use a unique sentinel value so
that *only* passing an argument would change the result signature:

    sentinel = object()

    def solve(x, foo=sentinel):
        '''
        @rtype: `Result` if foo is sentinel; (`Result`, confidence) otherwise.
        '''

But I agree with other respondents that this is a code stink.
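Made runnable (with a hypothetical stand-in body), the sentinel trick behaves like this: only actually passing `foo`, even as foo=None, changes the result shape.

```python
# The sentinel pattern: a private object no caller could pass by accident,
# so "argument omitted" and "argument is None" are distinguishable.

_sentinel = object()

def solve(x, foo=_sentinel):
    result = x * 2                  # stand-in computation
    if foo is _sentinel:
        return result
    return result, 0.9              # stand-in (result, confidence) pair

solve(5)             # 10
solve(5, foo=None)   # (10, 0.9) -- foo=None counts as "passed"
```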
 
