Sentinel values for special cases

Ben Finney

Howdy all,

Ned Batchelder blogged[0] about a debate over checking function
parameters, and to what extent dynamic typing should be relied upon.

I was one of many who commented, but I wrote what purports to be a
comprehensive checklist when considering special-case inputs for
functions. I thought the best way to puncture my ego would be to post
that list here and see how well it survives :) Here goes:


If you have a special case, there are a few steps.

First, do you really need a special case, or are you just being
paranoid about type safety? Let the caller take care of whether they
mean what they say, and make your function do *one* job clearly and
simply. (This supports the 'insert_ids(get_global_ids())' idea
earlier.)

Second, do you *really* need a special case, or are you making your
function too complex? Be very suspicious of functions that are written
to do two different things depending on their input, and split them so
that both are simple and the caller can be explicit about what they
want. This doesn't preclude factoring out the code that's common to
both of them, of course.

Third, if you actually need a special case, can it be None? This is
the idiomatic Python "sentinel value", and it looks like the code
posted by 'sri' [reproduced below].

    def insert_ids(ids=None):
        if ids is None:
            ids = get_global_ids()

Note that if you're squeamish about using None,
but don't have a specific reason not to use it, use it; other
programmers will thank you for following convention.

Fourth, if you have decided that a magic sentinel value is called for
but None is already taken for some other purpose, don't use a
string. Use a unique do-nothing object, defined at the module level so
callers can easily get at it, like 'Dmitry Vasiliev' showed
[reproduced below].

    GLOBAL = object()

    def insert_ids(ids=GLOBAL):
        if ids is GLOBAL:
            ids = get_global_ids()

You won't accidentally use it, because it's defined only in one place
(you're comparing by 'is', remember) and it's not used for anything
except indicating the special case.
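One case where None is "already taken" is when None is itself a legitimate value. The following is a sketch only; `_MISSING`, `lookup`, and `settings` are hypothetical names invented for illustration, not anything from the blog discussion.

```python
_MISSING = object()  # hypothetical module-level sentinel

def lookup(mapping, key, default=_MISSING):
    """Return mapping[key]; fall back to default only if one was given."""
    try:
        return mapping[key]
    except KeyError:
        if default is _MISSING:
            raise  # the caller supplied no default, so let the error through
        return default

settings = {"timeout": None}  # here None is real data meaning "no timeout"

stored_value = lookup(settings, "timeout")      # the stored None
fallback = lookup(settings, "retries", None)    # None passed as an explicit default
```

Because the check uses 'is', passing None as the default is distinguishable from passing nothing at all.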

Fifth, there is no fifth. If you've come to the end and think it's too
complex, it probably is. Start at the top again.



[0]: <URL:http://www.nedbatchelder.com/blog/200610.html#e20061022T192641>
 
Paddy

Ben said:
Howdy all,

Ned Batchelder blogged[0] about a debate over checking function
parameters, and to what extent dynamic typing should be relied upon.

I was one of many who commented, but I wrote what purports to be a
comprehensive checklist when considering special-case inputs for
functions. I thought the best way to puncture my ego would be to post
that list here and see how well it survives :) Here goes:

[0]: <URL:http://www.nedbatchelder.com/blog/200610.html#e20061022T192641>

Referring to the blog entry, if I had a function that could take
several items of data, where data items were 'expensive' to gather, but
might not all be needed due to other data values, then I can see the
need for using sentinel values for some of the data items and only
computing their true value when necessary.
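A rough sketch of that situation, with every name (`NOT_FETCHED`, `fetch_b`, `fetch_c`, `process`, `calls`) made up for illustration: the expensive items default to a sentinel and are only fetched on the branch that actually needs them.

```python
NOT_FETCHED = object()  # hypothetical sentinel: "value not gathered yet"

calls = []  # records which expensive fetches actually ran

def fetch_b():
    """Stand-in for an expensive computation."""
    calls.append("b")
    return 42

def fetch_c():
    """Stand-in for another expensive computation."""
    calls.append("c")
    return 99

def process(a, b=NOT_FETCHED, c=NOT_FETCHED):
    """Use b only when a is positive; otherwise use c."""
    if a > 0:
        if b is NOT_FETCHED:
            b = fetch_b()
        return a + b
    if c is NOT_FETCHED:
        c = fetch_c()
    return a + c

result = process(1)
print(result, calls)
```

In this run only `fetch_b` is called, because the value of `a` makes `c` unnecessary.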

In Python, as others stated, the sentinel value of choice for all the
data values should be None, or some module-level global constant that is
guaranteed not to be part of the data.

None is what Python readers would expect as a sentinel value, but if
any of your data fields could have None as a valid value then you may
have to switch to a module level constant.
Be wary of using sentinel values which are strings if your data could
itself be a string: make sure the sentinel value is not valid data,
and always use the sentinel name and not its value from then on. It is
very wrong to do this sort of thing:


    NO_DATA = '::NO_DATA::'

    def xyz(a,b,c):
        if a == '::NO_DATA::':
            # blah blah blah


You should use the name NO_DATA for the comparison.

If you are having difficulty working out what to use as a sentinel
value for your data then you could declare a Sentinel class and use
that:


    class _Sentinel(object):
        'Initial data value when true data has not been fetched/computed yet'
        pass

    NO_DATA = _Sentinel

    def xyz(a,b,c):
        if a == NO_DATA:
            # go get a


(Hmm, should that be double underscores on _Sentinel ...).

- Paddy.
 
Ben Finney

Paddy said:
None is what Python readers would expect as a sentinel value, but if
any of your data fields could have None as a valid value then you may
have to switch to a module level constant.
Be wary of using sentinel values which are strings, if your data could
itself be a string - make sure the sentinel value is not valid data,
and always use the sentinel name and not its value from then on.

I don't think "be wary" is appropriate; I think a blanket "don't use
strings for sentinel values" is fine. In your example below:
it is very wrong to do this sort of thing:

    NO_DATA = '::NO_DATA::'

    def xyz(a,b,c):
        if a == '::NO_DATA::':
            # blah blah blah

You should not use a string for that constant at all. Its only purpose
is to be a sentinel value, so just make it a unique object instance:

    NO_DATA = object()

    def xyz(a,b,c):
        if a is NO_DATA:
            # blah blah blah

The sentinel value for the 'a' parameter to 'foo.xyz()' is then
'foo.NO_DATA', nothing else. We're now checking by 'is', not '==', so
there's no way to pass the sentinel value to the function unless it's
that specific object. It's impossible to get that particular value any
other way, so you avoid the possibility of accidentally getting the
sentinel as part of the data.
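To make the difference concrete, here is a small sketch; the helper names (`is_missing_string`, `is_missing_object`) and the `incoming` value are invented for illustration. A string that merely equals the string sentinel is mistaken for it, while nothing but the one object passes the 'is' test.

```python
STRING_SENTINEL = '::NO_DATA::'
OBJECT_SENTINEL = object()

def is_missing_string(a):
    """Equality check against a string sentinel."""
    return a == STRING_SENTINEL

def is_missing_object(a):
    """Identity check against a unique object."""
    return a is OBJECT_SENTINEL

# Real data that happens to spell out the same characters as the sentinel.
incoming = ''.join(['::NO_', 'DATA::'])

false_positive = is_missing_string(incoming)    # data mistaken for the sentinel
no_collision = is_missing_object(incoming)      # data can never be the object
real_sentinel = is_missing_object(OBJECT_SENTINEL)
print(false_positive, no_collision, real_sentinel)
```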

I don't see any advantage a string would have to make it worth using
over the above type of sentinel.
If you are having difficulty working out what to use as a sentinel
value for your data then you could declare a Sentinel class and use
that:

Just use the plain 'object' class, since the instance name will make
its meaning clear, and other instances don't need to share any
behaviour or semantics.
 
Aahz

Fourth, if you have decided that a magic sentinel value is called for
but None is already taken for some other purpose, don't use a
string. Use a unique do-nothing object, defined at the module level so
callers can easily get at it, like 'Dmitry Vasiliev' showed
[reproduced below].

    GLOBAL = object()

    def insert_ids(ids=GLOBAL):
        if ids is GLOBAL:
            ids = get_global_ids()

The one disadvantage of this approach is that it complicates pickling
if/when you store the sentinel in an instance. There are ways of
working around that, but none are pleasant.
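A short sketch of the problem, assuming the standard-library pickle module and a hypothetical `stored` dict: a round trip through pickle rebuilds the sentinel as a new, different object, so an 'is' test against the original no longer matches.

```python
import pickle

GLOBAL = object()

# Round-trip the bare sentinel.
restored = pickle.loads(pickle.dumps(GLOBAL))

# Round-trip a container that holds the sentinel (hypothetical data).
stored = {'chin': [14, 7, 9], 'bin': GLOBAL}
restored_dict = pickle.loads(pickle.dumps(stored))

print(restored is GLOBAL)
print(restored_dict['bin'] is GLOBAL)
```

Ordinary values such as the list under 'chin' survive the round trip as equal data, but the unpickled sentinel is just another anonymous object.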
 
Gabriel Genellina

The one disadvantage of this approach is that it complicates pickling
if/when you store the sentinel in an instance. There are ways of
working around that, but none are pleasant.

But why should you store the sentinel in an instance? Its only
purpose is to detect a special case in the parameter, when None is
not appropriate.
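Even if you were assigning instance attributes, you can use the
sentinel as a default class attribute (class attributes are not pickled).

    >>> class Example(object):    # header lost in the archive; 'Example' is a placeholder name
    ...     x = GLOBAL
    ...     def __init__(self, x=GLOBAL):
    ...         if x is not GLOBAL:
    ...             self.x = x
    ...
    >>> vars(Example())           # assumed expression; only the output survived
    {}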



--
Gabriel Genellina
Softlab SRL

 
Ben Finney

Ben Finney said:
Use a unique do-nothing object, defined at the module level so
callers can easily get at it [...]

    GLOBAL = object()

    def insert_ids(ids=GLOBAL):
        if ids is GLOBAL:
            ids = get_global_ids()

The one disadvantage of this approach is that it complicates
pickling if/when you store the sentinel in an instance. There are
ways of working around that, but none are pleasant.

Hmm, and any other kind of serialisation would be similarly affected I
suppose.

What's an alternative?
 
Ben Finney

Gabriel Genellina said:
But why should you store the sentinel in an instance? Its only
purpose is to detect a special case in the parameter, when None is
not appropriate.

You might be storing values that will later become parameters to the
function.

    ids_to_be_processed_later = {
        'chin': [14, 7, 9],
        'bin': foo.GLOBAL,
        'fin': [74, 98, 12],
    }
    serialise_stuff()
 
Gabriel Genellina

But why should you store the sentinel in an instance? Its only
purpose is to detect a special case in the parameter, when None is
not appropriate.

You might be storing values that will later become parameters to the
function.

    ids_to_be_processed_later = {
        'chin': [14, 7, 9],
        'bin': foo.GLOBAL,
        'fin': [74, 98, 12],
    }
    serialise_stuff()

Well... just don't do that! :)
Instead of a dict, use another class instance with a class attribute
(as in my earlier post).


--
Gabriel Genellina
Softlab SRL

 
