Python Mystery Theatre -- Episode 2: Así Fue

Raymond Hettinger · Jul 14, 2003

Here are four more mini-mysteries for your amusement
and edification.

In this episode, the program output is not shown.
Your goal is to predict the output and, if anything
mysterious occurs, then explain what happened
(again, in blindingly obvious terms).

There's extra credit for giving a design insight as to
why things are as they are.

Try to solve these without looking at the other posts.
Let me know if you learned something new along the way.

To challenge those who thought the last episode
was too easy, I've included one undocumented wrinkle
known only to those who have read the code.

Enjoy,

Raymond Hettinger

ACT I -----------------------------------------------
print '*%*r*' % (10, 'guido')
print '*%.*f*' % ((42,) * 2)

ACT II -----------------------------------------------
s = '0100'
print int(s)
for b in (16, 10, 8, 2, 0, -909, -1000, None):
    print b, int(s, b)

ACT III ----------------------------------------------------
def once(x): return x
def twice(x): return 2*x
def thrice(x): return 3*x
funcs = [once, twice, thrice]

flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
flam = [lambda x:f(x) for f in funcs]

print flim[0](1), flim[1](1), flim[2](1)
print flam[0](1), flam[1](1), flam[2](1)

ACT IV ----------------------------------------------------
import os
os.environ['one'] = 'Now there are'
os.putenv('two', 'three')
print os.getenv('one'), os.getenv('two')

Jack Diederich · Jul 14, 2003

On Mon, Jul 14, 2003 at 05:42:13AM +0000, Raymond Hettinger wrote:
I didn't look at the code, excepting int() [which is a big exception]

ACT I -----------------------------------------------
print '*%*r*' % (10, 'guido')
print '*%.*f*' % ((42,) * 2)

I'll assume '%r' is __repr__ since '%s' is __str__
The star after a % means "placeholder for a number, get it from the arg list"
so my guess is:
*   'guido'* # padded to ten places, using spaces
*42.000000000000000000000000* # 42 decimal points

ACT II -----------------------------------------------
s = '0100'
print int(s)
for b in (16, 10, 8, 2, 0, -909, -1000, None):
    print b, int(s, b)

int(0100) == 64 # octal
I looked up int_new() in intobject.c because I had never used the optional
base parameter. The 'b' parameter is only legal for 2 <= base <= 36 (or 0),
but the magic constant[1] is -909, which is interpreted as base 10.

ACT III ----------------------------------------------------
def once(x): return x
def twice(x): return 2*x
def thrice(x): return 3*x
funcs = [once, twice, thrice]

flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
flam = [lambda x:f(x) for f in funcs]

print flim[0](1), flim[1](1), flim[2](1)
print flam[0](1), flam[1](1), flam[2](1)

funcs, flim, and flam all seem identical to me. all these should print
1 2 3

ACT IV ----------------------------------------------------
import os
os.environ['one'] = 'Now there are'
os.putenv('two', 'three')
print os.getenv('one'), os.getenv('two')

no idea, so I'll punt
Now there are three

-jack

[1] When optional arguments are omitted (in this case 'base') the C variable
where they are recorded is left unchanged. In this case that variable starts
at -909 so if it is passed in as -909 or omitted the code doesn't know.
But why can't base just default to 10 in the first place? If the result
is -909 (omitted or passed in as -909) we just do a base-10 conversion anyway.

intobject.c

    PyObject *x = NULL;
    int base = -909;
    static char *kwlist[] = {"x", "base", 0};

    if (type != &PyInt_Type)
        return int_subtype_new(type, args, kwds); /* Wimp out */
    if (!PyArg_ParseTupleAndKeywords(args, kwds, "|Oi:int", kwlist, &x, &base))
        return NULL;
    if (base == -909)
        return PyNumber_Int(x); /* This will do a base-10 conversion anyway !!! */
    if (PyString_Check(x))
        return PyInt_FromString(PyString_AS_STRING(x), NULL, base);

The only exception is that the first argument has to be a string if the
optional base argument is used. I'm sure there was a reason for this ... ?
int('99', 10) # legal
int(99, 10) # silly, but should this really be illegal?
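
A rough Python model of the control flow in the excerpt above may help show why the default cannot simply be 10. This is only a sketch written for Python 3: `int_new_model` and `_OMITTED` are hypothetical names, and the real C function does more than this.

```python
_OMITTED = -909  # same magic value as in the C excerpt; stands for "no base was passed"

def int_new_model(x, base=_OMITTED):
    """Hypothetical model of the int_new() excerpt, not the real implementation."""
    if base == _OMITTED:
        # Base omitted (or -909 passed explicitly): generic conversion,
        # which accepts numbers as well as strings.
        return int(x)
    if isinstance(x, str):
        # An explicit base is only accepted together with a string.
        return int(x, base)
    raise TypeError("can't convert non-string with explicit base")

print(int_new_model(99))            # generic path: 99
print(int_new_model('0100', 16))    # string path with an explicit base: 256
print(int_new_model('0100', -909))  # indistinguishable from omitting base: 100
```

If the default were 10 instead, this model could not tell int_new_model(99) apart from int_new_model(99, 10), and the second form is the one that gets rejected.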

Helmut Jarausch · Jul 14, 2003

Raymond said:
ACT III ----------------------------------------------------
def once(x): return x
def twice(x): return 2*x
def thrice(x): return 3*x
funcs = [once, twice, thrice]

flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
flam = [lambda x:f(x) for f in funcs]

print flim[0](1), flim[1](1), flim[2](1)
print flam[0](1), flam[1](1), flam[2](1)

OK, I believe I know why the last line
prints '3' three times: only a reference
to 'f' is stored within the lambda expression,
and it has the value 'thrice' when 'print'
is executed.
But how can I achieve something like an
evaluation of one indirection, so that
a reference to the function referenced by 'f'
is stored instead?

Thanks for any hint.
(This references-only model of Python is
a bit hard sometimes.)

--
Helmut Jarausch

Lehrstuhl fuer Numerische Mathematik
RWTH - Aachen University
D 52056 Aachen, Germany

Jack Diederich · Jul 14, 2003

Raymond said:

ACT III ----------------------------------------------------
def once(x): return x
def twice(x): return 2*x
def thrice(x): return 3*x
funcs = [once, twice, thrice]

flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
flam = [lambda x:f(x) for f in funcs]

print flim[0](1), flim[1](1), flim[2](1)
print flam[0](1), flam[1](1), flam[2](1)

OK, I believe I know why the last line
prints '3' three times: only a reference
to 'f' is stored within the lambda expression,
and it has the value 'thrice' when 'print'
is executed.
But how can I achieve something like an
evaluation of one indirection, so that
a reference to the function referenced by 'f'
is stored instead?

The problem is made up to make a point; just doing
flam = funcs
will do the right thing. If you really want to wrap the function in a list
comp, you could do

def wrap_func(func):
    return lambda x:func(x)

flam = [wrap_func(f) for (f) in funcs] # wrap during a list comp
flam = map(wrap_func, funcs) # the map() equivalent

more awkward versions of the above:

wrap_func = lambda func:lambda x:func(x)
flam = [(lambda func:lambda x:func(x))(f) for f in funcs]
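
A self-contained sketch of the wrapper approach next to the original list comprehension, written for Python 3 (the print calls and the names `late` and `wrapped` are additions for illustration):

```python
def once(x): return x
def twice(x): return 2 * x
def thrice(x): return 3 * x
funcs = [once, twice, thrice]

def wrap_func(func):
    # Each call creates a fresh scope, so every lambda closes over its own func.
    return lambda x: func(x)

late = [lambda x: f(x) for f in funcs]    # all three share the one loop variable f
wrapped = [wrap_func(f) for f in funcs]   # each lambda has a private func

print([g(1) for g in late])      # [3, 3, 3]
print([g(1) for g in wrapped])   # [1, 2, 3]
```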

-jack

Jason Trowbridge · Jul 14, 2003

I didn't look at docs or try out the code until after trying to solve
the problem. I'm using Python 2.2.1. I did not solve Act I or Act
III, and tried them out directly.

Act I
I didn't know that python formatting could do that! I've always
treated it like C's printf-style of statements, as that seems to be
what it's primarily based off. I've always used two string
substitutions to first replace the formatting parts, then actually
insert the real substitutions!

Eg: 0.500000
Now to find I can do: 0.500000
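
A minimal sketch of the two approaches being compared, written for Python 3. The variable names and the choice of six digits are assumptions made to match the output shown; they are not the poster's original snippets.

```python
value = 0.5
digits = 6

# Two substitutions: first build the format string, then apply it.
two_step_format = '%%.%df' % digits      # '%%' is a literal percent sign
two_step = two_step_format % value

# One substitution: '*' takes the precision from the argument tuple.
one_step = '%.*f' % (digits, value)

print(two_step_format)   # %.6f
print(two_step)          # 0.500000
print(one_step)          # 0.500000
```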

In some respects, moving from C/C++ to Python is a bit like moving
from Linux to Mac OS X. I use the basic screwdriver, since I know how
and where it is, and don't see the nifty cordless power screwdriver
placed nicely in the cabinet.

Luckily, I have browsed through the entire module index at least once,
so I don't miss the jackhammers and use a trowel instead.

Act II
Again, there's behavior here that I didn't expect. I first assumed
that the results would be:

print int('0100') -> 100 (Correct)
print 16, int('0100', 16) -> 16 256 (Correct)
print 10, int('0100', 10) -> 10 100 (Correct)
print 8, int('0100', 8) -> 8 64 (Correct)
print 2, int('0100', 2) -> 2 4 (Correct)
print 0, int('0100', 0) -> ?
print -909, int('0100', -909) -> ?
print -1000, int('0100', -1000) -> ?
print None, int('0100', None) -> None 100 (Wrong, TypeError occurs)

The interesting thing is when I tried it out:

Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: int() base must be >= 2 and <= 36

I am using Python 2.2.1. According to the doc for int(x, radix), the
part about the radix behaviour is as follows:

The radix parameter gives the base for the conversion and may be any
integer in the range [2, 36], or zero. If radix is zero, the proper
radix is guessed based on the contents of string; the interpretation
is the same as for integer literals.

That explains the 0 radix and the exception caused by the -1000 radix.
So why does it work with a radix of -909? I presume a bug (which
probably got fixed in later versions of Python). I'll have to see if
this behavior is present under Python 2.3b at home.

Act III
Ick. Lambda's. Skipping for now.

Act IV
I would guess that it would print:
'Now there are three'
Since the environmental variable one is 'Now there are', and the
environmental variable two is 'three'.

My bad. Upon running, I get:
'Now there are None'

Apparently, os.putenv() doesn't work like I thought.

Ah! os.putenv() updates the environment, but not the os.environ
dictionary. It looks like os.getenv() retrieves the environmental
variables from os.environ, and just assumes that it is up to date.
Since it defaults to None if the environmental variable doesn't exist
in os.environ[], that's what I get.

Hmm, so os.getenv() and os.putenv() are not symmetric. This isn't
mentioned in the Python 2.2.1 documentation, as os.getenv() is listed
as:

Return the value of the environment variable varname if it exists, or
value if it doesn't. value defaults to None.

This misleadingly suggests that it is getting the value from the actual
environment, not the os.environ[] variable. Nasty and subtle, too!
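
The asymmetry can be sketched as follows for Python 3, where os.getenv() still reads from os.environ. The variable names MYSTERY_ONE and MYSTERY_TWO and the helper `child_sees` are made up for this illustration.

```python
import os
import subprocess
import sys

os.environ['MYSTERY_ONE'] = 'Now there are'  # updates the mapping and the real environment
os.putenv('MYSTERY_TWO', 'three')            # updates the real environment only

# os.getenv() looks in os.environ, so the putenv() change is invisible here.
print(os.getenv('MYSTERY_ONE'), os.getenv('MYSTERY_TWO'))   # Now there are None

def child_sees(name):
    """Ask a child process, which inherits the real environment, for a variable."""
    code = 'import os, sys; sys.stdout.write(os.environ.get(%r, "<unset>"))' % name
    result = subprocess.run([sys.executable, '-c', code],
                            capture_output=True, text=True)
    return result.stdout
```

On typical platforms, calling child_sees('MYSTERY_TWO') should report the value set through os.putenv(), since the child builds its own os.environ from the environment it inherits.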

Act III

Ick. Lambda's. Oh well, here's a stab at it.

funcs is a list of functions.
flim is a list of unnamed functions that call the functions in funcs.
flam is a list comprehension of lambda's that call the functions in
funcs.

flim and flam should be functionally equivalent (2/3 pun intended).

The output should be:
1 2 3
1 2 3

Since they are just calling the functions listed in funcs (once,
twice, thrice).

Hmm, the output is really:
1 2 3
3 3 3

That's odd. Why is this the result here?

>>> print [ f.__name__ for f in funcs]
['once', 'twice', 'thrice']

So, f is updating correctly to the next value.

>>> for test in [ lambda x: f.__name__ for f in funcs]: print test(1), id(test)
...
thrice 135971068
thrice 136291772
thrice 135757396

Okay, so the lambda's being created are unique, yet are being mapped
to the third function.

>>> def fourth(x): return 4*x
...
>>> f
>>> f = fourth
>>> f
>>> flam[0](1)
4

Aha! So the lambda is looking up f in the current scope when it is
executed! Instead of binding to the actual function object being
iterated over in the list comprehension, the lambda is binding to the
variable 'f' itself?

Ick! Ick! Ick! Bad touch!

(Hey, these are fun!)

_ () () Jason Trowbridge | "... but his last footer says 'page
( ' .~. Generic Programmer | 3 of 2', which leads me to believe
\ = o = | something is wrong."
---"`-`-'"---+ (e-mail address removed) | --Scott Bucholtz

Fredrik Lundh · Jul 14, 2003

Helmut said:
OK, I believe I know why the last line
prints '3' three times: only a reference
to 'f' is stored within the lambda expression,
and it has the value 'thrice' when 'print'
is executed.

But how can I achieve something like an
evaluation of one indirection, so that
a reference to the function referenced by 'f'
is stored instead?

assuming you meant "the function referenced by 'f' when the lambda
is created", the easiest solution is to use default argument binding:

flam = [lambda x,f=f: f(x) for f in funcs]

the "f=f" construct will bind the inner name "f" to the current value of
the outer "f" for each lambda.

the nested scopes mechanism is often introduced as the "right way" to
do what was done with argument binding in earlier versions of Python.
however, nested scopes bind *names*, while argument binding binds
*values*.
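
a small sketch of the difference, written for Python 3 (the names `late` and `bound` are only for illustration):

```python
def once(x): return x
def twice(x): return 2 * x
def thrice(x): return 3 * x
funcs = [once, twice, thrice]

late = [lambda x: f(x) for f in funcs]         # looks up the name f when called
bound = [lambda x, f=f: f(x) for f in funcs]   # stores the value of f when created

print([g(1) for g in late])    # [3, 3, 3]
print([g(1) for g in bound])   # [1, 2, 3]

# The captured value is stored with the function's defaults.
print(bound[0].__defaults__[0] is once)   # True

# One cost of the idiom: a caller can override the "hidden" argument.
print(bound[0](1, thrice))     # 3
```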

</F>

Bernhard Herzog · Jul 15, 2003

Raymond Hettinger said:
[Jason Trowbridge]

Act I
I didn't know that python formatting could do that! I've always
treated it like C's printf-style of statements, as that seems to be
what it's primarily based off.

That's why this one was included.
Hope everyone learned something new.

C's printf can do this too. At least the one in the GNU libc can. Its
docs don't say anything about this being a GNU extension so I guess it
can be found in other libcs as well, though probably not in all.

Bernhard

Duncan Booth · Jul 15, 2003

Obviously Python allows references to references, since
e.g. 'once' (the 'name' of a function) is a reference to
the code and 'f' is a reference to that reference. (you call it
name binding)

There is no 'obviously' about it.

'once' is a name bound to the function.
'f' is another name bound to the same function.
There are no references to references here.

It is true that the function knows that its name is 'once', and indeed the
code object used by the function also has a name 'once', but:

def once(x): return x
f = once

Both 'f' and 'once' are names bound directly to the same function object.

+------+ +----------------+
| once |------------->| function object|
+------+ +----------------+
^
+------+ |
| f |----------------+
+------+

Assignment in Python simply makes a new binding to the existing object. It
doesn't matter what type the existing object was, it never makes a copy of
the object nor adds an extra level of indirection.
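
A short check of this, written for Python 3 (a sketch for illustration only):

```python
def once(x): return x

f = once
print(f is once)     # True: one function object, two names bound to it
print(f.__name__)    # once: the function remembers the name from its def

once = None          # rebinding one name...
print(f(5))          # 5: ...leaves the other binding untouched
```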

A similar situation arises in Maple and there one has the choice
to either dereference all references down to the real object
or to just dereference a single time.

Example

def once(x): return x
def twice(x): return 2*x
ref= once
def caller():
    callee=ref # (*)
    print callee(1)

caller() # prints 1
ref= twice
caller() # prints 2 so that demonstrates name binding

how can I get the current value (like 'xdef' in TeX)
of 'ref' in the assignment (*) above, so that
'callee' becomes an (immutable) reference to 'once' ?

You did get the current value of 'ref' so that the first time callee was
bound to the same function that 'once' and 'ref' were bound to, and the
second time the local variable 'callee' was bound to the same function that
'twice' and 'ref' were bound to at that time.

Each time you call 'caller' you get a new local variable, none of the
values are preserved from the previous call. If you want to preserve state,
save an attribute in a global, or better a class instance.
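
One way the class-instance suggestion might look, as a sketch for Python 3. The class name `Caller` is hypothetical and not from the thread.

```python
def once(x): return x
def twice(x): return 2 * x

class Caller:
    """Hypothetical helper: remembers the function that was current at creation."""

    def __init__(self, func):
        self.callee = func       # bound once, when the instance is created

    def __call__(self, x):
        return self.callee(x)

ref = once
caller = Caller(ref)   # captures the object that ref is bound to right now
ref = twice            # rebinding ref later does not affect caller
print(caller(1))       # 1
```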

Chris Reedy · Jul 15, 2003

Raymond said:
Here are four more mini-mysteries for your amusement
and edification.

In this episode, the program output is not shown.
Your goal is to predict the output and, if anything
mysterious occurs, then explain what happened
(again, in blindingly obvious terms).

There's extra credit for giving a design insight as to
why things are as they are.

Try to solve these without looking at the other posts.
Let me know if you learned something new along the way.

To challenge those who thought the last episode
was too easy, I've included one undocumented wrinkle
known only to those who have read the code.

I thought this one was much tougher than the Act 1. I ended up doing a
lot of research on this one. I haven't read the other answers yet, I've
been holding off until I finished this. (Having read my response, I
apologize for the length. I don't think I scored so well on "blindingly
obvious".) Here goes ...

ACT I -----------------------------------------------
print '*%*r*' % (10, 'guido')
print '*%.*f*' % ((42,) * 2)

This one wasn't hard. I've used this feature before. The stars at the
front and back tend to act as visual confusion. The stars in the middle
indicate an option to the format that is provided as a parameter. Thus
the first one prints the representation (%r) of the string 'guido' as a
ten character wide field. When I tried it, the only thing I missed was
that the representation of 'guido' is "'guido'" not "guido". So the
first one prints out:

*   'guido'*

rather than:

*     guido*

which would have been my first guess.

The second one takes just a little more thought. The result of this is
equivalent to:

print '*%.42f*' % 42

which yields

*42.000000000000000000000000000000000000000000*

That is a fixed point number with 42 digits after the decimal point.
(Yes, I did copy that from Idle rather than counting zeros.)
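
For anyone who would rather check than count, a small sketch for Python 3 (the names `first` and `second` are only for illustration):

```python
first = '*%*r*' % (10, 'guido')      # '*' takes the field width from the tuple
second = '*%.*f*' % ((42,) * 2)      # '*' takes the precision from the tuple

print(first)                             # *   'guido'*
print(second == '*%.42f*' % 42)          # True
print(len(second) - len('*42.*'))        # 42 digits after the decimal point
```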

Aside: I have to admit that the ((42,) * 2) did confuse me at first. I'm
so used to doing 2 * (42,) when I want to repeat a sequence that I
hadn't thought about the reversed form.

Having used this feature before, I have to say that I think the
documentation for how to do this is quite comprehensible.

ACT II -----------------------------------------------
s = '0100'
print int(s)
for b in (16, 10, 8, 2, 0, -909, -1000, None):
    print b, int(s, b)

Boy! This one sent me to the documentation, and finally to the code.

According to the documentation the legal values for the parameter b are
b = 0 or 2 <= b <= 36. So the first print yields 100 (the default base
for a string is 10 if not specified). The next few lines of output are:

16 256
10 100
8 64
2 4
0 64

The only one that deserves an additional comment is the last line.
According to the documentation, a base of 0 means that the number is
interpreted as if it appeared in program text, in this case, since the
string begins with a '0', it's interpreted as base 8.
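
For readers repeating the experiment on a current interpreter, here is a sketch for Python 3. The helper `try_int` is hypothetical. As far as I know, two things differ from the 2.2/2.3 behaviour discussed in this thread: base 0 follows Python 3 literal syntax, so '0100' is rejected and octal is written '0o100', and -909 is no longer special.

```python
def try_int(s, base):
    """Return int(s, base), or the name of the exception it raises."""
    try:
        return int(s, base)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__

for b in (16, 10, 8, 2, 0, -909, -1000, None):
    print(b, try_int('0100', b))

# Base 0 with a Python 3 style octal literal.
print(try_int('0o100', 0))   # 64
```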

Let's skip -909 for a moment. -1000 raises an exception. None would also
raise an exception if we ever got there. I also find that one a little
non-intuitive, more about that later.

For no immediately apparent reason (Raymond's undocumented wrinkle!),
the next line of the output (after the above) is:

-909 100

The only reason I found that was to try it. After hunting through the
code (Yes, I have no problem with C. No, I'm not familiar with the
organization of the Python source.) I eventually (see int_new in
intobject.c) found out that the int function (actually new for the int
type) looks like it was defined as:

def int(x, b=-909):
    ...

That is, the default value for b is -909. So, int('0100', -909) has the
same behavior as int('0100'). This explains the result.

Having read the code, I now understand _all_ about how this function
works. I understand why there is a default value. For example:

int(100L) yields 100, but there is no documented value for b such that
int(100L, b) yields anything except a TypeError. However, using b=-909
is the same as not specifying b. This allows me to write code like:

if type(x) is str:
    b = 16
else:
    b = -909
return int(x, b)

I'm not really sure whether that's better than, for example

if type(x) is str:
    return int(x, 16)
else:
    return int(x)

or not. However, I find the use of the constant -909 is definitely
"magic". If it was up to me, I would use a default value of b = None, so
that int(x) and int(x, None) are equivalent. It seems to me that that
could be documented and would not be subject to misinterpretation.

ACT III ----------------------------------------------------
def once(x): return x
def twice(x): return 2*x
def thrice(x): return 3*x
funcs = [once, twice, thrice]

flim = [lambda x:funcs[0](x), lambda x:funcs[1](x), lambda x:funcs[2](x)]
flam = [lambda x:f(x) for f in funcs]

print flim[0](1), flim[1](1), flim[2](1)
print flam[0](1), flam[1](1), flam[2](1)

This one was ugly. I guessed the right answer but then had to do some
more research to understand exactly what was going wrong.

The first line prints 1, 2, 3 just like you expect.

First reaction, the second line also prints 1, 2, 3. But, Raymond
wouldn't have asked the question if it was that easy. So, guessing that
something funny happens I guessed 3, 3, 3. I tried it. Good guessing.

Now why?

After a bunch of screwing around (including wondering about the details
of how the interpreter implements lambda expressions), at one point I
tried the following (in Idle):

for f in flam: print f(1)

And wondered why I got an exception for exceeding the maximum recursion
limit. What I finally realized was that the definition of flam
repeatedly binds the variable f to each of the functions in funcs. The
lambda expression defines a function that calls the function referenced
by f. At the end of the execution of that statement, f is thrice, so all
three of the defined lambdas call thrice. That also explains why I hit
the maximum recursion limit.
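
The recursion deserves one more step. In the Python of this thread the list comprehension's loop variable lives in the enclosing scope, so "for f in flam" rebinds the very f that the lambdas look up, and each lambda ends up calling itself. In Python 3 a comprehension has its own scope, so the sketch below uses explicit loops inside a function to get the same sharing; the function name `demo` is hypothetical.

```python
def once(x): return x
def twice(x): return 2 * x
def thrice(x): return 3 * x

def demo():
    flam = []
    for f in (once, twice, thrice):
        flam.append(lambda x: f(x))      # every lambda looks up the same variable f
    after_build = [g(1) for g in flam]   # f is thrice at this point

    outcomes = []
    for f in flam:                       # rebinds that same f to each lambda in turn
        try:
            outcomes.append(f(1))        # the lambda calls f, which is now itself
        except RecursionError:
            outcomes.append('RecursionError')
    return after_build, outcomes

print(demo())
# ([3, 3, 3], ['RecursionError', 'RecursionError', 'RecursionError'])
```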

At this point I felt like I had egg on my face. I've been burned by this
one in the past, and I spent a while figuring it out then. The fix is easy:

flam = [lambda x, fn=f: fn(x) for f in funcs]

which creates a new local binding which captures the correct value at
each iteration. This is the kind of problem which makes me wonder
whether we ought to re-think about binding of variables for loops.

ACT IV ----------------------------------------------------
import os
os.environ['one'] = 'Now there are'
os.putenv('two', 'three')
print os.getenv('one'), os.getenv('two')

Obviously, this one is trying to trick you into thinking it will print
'Now there are three'. I ended up trying it and getting 'Now there are
None'. Then I went back and read the documentation. What I got confused
about was that os.putenv updates the external environment without
changing the contents of os.environ. Updating os.environ will change the
external environment as a side effect. I had read about this before but
had gotten the two behaviors reversed in my head.

Now, why is it this way? It makes sense that you may have a use case for
changing the external environment without changing the contents of
os.environ and so need a mechanism for doing so. However, on reflection,
I'm not sure whether I think the implemented mechanism is
counter-intuitive or not.

Aahz · Jul 15, 2003

Aside: I have to admit that the ((42,) * 2) did confuse me at first. I'm
so used to doing 2 * (42,) when I want to repeat a sequence that I
hadn't thought about the reversed form.

My experience is that most people do it the way Ray did, <seq> * <reps>.

Chris Reedy · Jul 15, 2003

John said:
Robin Becker said:

[snip]
I know that it's in microsoft

the print width description contains

[snip]

It's pretty much bog-standard C. E.g. K&R2, appendix B, section 1.2
says:
[snip]

Just for reference purposes: This behavior is part of the C-standard.
See section 7.19.6.1, "The fprintf function", in ISO/IEC 9899:1999.

Chris Reedy · Jul 15, 2003

Aahz said:
My experience is that most people do it the way Ray did, <seq> * <reps>.

This must be my math background confusing me. Conventionally, a+a is
written as 2a, and a*a is written as a^2 (or a**2). Of course, if you
recognize that concatenation of sequences is really a multiplication (by
convention in mathematics addition is always a commutative operator),
a*2 makes sense. I guess I'll change the way I write this in the future.

Erik Max Francis · Jul 16, 2003

Aahz said:
My experience is that most people do it the way Ray did, <seq> *
<reps>.

I do it Ray's way, too. I'm not really sure why.
