Python Gotcha's?


Steven D'Aprano

Greetings,

I'm going to give a "Python Gotcha's" talk at work. If you have an
interesting/common "Gotcha" (warts/dark corners ...) please share.

(Note that I went over http://wiki.python.org/moin/PythonWarts already).


In CPython, the GIL prevents threads from executing Python bytecode on
more than one core at a time, so CPU-bound code gains little or nothing
from using multiple threads.

Solution: use a GIL-less Python, like IronPython or Jython, or use
multiple processes instead of threads.
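
A minimal sketch of the multiple-processes workaround, assuming CPython 3 and a deliberately CPU-bound task. The names count_primes and count_primes_parallel are hypothetical, chosen only for illustration:

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count primes below limit by trial division (deliberately CPU-bound)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits):
    """Run count_primes for each limit in a separate worker process."""
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(count_primes_parallel([20000, 20000, 20000, 20000]))
```

Each worker process has its own interpreter and its own GIL, which is why this can use several cores where threads would not.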



exec() and execfile() are unintuitive if you supply separate dicts for
the globals and locals arguments.

http://bugs.python.org/issue1167300
http://bugs.python.org/issue14049

Note that both of these are flagged as WON'T FIX.

Solution: to emulate top-level code, pass the same dict as globals and
locals.
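
A small sketch of the surprise, assuming Python 3 (where exec is a function). With separate dicts, top-level assignments go into the locals dict, but a function defined in the executed code looks names up in the globals dict:

```python
source = """
x = 1
def f():
    return x
result = f()
"""

# Separate dicts: x is stored in the locals dict, but f() looks x up
# in the globals dict, so the call fails.
separate_globals, separate_locals = {}, {}
try:
    exec(source, separate_globals, separate_locals)
    outcome = "ok"
except NameError as err:
    outcome = "NameError: %s" % err

# One shared dict behaves like module-level code.
namespace = {}
exec(source, namespace, namespace)
print(outcome)
print(namespace["result"])
```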


max() and min() fail with a single argument:
max(2, 3) => 3
max(3) => raises TypeError ('int' object is not iterable)

Solution: don't do that. Or pass a list:
max([2, 3]) => 3
max([3]) => 3
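
The same behaviour as a runnable sketch. The default argument shown at the end assumes Python 3.4 or later:

```python
# A single non-iterable argument is treated as the iterable to scan.
try:
    max(3)
    single_arg = "ok"
except TypeError:
    single_arg = "TypeError"

largest = max([3])                    # a one-element iterable is fine
empty_result = max([], default=None)  # avoids ValueError on an empty iterable
print(single_arg, largest, empty_result)
```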


Splitting on None and splitting on a space are not identical:
"".split() => []
"".split(' ') => ['']

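The difference also shows up with leading, trailing, or repeated spaces. A short illustration:

```python
text = " a  b "

# No argument (or None): runs of whitespace collapse and the edges are trimmed.
by_whitespace = text.split()

# Explicit ' ': every single space is a delimiter, so empty strings appear.
by_space = text.split(" ")

print(by_whitespace)
print(by_space)
```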


JSON expects double-quote marks, not single:
v = json.loads("{'test':'test'}") fails
v = json.loads('{"test":"test"}') succeeds
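
A sketch of two ways around this, using only the standard library: let json.dumps() produce the text, or, if the input really is a Python literal rather than JSON, parse it with ast.literal_eval():

```python
import ast
import json

python_style = "{'test': 'test'}"

try:
    json.loads(python_style)
    parsed = "ok"
except ValueError:  # json.JSONDecodeError subclasses ValueError on Python 3.5+
    parsed = "ValueError"

encoded = json.dumps({"test": "test"})         # the encoder always emits double quotes
from_literal = ast.literal_eval(python_style)  # parses Python syntax, not JSON

print(parsed, encoded, from_literal)
```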




If you decorate a function with a decorator that returns a wrapper, by
default the docstring is lost.

@decorate
def spam(x, y):
    """blah blah blah blah"""

spam.__doc__ => None (the wrapper's docstring, not spam's)

Solution: make sure your decorator uses functools.wraps().
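
A minimal sketch comparing a decorator with and without functools.wraps(). The decorator names plain and preserving are hypothetical:

```python
import functools


def plain(func):
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


def preserving(func):
    @functools.wraps(func)  # copies __doc__, __name__ and friends onto wrapper
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@plain
def spam(x, y):
    """blah blah blah blah"""
    return x + y


@preserving
def eggs(x, y):
    """blah blah blah blah"""
    return x + y


print(spam.__name__, spam.__doc__)
print(eggs.__name__, eggs.__doc__)
```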
 

Steven D'Aprano

You mean JSON expects a string with valid JSON? Quelle surprise.

No. The surprise is that there exists a tool invented in the 21st century
that makes a distinction between strings quoted with " and those quoted
with '. Being used to a sensible language like Python, it boggled my
brain the first time I tried to write some JSON and naturally treated the
choice of quote mark as arbitrary. It especially boggled my brain when I
saw the pathetically useless error message generated:

py> json.loads("{'test':'test'}")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.2/json/__init__.py", line 307, in loads
return _default_decoder.decode(s)
File "/usr/local/lib/python3.2/json/decoder.py", line 351, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/local/lib/python3.2/json/decoder.py", line 367, in raw_decode
obj, end = self.scan_once(s, idx)
ValueError: Expecting property name: line 1 column 1 (char 1)

"Expecting property name"??? WTF???


The reason this is a Gotcha rather than a bug is because the JSON
standard specifies the behaviour (probably in order to be compatible with
Javascript). Hence, although the behaviour is mind-numbingly stupid, it
is deliberate and not a bug. Hence, a gotcha.
 

Roy Smith

You mean JSON expects a string with valid JSON? Quelle surprise.

No. The surprise is that there exists a tool invented in the 21st century
that makes a distinction between strings quoted with " and those quoted
with '. Being used to a sensible language like Python, it boggled my
brain the first time I tried to write some JSON and naturally treated the
choice of quote mark as arbitrary.

Your brain has a low boggle threshold.

There's absolutely no reason why JSON should follow Python syntax rules.
Making it support either kind of quotes would have complicated every
JSON library in the world, for no added value. Nobody should ever be
hand-writing JSON (just like nobody should ever be hand-writing XML).
Just use the supplied library calls and you'll never have to worry about
low-level minutiae like this again.

It especially boggled my brain when I
saw the pathetically useless error message generated:

py> json.loads("{'test':'test'}")
[...]
ValueError: Expecting property name: line 1 column 1 (char 1)

"Expecting property name"??? WTF???

One of the hardest things about writing parsers is generating helpful
error messages when things don't parse. But, it's only of value to do
that when you're parsing something you expect to be written by a human,
and thus a human has to puzzle out what they did wrong. Nobody expects
that a JSON parser will be parsing human-written input, so there's
little value to saying anything more than "parse error".

The reason this is a Gotcha rather than a bug is because the JSON
standard specifies the behaviour (probably in order to be compatible with
Javascript).

Well, considering that the JS in JSON stands for JavaScript...

Hence, although the behaviour is mind-numbingly stupid, it
is deliberate and not a bug. Hence, a gotcha.

But surely not a Python gotcha. If anything, it's a JSON gotcha. The
same is true with PHP's JSON library, and Ruby's, and Perl's, and
Scala's, and C++'s, and so on. It's a JSON issue, and a silly one to be
complaining about at that.
 

Steve Howell

No. The surprise is that there exists a tool invented in the 21st century
that makes a distinction between strings quoted with " and those quoted
with '. Being used to a sensible language like Python, it boggled my
brain the first time I tried to write some JSON and naturally treated the
choice of quote mark as arbitrary.

I've been bitten by this gotcha too. Maybe "boggled my brain" would
be a bit of hyperbole, but it did cause me minor pain, and brief but
frustrating pain is the whole point of "gotcha" presentations.

It especially boggled my brain when I
saw the pathetically useless error message generated:

[...]
ValueError: Expecting property name: line 1 column 1 (char 1)

"Expecting property name"??? WTF???

I agree with you that the error message is pretty puzzling. I can
understand the rationale of the parser authors not to go overboard
with diagnosing these errors correctly to users, since it would
complicate the parser code and possibly slow it down even for well
formed JSON. On the other hand, I think that parsers can distinguish
themselves by anticipating the most common gotchas and giving clear
messages.

The reason this is a Gotcha rather than a bug is because the JSON
standard specifies the behaviour (probably in order to be compatible with
Javascript). Hence, although the behaviour is mind-numbingly stupid, it
is deliberate and not a bug. Hence, a gotcha.

Yep.
 

Steve Howell

[...] Nobody expects
that a JSON parser will be parsing human-written input, [...]

Humans write JSON all the time. People use JSON as a configuration
language, and some people actually write JSON files by hand. A common
example would be writing package.json for an npm package.

Here are a couple of examples:

https://github.com/jashkenas/coffee-script/blob/master/package.json
https://github.com/github/hubot/blob/master/package.json

so there's little value to saying anything more than "parse error".

So, there's little value to say anything more than "parse
error"...except to help all those dumb humans that expect JSON to be
human-writable. ;)
 

Iain King

A common one used to be expecting .sort() to return, rather than mutate (as it does). Same with .reverse() - sorted and reversed have this covered, not sure how common a gotcha it is any more.


Iain
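
For reference, a short illustration of the in-place methods next to their copying counterparts:

```python
data = [3, 1, 2]
result = data.sort()                   # sorts in place and returns None

fresh = sorted([3, 1, 2])              # returns a new sorted list
backwards = list(reversed([1, 2, 3]))  # reversed() returns an iterator

print(result, data, fresh, backwards)
```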
 

Steve Howell

A common one used to be expecting .sort() to return, rather than mutate (as it does). Same with .reverse() - sorted and reversed have this covered, not sure how common a gotcha it is any more.

The sort()/sorted() variations are good to cover. To give another
example, folks who had been immersed in legacy versions of Python for
a long time might still be in the habit of hand-writing compare
functions. With newer versions of Python, it usually makes sense to
just use the "key" feature.
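
A sketch of the two styles side by side, assuming Python 3, where a hand-written compare function has to be adapted with functools.cmp_to_key(). The function compare_ignoring_case is a hypothetical example:

```python
import functools

words = ["banana", "Apple", "cherry"]


def compare_ignoring_case(a, b):
    """Old-style comparison: negative, zero, or positive."""
    a, b = a.lower(), b.lower()
    return (a > b) - (a < b)


# Legacy habit: a compare function, wrapped so that sorted() accepts it.
old_style = sorted(words, key=functools.cmp_to_key(compare_ignoring_case))

# Usually simpler: a key function computed once per element.
new_style = sorted(words, key=str.lower)

print(old_style)
print(new_style)
```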
 

John Gordon

Miki said:
Greetings,
I'm going to give a "Python Gotcha's" talk at work.
If you have an interesting/common "Gotcha" (warts/dark corners ...)
please share.

This is fairly pedestrian as gotchas go, but it has bitten me:

If you are working with data that is representable as either an integer
or a string, choose one and stick to it. Treating it as both/either will
eventually lead to grief.

Or, in other words: 1 != '1'
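
One place this tends to bite is dictionary keys. A small sketch, where normalize_id is a hypothetical helper that picks one representation at the boundary:

```python
counts = {}
counts[1] = "from an int"
counts["1"] = "from a string"  # a different key, not an overwrite


def normalize_id(value):
    """Hypothetical helper: always treat ids as integers."""
    return int(value)


normalized = {}
normalized[normalize_id(1)] = "first"
normalized[normalize_id("1")] = "second"  # same key this time

print(len(counts), len(normalized))
```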
 

Alain Ketterlin

[...]

The "local variable and scoping" is, imho, something to be really
careful about. Here is an example:

class A(object):
    def __init__(self):
        self.x = 0
    def r(self):
        return x  # forgot self

a = A()
x = 1
print(a.r())  # prints 1

I know there is "no remedy". It's just really tricky.

-- Alain.
 

Roy Smith

Grzegorz Staniak said:
I think these days it's not just "Python syntax", it's kinda something
that you can get accustomed to taking for granted. Realistically, how
much more complication could the support for either quote marks
introduce? I doubt anyone would even notice. And you don't have to
write JSON by hand for this gotcha to bite you, all it takes is to
start playing with generating JSON without the use of specialized
JSON libraries/functions. For testing, for fun, out of curiosity...

If you want to talk a protocol, read the protocol specs and follow them.
Don't just look at a few examples, guess about the rules, and then act
surprised when your guess turns out to be wrong.

If you don't want to take the trouble to read and understand the
protocol specs, use a library written by somebody who has already done
the hard work for you.
 

Roy Smith

John Gordon said:
In <7367295.815.1333578860181.JavaMail.geo-discussion-forums@ynpp8> Miki



This is fairly pedestrian as gotchas go, but it has bitten me:

If you are working with data that is representable as either an integer
or a string, choose one and stick to it. Treating it as both/either will
eventually lead to grief.

Or, in other words: 1 != '1'

Tell that to the PHP crowd :)
 

Jon Clements

Greetings,

I'm going to give a "Python Gotcha's" talk at work.
If you have an interesting/common "Gotcha" (warts/dark corners ...) please share.

(Note that I went over http://wiki.python.org/moin/PythonWarts already).

Thanks,

One I've had to debug...
print 'found it!'
# Nothing prints as bool(0) is False
print 'found it!'
found it!

Someone new who hasn't read the docs might try this, but then I guess it's not really a gotcha if they haven't bothered doing that.
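
The interactive lines that set up the example above did not survive in this copy of the post. A plausible reconstruction, assuming (this is a guess, not something the post confirms) that the case was str.find(), which returns 0 for a match at the start of the string and -1 for no match:

```python
text = "abcdef"  # assumed sample data

at_start = text.find("abc")  # 0: found at index 0, but bool(0) is False
missing = text.find("bob")   # -1: not found, but bool(-1) is True

# Membership testing says what was meant.
found_properly = "abc" in text
missing_properly = "bob" in text

print(at_start, missing, found_properly, missing_properly)
```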
 

Steven D'Aprano

No. The surprise is that there exists a tool invented in the 21st
century that makes a distinction between strings quoted with " and
those quoted with '. Being used to a sensible language like Python, it
boggled my brain the first time I tried to write some JSON and
naturally treated the choice of quote mark as arbitrary.

Your brain has a low boggle threshold.

There's absolutely no reason why JSON should follow Python syntax rules.

Except for the most important reason of all: Python's use of alternate
string delimiters is an excellent design, one which Javascript itself
follows.

http://www.javascripter.net/faq/quotesin.htm

I'm not the only one who has had trouble with JSON's poor design choice:

http://stackoverflow.com/a/4612914

For a 21st century programming language or data format to accept only one
type of quotation mark as string delimiter is rather like having a 21st
century automobile with a hand crank to start the engine instead of an
ignition. Even if there's a good reason for it (which I doubt), it's
still surprising.

Making it support either kind of quotes would have complicated every
JSON library in the world, for no added value.

Ooooh, such complication. I wish my software was that complicated.

The added value includes:

* semantic simplicity -- a string is a string, regardless of which
quotes are used for delimiters;

* reducing the number of escaped quotes needed;

* compatibility with Javascript;

* robustness.

As it stands, JSON fails to live up to the Robustness principle and
Postel's law:

Be liberal in what you accept, and conservative in what you send.


http://en.wikipedia.org/wiki/Robustness_principle

Nobody should ever be hand-writing JSON

So you say, but it is a fact that people do. And even if they don't
hand-write it, people do *read* it, and allowing both quotation marks
aids readability:

"\"Help me Obiwan,\" she said, \"You're my only hope!\""

Blah. You can cut the number of escapes needed to one:

'"Help me Obiwan," she said, "You\'re my only hope!"'
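
A quick check, using the standard json module, that the two spellings above denote the same string and that the JSON form is the one needing four escapes:

```python
import json

# The JSON text, held in a raw string so the backslashes are literal.
json_text = r'''"\"Help me Obiwan,\" she said, \"You're my only hope!\""'''

# The same string written as a single-quoted Python literal.
python_text = '"Help me Obiwan," she said, "You\'re my only hope!"'

decoded = json.loads(json_text)
print(decoded == python_text)
print(json_text.count("\\"))
```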

(just like nobody should ever be hand-writing XML).
Just use the supplied library calls and you'll never have to worry about
low-level minutiae like this again.

It especially boggled my brain when I saw the pathetically useless
error message generated:

py> json.loads("{'test':'test'}")
[...]
ValueError: Expecting property name: line 1 column 1 (char 1)

"Expecting property name"??? WTF???

One of the hardest things about writing parsers is generating helpful
error messages when things don't parse. But, it's only of value to do
that when you're parsing something you expect to be written by a human,

Or when debugging a faulty or buggy generator, or when dealing with
non-conforming or corrupted data. Essentially any time that you expect
the error message will be read by a human being. Which is always.

Error messages are for the human reader, always and without exception. If
you don't expect it to be read by a person, why bother with a message?
 

Steven D'Aprano

You mean JSON expects a string with valid JSON? Quelle surprise.

Actually, on further thought, and on reading the JSON RFC, I have decided
that this is a design bug and not merely a gotcha.

The relevant section of the RFC is this:


4. Parsers

A JSON parser transforms a JSON text into another representation. A
JSON parser MUST accept all texts that conform to the JSON grammar.
A JSON parser MAY accept non-JSON forms or extensions.


http://www.ietf.org/rfc/rfc4627.txt


So a valid parser is permitted to accept data which is not strictly JSON.
Given that both Javascript and Python (and I would argue, any sensible
modern language) allow both single and double quotation marks as
delimiters, the JSON parser should do the same. Failure to take advantage
of that is a design flaw.

Of course, the RFC goes on to say that a JSON generator MUST only
generate text which conforms to the JSON grammar. So a conforming
implementation would be perfectly entitled to accept, but not emit,
single-quote delimited strings.
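
A rough sketch of what "accept, but not emit" could look like. This is not how the standard json module behaves; lenient_loads is a hypothetical helper, and falling back to ast.literal_eval() is an assumption made for illustration (it accepts Python-style single-quoted strings, but not JSON's true, false, or null):

```python
import ast
import json


def lenient_loads(text):
    """Hypothetical loader: strict JSON first, Python-literal syntax second."""
    try:
        return json.loads(text)
    except ValueError:
        # Assumed fallback for illustration only.
        return ast.literal_eval(text)


accepted = lenient_loads("{'test': 'test'}")  # liberal in what it accepts
emitted = json.dumps(accepted)                # conservative in what it sends

print(accepted)
print(emitted)
```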
 

André Malo

* Steven D'Aprano said:
For a 21st century programming language or data format to accept only one
type of quotation mark as string delimiter is rather like having a 21st
century automobile with a hand crank to start the engine instead of an
ignition. Even if there's a good reason for it (which I doubt), it's
still surprising.

Here's a reason: KISS. Actually I've never understood the reason for
multiple equivalent quote characters. There are languages where these are
not equivalent, like perl, C or shell script. There it makes way more
sense.

(If a parser doesn't accept foreign syntax, that's reasonable enough for me,
too.)

nd
 

Roy Smith

Steven D'Aprano said:
I'm not the only one who has had trouble with JSON's poor design choice:

This is getting a bit off-topic. If you wish to argue that JSON is
designed poorly, you should do that in some appropriate JSON forum.
It's not a Python issue.

Now, if you wish to boggle your mind about something pythonic, how about
mutexes not being thread safe (http://bugs.python.org/issue1746071)?
 

Steven D'Aprano

Here's a reason: KISS.

KISS is a reason *for* allowing multiple string delimiters, not against
it. The simplicities that matter here are:

* the user doesn't need to memorise which delimiter is allowed, and
which is forbidden, which will be different from probably 50% of
the other languages he knows;

* the user can avoid the plague of escaping quotes inside strings
whenever he needs to embed the delimiter inside a string literal.

This is the 21st century, not 1960, and if the language designer is
worried about the trivially small extra effort of parsing ' as well as "
then he's almost certainly putting his efforts in the wrong place.
 

Steve Howell

KISS is a reason *for* allowing multiple string delimiters, not against
it. The simplicities that matter here are:

* the user doesn't need to memorise which delimiter is allowed, and
  which is forbidden, which will be different from probably 50% of
  the other languages he knows;

Exactly. One of the reasons that humans use computers in the first
place is that we have flawed memory with respect to details,
especially arbitrary ones. It's the job of the computer to make our
lives easier.

* the user can avoid the plague of escaping quotes inside strings
  whenever he needs to embed the delimiter inside a string literal.

Unlike JSON, JS itself allows '"' and "'", although its canonical
representation of the latter is '\''.

Having said that, I don't mind that JSON is strict; I just hate that
certain JSON parsers give cryptic messages on such an obvious gotcha.

This is the 21st century, not 1960, and if the language designer is
worried about the trivially small extra effort of parsing ' as well as "
then he's almost certainly putting his efforts in the wrong place.

FWIW the JSON parser in Javascript is at least capable of giving a
precise explanation in its error message, which puts it ahead of
Python:
config = "{'foo': 'bar'}"
"{'foo': 'bar'}"
JSON.parse(config)
SyntaxError: Unexpected token '

(Tested under Chrome and node.js, both based on V8.)

Here's Python:
[...]
ValueError: Expecting property name: line 1 column 1 (char 1)

(Python's implementation at least beats JS by including line/column
info.)
 
