FAQ: How do I calculate what quoted strings and numbers mean?


p.lavarre

Subject: announce: FAQs suggested ...
http://effbot.org/pyfaq/suggest.htm has new FAQ's ...
FAQ: How do I calculate what quoted strings and numbers mean?

A: eval(source, {'builtins': {}}) works, without also accidentally
accepting OS commands as input.

Note: Eval might surprise you if you mistype this idiom as: eval(source, {}).

Note: This idiom makes sense of ordinary Python literals (such as 010, 0x8,
8.125e+0, and "\x45ight"). This idiom also correctly interprets simple
literal expressions, such as 64**0.5.

That suggested FAQ is misleadingly incorrect as stated - we need help
rewording it.

/F correctly commented:
"eval" is never a good choice if you cannot trust the source; it's
trivial to do various denial-of-service attacks. See
http://effbot.org/zone/librarybook-core-eval.htm

Correspondingly, newbie me, I actually did copy the eval(source,
{'builtins': {}}) idiom into some code from that page without noticing
the comments about the cost of evaluating literal expressions like 'a' *
(10**9), abuses of __subclasses__() and mro(), etc.

But those objections miss the point. Having had those troubles
explained to me now, I'm still leaving my code unchanged - it still
does what I mean. That is,

eval(source, {'builtins': {}}) works enough like an evaluator of
literals to let you duck the work of writing that evaluator until you
need it. Yagni.

That's useful, and likely an FAQ. Anybody out there able to say
concisely what we really mean to say here?

Thanks in advance, Pat LaVarre
 

Fredrik Lundh

> But those objections miss the point. Having had those troubles
> explained to me now, I'm still leaving my code unchanged - it still
> does what I mean. That is,
>
> eval(source, {'builtins': {}}) works enough like an evaluator of
> literals to let you duck the work of writing that evaluator until you
> need it. Yagni.

until you forget about it, and someone uses the security hole to take
down your company's site, or steal all the customer data from your
database, or some such thing.

I think the PHP approach of "I don't really get bound parameters; let's
explain how to build SQL statements by hand first" shows that you should
avoid doing things in stupid ways in documentation that's likely to be
read by inexperienced programmers...

> eval(source, {'builtins': {}}) works enough like an evaluator of
> literals to let you duck the work of writing that evaluator until you
> need it. Yagni.

eval(source, {'builtins': {}}) doesn't prevent you from using built-ins,
though. it's spelled __builtins__, not builtins:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'len' is not defined

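
The difference between the two spellings can be checked directly. The following is a minimal sketch, assuming a current Python 3 interpreter rather than the Python 2 of this thread; the variable names are illustrative only.

```python
# Compare the misspelled key with the real one.
source = "len('abc')"

# 'builtins' is just an ordinary global name here, so Python still
# supplies the real __builtins__ and len() remains available.
with_misspelled_key = eval(source, {'builtins': {}})

# '__builtins__' is the key eval actually consults; emptying it
# makes the lookup of len fail with NameError.
try:
    eval(source, {'__builtins__': {}})
    blocked = False
except NameError:
    blocked = True

print(with_misspelled_key, blocked)
```

Even the correctly spelled form only hides names. It does not make eval safe for untrusted input.
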
> That's useful, and likely an FAQ.

A FAQ that discusses good ways to handle Python-like literals and
expressions would definitely be a useful addition to the FAQ. if nobody
else does anything about it, I'll get there sooner or later.

</F>
 

p.lavarre

> A FAQ that discusses good ways to handle Python-like literals and
> expressions would definitely be a useful addition to the FAQ. if nobody
> else does anything about it, I'll get there sooner or later.

Thank you.

> eval(source, {'builtins': {}}) doesn't prevent you from using built-ins,
> though. it's spelled __builtins__, not builtins:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<string>", line 1, in <module>
> NameError: name 'len' is not defined

Grin. Indeed, newbie me, I didn't know that either, thank you.

I was happy enough when I saw an improvement like:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
NameError: name 'os' is not defined

Now I fear that I must have copied the misspelled builtins from some
page that's still out there somewhere misleading people ...
 

Fredrik Lundh

> I was happy enough when I saw an improvement like:
>
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "<string>", line 1, in <module>
> NameError: name 'os' is not defined

sure, but the os module isn't very far away:
/home/fredrik
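
The snippet that produced this output is not preserved here. As a separate illustration (not the original code), here is a minimal sketch of why emptying __builtins__ does not isolate an expression: attribute access needs no built-in names, so the classes loaded in the interpreter can be reached from any literal. It assumes Python 3 and only inspects objects; nothing is executed beyond the lookup.

```python
# Attribute access works without any built-in names, so an expression
# can walk from a tuple literal up to object and list its subclasses.
source = "().__class__.__base__.__subclasses__()"
reachable = eval(source, {'__builtins__': {}})

# The direct subclasses of object are now available to the expression.
names = [cls.__name__ for cls in reachable]
print(len(names) > 0)
```
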

</F>
 

p.lavarre

> but the os module isn't very far away

I see Python defines some evals that apparently don't import os, such
as:

int('0x100'[2:], 0x10)
float('1.23e+4')
binascii.unhexlify('4142')

Does Python also define an eval that doesn't import os for literal
string args?

For example, suppose someone gives me source strings like
repr('\a\r\n') when I need byte strings like '\a\r\n'.
http://docs.python.org/ref/strings.html lists the escapes that need
eval'ling, especially the comparatively heavy and Python-specific \N{}
escape.

Is then a "slow and dangerous" eval my only reasonably concise choice?
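
For readers on later interpreters: Python 2.6 and newer ship ast.literal_eval, which post-dates this thread and accepts only literal syntax. The sketch below assumes Python 3, where repr('\a\r\n') yields a text string rather than a byte string; the variable names are illustrative.

```python
import ast

# A quoted string as it might arrive from a file or a user.
source = repr('\a\r\n')
text = ast.literal_eval(source)

# Numeric literals are handled too.
number = ast.literal_eval('0x8')

# Anything that is not a plain literal is rejected, not executed.
try:
    ast.literal_eval("__import__('os')")
    rejected = False
except ValueError:
    rejected = True

print(repr(text), number, rejected)
```

Unlike eval, this does not evaluate operator expressions such as 64**0.5.
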
 

Chris Lambacher

> but the os module isn't very far away
>
> I see Python defines some evals that apparently don't import os, such
> as:
>
> int('0x100'[2:], 0x10)
> float('1.23e+4')
> binascii.unhexlify('4142')
>
> Does Python also define an eval that doesn't import os for literal
> string args?
>
> For example, suppose someone gives me source strings like
> repr('\a\r\n') when I need byte strings like '\a\r\n'.
> http://docs.python.org/ref/strings.html lists the escapes that need
> eval'ling, especially the comparatively heavy and Python-specific \N{}
> escape.


You can use string encoding and decoding to do what you want:

a = r'hello\nthere'
print a.decode('string_escape')

which prints:

hello
there

you can find other escape type encodings here:
http://docs.python.org/lib/standard-encodings.html
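
The string_escape codec is specific to Python 2 and no longer exists in Python 3. A rough modern counterpart of the decoding approach, assuming ASCII-only input and a Python 3 interpreter, might look like this:

```python
# Backslash escapes held as literal characters in the text.
raw = r'hello\nthere'

# unicode_escape interprets the escapes; encoding to ASCII first keeps
# the round trip lossless and rejects non-ASCII input early.
decoded = raw.encode('ascii').decode('unicode_escape')
print(decoded)
```
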

-Chris
 

p.lavarre

int('0x100', 0x10) ...
> ... evals that apparently don't import os, such as ...
>
> int('0x100'[2:], 0x10)
> float('1.23e+4')
> binascii.unhexlify('4142')
>
> ... for literal string args?
> http://docs.python.org/ref/strings.html
>
> http://docs.python.org/lib/standard-encodings.html
> a = r'hello\nthere'
> print a.decode('string_escape')

Bingo, thank you:
binascii.hexlify(repr('\xA3')[1:-1].decode('string_escape'))
# -> 'a3'
binascii.hexlify(repr(u'\u00A3')[2:-1].decode('unicode_escape').encode('UTF-8'))
# -> 'c2a3'
 

p.lavarre

> This idiom makes sense of ordinary Python literals (such as 010, 0x8,
> ...

> sure, but the os module isn't very far away:

So now I think this is true:

"""
FAQ: How should I calculate what numbers and quoted strings mean?

A:

If you know the type of result you want, then you can choose a
corresponding evaluator, such as:

int('0x100', 0x10)
float('1.23e+4')
binascii.unhexlify('4142')
repr('\xA3')[1:-1].decode('string_escape')
repr(u'\u00A3')[2:-1].decode('unicode_escape')

If you need to allow any of a few types of input, then you can call
these evaluators in order of priority, and catch their exceptions.

If you need to allow any type of literal input, then you can resort to
the "limited" shlex or the "slow and dangerous" eval, or call all the
evaluators, or roll your own lexer.
"""
 
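The "call these evaluators in order of priority, and catch their exceptions" step from the draft FAQ could be sketched as follows. This is an illustration under stated assumptions, not part of the draft: the function name parse_number is hypothetical, only the numeric evaluators are chained, and int is called with base 0 (instead of the draft's 0x10) so that plain decimal input is not read as hexadecimal.

```python
def parse_number(text):
    """Try each evaluator in order of priority; the first success wins."""
    # Hypothetical priority order: integers first, then floats.
    evaluators = (
        lambda s: int(s, 0),  # base 0 accepts 0x.., 0o.. and 0b.. prefixes
        float,
    )
    for evaluate in evaluators:
        try:
            return evaluate(text)
        except ValueError:
            continue
    raise ValueError('not a recognised number: %r' % (text,))


print(parse_number('0x100'), parse_number('1.23e+4'))
```

Putting int before float keeps '42' an integer; reversing the order would turn every integer into a float.
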
