Default scope of variables

  • Thread starter Steven D'Aprano

Dave Angel

Anyway, none of the calculations that have been given takes into account
the fact that names can be /less/ than one million characters long.


Not in *my* code they don't!!!

*wink*

The
actual number of non-empty strings of length at most 1000000 characters,
that consist only of ascii letters, digits or underscores, and that
don't start with a digit, is

sum(53*63**i for i in range(1000000)) == 53*(63**1000000 - 1)//62


I take my hat off to you sir, or possibly madam. That is truly an inspired
piece of pedantry.
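
The closed form quoted above can be checked against the direct sum for small maximum lengths. This is only a sanity-check sketch; the function names `count_names` and `count_names_closed` are made up for illustration. The 53 counts the ASCII letters plus underscore that may start a name, and the 63 adds the ten digits for every later position.

```python
def count_names(max_len):
    """Count ASCII identifiers of length 1..max_len by direct summation."""
    return sum(53 * 63**i for i in range(max_len))


def count_names_closed(max_len):
    """Same count via the geometric-series closed form."""
    return 53 * (63**max_len - 1) // 62


# The two agree for every length tried here.
for n in (1, 2, 5, 50):
    assert count_names(n) == count_names_closed(n)

print(count_names(2))  # 53 one-character names + 53*63 two-character names = 3392
```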

It's perhaps worth mentioning that some non-ascii characters are allowed
in identifiers in Python 3, though I don't know which ones.

PEP 3131 describes the rules:

http://www.python.org/dev/peps/pep-3131/

For example:

py> import unicodedata as ud
py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
...     print(c, ud.name(c), c.isidentifier(), ud.category(c))
...
é LATIN SMALL LETTER E WITH ACUTE True Ll
æ LATIN SMALL LETTER AE True Ll
¥ YEN SIGN False Sc
µ MICRO SIGN True Ll
¿ INVERTED QUESTION MARK False Po
μ GREEK SMALL LETTER MU True Ll
Ж CYRILLIC CAPITAL LETTER ZHE True Lu
ᚃ OGHAM LETTER FEARN True Lo
‰ PER MILLE SIGN False Po
⇄ RIGHTWARDS ARROW OVER LEFTWARDS ARROW False So
∞ INFINITY False Sm

The isidentifier() method will let you weed out the characters that
cannot start an identifier. But there are other groups of characters
that can appear after the starting "letter". So a more reasonable
sample might be something like:

py> import unicodedata as ud
py> for c in 'éæ¥µ¿μЖᚃ‰⇄∞':
...     xc = "X" + c
...     print(c, ud.name(c), xc.isidentifier(), ud.category(c))
...

In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers

has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there's other stuff
there, like "nonspacing marks" that are surprising.

I'm pretty much speculating here, so please correct me if I'm way off.
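
To make the start/continue distinction concrete, here is a small sketch using a few characters from the id_continue categories (Nd, Mn, Pc). The sample characters are my own picks, not taken from the posts above. Each one is rejected on its own but should be accepted after a leading "X".

```python
import unicodedata as ud

# One character each from categories that may continue, but not start, a name.
samples = {
    "1": "ASCII digit (Nd)",
    "\u0661": "non-ASCII digit (Nd)",
    "\u0301": "nonspacing mark (Mn)",
    "\u203f": "connector punctuation (Pc)",
}

results = {}
for c, description in samples.items():
    alone = c.isidentifier()            # can it start an identifier?
    after_x = ("X" + c).isidentifier()  # can it follow a valid start?
    results[c] = (alone, after_x)
    print(ud.name(c), ud.category(c), description, alone, after_x)
```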
 

Joshua Landau

The isidentifier() method will let you weed out the characters that cannot
start an identifier. But there are other groups of characters that can
appear after the starting "letter". So a more reasonable sample might be
something like: ....
In particular,
http://docs.python.org/3.3/reference/lexical_analysis.html#identifiers

has a definition for id_continue that includes several interesting
categories. I expected the non-ASCII digits, but there's other stuff there,
like "nonspacing marks" that are surprising.

I'm pretty much speculating here, so please correct me if I'm way off.

For my calculation above, I used this code I quickly mocked up:

import unicodedata as unidata
from sys import maxunicode
from collections import defaultdict
from itertools import chain

def get():
    xid_starts = set()
    xid_continues = set()

    id_start_categories = "Lu, Ll, Lt, Lm, Lo, Nl".split(", ")
    id_continue_categories = "Mn, Mc, Nd, Pc".split(", ")

    characters = (chr(n) for n in range(maxunicode + 1))

    print("Making normalized characters")

    normalized = (unidata.normalize("NFKC", character) for character in characters)
    normalized = set(chain.from_iterable(normalized))

    print("Assigning to categories")

    for character in normalized:
        category = unidata.category(character)

        if category in id_start_categories:
            xid_starts.add(character)
        elif category in id_continue_categories:
            xid_continues.add(character)

    return xid_starts, xid_continues

Please note that "xid_continues" actually represents "xid_continue - xid_start".
 

Rotwang

Anyway, none of the calculations that have been given takes into account
the fact that names can be /less/ than one million characters long.


Not in *my* code they don't!!!

*wink*

The
actual number of non-empty strings of length at most 1000000 characters,
that consist only of ascii letters, digits or underscores, and that
don't start with a digit, is

sum(53*63**i for i in range(1000000)) == 53*(63**1000000 - 1)//62


I take my hat off to you sir, or possibly madam. That is truly an inspired
piece of pedantry.

FWIW, I'm male.


Thanks.
 

Neil Cerutti

On 07/04/2013 01:32 AM, Steven D'Aprano wrote:
Well, if I ever have more than 63,000,000 variables[1] in a
function, I'll keep that in mind.
[1] Based on empirical evidence that Python supports names
with length at least up to one million characters long, and
assuming that each character can be an ASCII letter, digit or
underscore.

Well, the number wouldn't be 63,000,000. Rather it'd be
63**1000000

You should really count only the ones somebody might actually
want to use. That's a much, much smaller number, though still
plenty big.

Inner scopes (I don't remember the official name) are a great
feature of C++. They're not the right feature for Python, though,
since Python doesn't have deterministic destruction. They wouldn't
buy much except for namespace tidiness.

for x in range(4):
    print(x)
print(x)  # Vader NOoooooOOOOOO!!!

Python provides deterministic destruction with a different
feature.
 

Chris Angelico

Python provides deterministic destruction with a different
feature.

You mean 'with'? That's not actually destruction, it just does one of
the same jobs that deterministic destruction is used for (RAII). It
doesn't, for instance, have any influence on memory usage, nor does it
ensure the destruction of the object's referents. But yes, it does
achieve (one of) the most important role(s) of destruction.

ChrisA
 

Neil Cerutti

You mean 'with'? That's not actually destruction, it just does
one of the same jobs that deterministic destruction is used for
(RAII). It doesn't, for instance, have any influence on memory
usage, nor does it ensure the destruction of the object's
referents. But yes, it does achieve (one of) the most important
role(s) of destruction.

Yes, thanks. I meant the ability to grab and release a
resource deterministically.
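
A minimal sketch of that distinction, using a made-up `Resource` class: the `with` block releases the resource at a known point, but the object itself is still alive and still bound to its name afterwards, so nothing has been destroyed.

```python
class Resource:
    """Hypothetical resource that records whether it is currently held."""

    def __init__(self, name):
        self.name = name
        self.held = False

    def __enter__(self):
        self.held = True   # grab the resource
        return self

    def __exit__(self, exc_type, exc, tb):
        self.held = False  # release it, whether or not an exception occurred
        return False       # do not swallow exceptions


with Resource("demo") as r:
    held_inside = r.held

# The resource has been released, but the object and the name r remain.
held_after = r.held
print(held_inside, held_after, r.name)
```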
 

Wayne Werner

Oh. Uhm... ahh... it would have helped to mention that it also has a
commit() method! But yes, that's correct; if the object expires (this
is C++, so it's guaranteed to call the destructor at that close brace
- none of the Python vagueness about when __del__ is called) without
commit() being called, then the transaction will be rolled back.

If one wants to duplicate this kind of behavior in Python, that's what
context managers are for combined with a `with` block, which does
guarantee that the __exit__ method will be called - in this case it could
be something as simple as:

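from contextlib import contextmanager

@contextmanager
def new_transaction(conn):
    tran = conn.begin_transaction()
    try:
        yield tran
    finally:
        # Runs even if the with-block raises, so an uncommitted
        # transaction is always rolled back.
        if not tran.committed:
            tran.rollback()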



Which you would then use like:


conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()



And then you get the desired constructor/destructor behavior of having
guaranteed that code will be executed at the start and at the end. You can
wrap things in try/except for some error handling, or write your own
context manager class.
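
For the "write your own context manager class" route, a rough sketch might look like the following. `FakeTransaction` and `TransactionContext` are stand-ins invented for this example, not part of any real database library; the only assumption is that a transaction exposes `committed`, `commit()` and `rollback()`.

```python
class FakeTransaction:
    """Stand-in for a real transaction object (hypothetical)."""

    def __init__(self):
        self.committed = False
        self.rolled_back = False

    def commit(self):
        self.committed = True

    def rollback(self):
        self.rolled_back = True


class TransactionContext:
    """Roll back on exit unless commit() was called inside the block."""

    def __init__(self, tran):
        self.tran = tran

    def __enter__(self):
        return self.tran

    def __exit__(self, exc_type, exc, tb):
        if not self.tran.committed:
            self.tran.rollback()
        return False  # let any exception propagate


# Committed inside the block: no rollback.
t1 = FakeTransaction()
with TransactionContext(t1) as tran:
    tran.commit()

# Exception inside the block: rolled back, exception still propagates.
t2 = FakeTransaction()
try:
    with TransactionContext(t2) as tran:
        raise ValueError("boom")
except ValueError:
    pass
```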

HTH,
Wayne
 

Chris Angelico

Which you would then use like:


conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this:

with new_transaction(conn) as tran:
    tran.query("blah")
    with tran.subtransaction() as tran:
        tran.query("blah")
        with tran.subtransaction() as tran:
            tran.query("blah")
            # roll this subtransaction back
        tran.query("blah")
        tran.commit()
    tran.query("blah")
    tran.commit()

The 'with' statement doesn't allow this. I would need to use some kind
of magic to rebind the old transaction to the name, or else use a list
that gets magically populated:

with new_transaction(conn) as tran:
    tran[-1].query("blah")
    with subtransaction(tran):
        tran[-1].query("blah")
        with subtransaction(tran):
            tran[-1].query("blah")
            # roll this subtransaction back
        tran[-1].query("blah")
        tran[-1].commit()
    tran[-1].query("blah")
    tran[-1].commit()

I don't like the look of this. It might work, but it's hardly ideal.
This is why I like to be able to nest usages of the same name.

ChrisA
 

Steven D'Aprano

Which you would then use like:


conn = create_conn()
with new_transaction(conn) as tran:
    rows_affected = do_query_stuff(tran)
    if rows_affected == 42:
        tran.commit()

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this: [snip hideous code]
I don't like the look of this. It might work, but it's hardly ideal.
This is why I like to be able to nest usages of the same name.

Yes, and the obvious way to nest usages of the same name is to use an
instance with a class attribute and instance attribute of the same name:

class Example:
    attr = 23

x = Example()
x.attr = 42
print(x.attr)  # 42: the instance attribute shadows the class attribute
del x.attr
print(x.attr)  # 23: the class attribute is visible again


If you need more than two levels, you probably ought to re-design your
code to be less confusing, otherwise you may be able to use ChainMap to
emulate any number of nested scopes.
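
As a rough illustration of the ChainMap idea (the key name "tran" and the string values are placeholders of my own): each `new_child()` adds an inner scope, writes go to the innermost mapping, and deleting the inner binding makes the outer one visible again.

```python
from collections import ChainMap

outer = ChainMap({"tran": "outer transaction"})

# Enter a nested scope and rebind the same name there.
inner = outer.new_child()
inner["tran"] = "subtransaction"
seen_inside = inner["tran"]

# The outer scope is unaffected by the inner binding.
seen_outside = outer["tran"]

# Dropping the inner binding exposes the outer one again.
del inner["tran"]
seen_after_delete = inner["tran"]

print(seen_inside, seen_outside, seen_after_delete)
```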

One interesting trick is you can use a ChainMap as function globals.
Here's a sketch for what you can do in Python 3.3:


import builtins
from types import FunctionType
from collections import ChainMap

class _ChainedDict(ChainMap, dict):
    # Function dicts must be instances of dict :-(
    pass


def chained_function(func, *dicts):
    """Return a new function, copied from func, using a
    ChainMap as dict.
    """
    dicts = dicts + (func.__globals__, builtins.__dict__)
    d = _ChainedDict(*dicts)
    name = func.__name__
    newfunc = FunctionType(
        func.__code__, d, name, closure=func.__closure__)
    newfunc.__dict__.update(func.__dict__)
    newfunc.__defaults__ = func.__defaults__
    return newfunc



And in use:

py> f = chained_function(lambda x: x+y, {'y': 100})
py> f(1)
101
py> f.__globals__.maps.insert(0, {'y': 200})
py> f(1)
201
py> del f.__globals__.maps[0]['y']
py> f(1)
101
 

Steven D'Aprano

for x in range(4):
    print(x)
print(x)  # Vader NOoooooOOOOOO!!!

That loops do *not* introduce a new scope is a feature, not a bug. It is
*really* useful to be able to use the value of x after the loop has
finished. That's a much more common need than being able to have an x
inside the loop and an x outside the loop.
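
A typical case, sketched with a made-up helper `first_negative`: a search loop breaks out when it finds what it wants, and the loop variables are then used after the loop.

```python
def first_negative(values):
    """Return (index, value) of the first negative item, or None."""
    for index, value in enumerate(values):
        if value < 0:
            break
    else:
        # The loop ran to completion without finding a negative value.
        return None
    # index and value are still bound here, after the loop has ended.
    return index, value


print(first_negative([3, 1, -4, 1, -5]))
print(first_negative([1, 2, 3]))
```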
 

Ethan Furman

Yep. There's a problem, though, when you bring in subtransactions. The
logic wants to be like this:

Is there some reason you can't simply do this?

with new_transaction(conn) as tran1:
    tran1.query("blah")
    with tran1.subtransaction() as tran2:
        tran2.query("blah")
        with tran2.subtransaction() as tran3:
            tran3.query("blah")
            # roll this subtransaction back
        tran2.query("blah")
        tran2.commit()
    tran1.query("blah")
    tran1.commit()
 

Ethan Furman

The 'with' statement doesn't allow this. I would need to use some kind
of magic to rebind the old transaction to the name, or else use a list
that gets magically populated:

with new_transaction(conn) as tran:
    tran[-1].query("blah")
    with subtransaction(tran):
        tran[-1].query("blah")
        with subtransaction(tran):
            tran[-1].query("blah")
            # roll this subtransaction back
        tran[-1].query("blah")
        tran[-1].commit()
    tran[-1].query("blah")
    tran[-1].commit()

The other option is to build the magic into the new_transaction class, then your code will look like:

with new_transaction(conn) as tran:
    tran.query("blah")
    with tran.subtransaction():
        tran.query("blah")
        with tran.subtransaction():
            tran.query("blah")
            # roll this subtransaction back
        tran.query("blah")
        tran.commit()
    tran.query("blah")
    tran.commit()

This would definitely make more sense in a loop. ;)
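
One way the "magic" could be built in, as a sketch only: the `Transaction` class below is hypothetical and merely logs what it would do, keeping a stack of per-level commit flags so that the single name `tran` always acts on the innermost open level.

```python
from contextlib import contextmanager


class Transaction:
    """Hypothetical transaction that tracks its own nesting level."""

    def __init__(self):
        self.log = []              # (depth, action) pairs, for illustration
        self._committed = [False]  # one flag per open (sub)transaction

    @property
    def depth(self):
        return len(self._committed) - 1

    def query(self, sql):
        self.log.append((self.depth, sql))

    def commit(self):
        self._committed[-1] = True  # commits only the innermost level

    @contextmanager
    def subtransaction(self):
        self._committed.append(False)
        try:
            yield self
        finally:
            depth = self.depth
            if not self._committed.pop():
                self.log.append((depth, "ROLLBACK"))


@contextmanager
def new_transaction(conn=None):
    tran = Transaction()
    try:
        yield tran
    finally:
        if not tran._committed[-1]:
            tran.log.append((0, "ROLLBACK"))


with new_transaction() as tran:
    tran.query("outer")
    with tran.subtransaction():
        tran.query("inner")
        with tran.subtransaction():
            tran.query("innermost")
            # no commit here, so this level is rolled back on exit
        tran.commit()
    tran.commit()
```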
 

Chris Angelico

If you need more than two levels, you probably ought to re-design your
code to be less confusing, otherwise you may be able to use ChainMap to
emulate any number of nested scopes.

The subtransactions are primarily to represent the database equivalent
of a try/except block, so they need to be able to be nested
arbitrarily.
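
For readers unfamiliar with the idea, one common database-level form of this is the SQL savepoint. The sketch below uses the standard-library sqlite3 module purely as an illustration; it is an assumption on my part, not necessarily the database or API being discussed here. The table name `t` and savepoint name `inner_work` are made up.

```python
import sqlite3

# isolation_level=None leaves transaction control to explicit SQL statements.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")

# The "try" part: work that may need to be undone on its own.
conn.execute("SAVEPOINT inner_work")
conn.execute("INSERT INTO t VALUES (2)")

# The "except" part: undo only the inner work, keep the outer transaction.
conn.execute("ROLLBACK TO inner_work")
conn.execute("RELEASE inner_work")

conn.execute("COMMIT")

rows = [v for (v,) in conn.execute("SELECT v FROM t ORDER BY v")]
print(rows)
```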

ChrisA
 

Chris Angelico

Is there some reason you can't simply do this?

with new_transaction(conn) as tran1:
    tran1.query("blah")
    with tran1.subtransaction() as tran2:
        tran2.query("blah")
        with tran2.subtransaction() as tran3:
            tran3.query("blah")
            # roll this subtransaction back
        tran2.query("blah")
        tran2.commit()
    tran1.query("blah")
    tran1.commit()

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be
moved around without first checking which transaction object to work
with.

ChrisA
 

Steven D'Aprano

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be moved
around without first checking which transaction object to work with.

I feel your pain, but I wonder why we sometimes accept "a line of code
can't be moved around" as an issue to be solved by the language. After
all, in general most lines of code can't be moved around.
 

Chris Angelico

That means that I, as programmer, have to keep track of the nesting
level of subtransactions. Extremely ugly. A line of code can't be moved
around without first checking which transaction object to work with.

I feel your pain, but I wonder why we sometimes accept "a line of code
can't be moved around" as an issue to be solved by the language. After
all, in general most lines of code can't be moved around.

It's not something to be solved by the language, but it's often
something to be solved by the program's design. Two lines of code that
achieve the same goal should normally look the same. This is why
Python's policy is "one obvious way to do something" rather than
"spell it five different ways in the same file to make a nightmare for
other people coming after you". Why should database queries be spelled
"trans1.query()" in one place, and "trans2.query()" in another?
Similarly, if I want to call another function and that function needs
to use the database, why should I pass it trans3 and have that come
out as trans1 on the other side? Unnecessarily confusing. Makes much
more sense to use the same name everywhere.

ChrisA
 

Steven D'Aprano

It's not something to be solved by the language, but it's often
something to be solved by the program's design. Two lines of code that
achieve the same goal should normally look the same. This is why
Python's policy is "one obvious way to do something" rather than "spell
it five different ways in the same file to make a nightmare for other
people coming after you". Why should database queries be spelled
"trans1.query()" in one place, and "trans2.query()" in another?

Is that a trick question? They probably shouldn't. But it's a big leap
from that to "...and therefore `for` and `while` should introduce their
own scope".

Similarly, if I want to call another function and that function needs to
use the database, why should I pass it trans3 and have that come out as
trans1 on the other side? Unnecessarily confusing. Makes much more sense
to use the same name everywhere.

"Is your name not Bruce? That's going to cause a little confusion."

 

Chris Angelico

Is that a trick question? They probably shouldn't. But it's a big leap
from that to "...and therefore `for` and `while` should introduce their
own scope".

No, it's not a trick question; I was responding to Ethan's suggestion
as well as yours, and he was saying pretty much that.

BruceA
(maybe that'll reduce the confusion?)
 

Neil Cerutti

That loops do *not* introduce a new scope is a feature, not a bug. It is
*really* useful to be able to use the value of x after the loop has
finished.

I don't necessarily buy that it's "*really*" useful, but I do
like introducing new names in (not really the scope of)
if/elif/else and for statement blocks.

z = record["Zip"]
if int(z) > 99999:
    zip_code = z[:-4].rjust(5, "0")
    zip4 = z[-4:]
else:
    zip_code = z.rjust(5, "0")
    zip4 = ""


As opposed to:

zip_code = None
zip4 = None
z = record["Zip"]
if int(z) > 99999:
    zip_code = z[:-4].rjust(5, "0")
    zip4 = z[-4:]
else:
    zip_code = z.rjust(5, "0")
    zip4 = ""
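
Wrapped in a function so it can be tried directly, the first form might look like this. The name `split_zip` is made up for illustration, and the input is assumed to be a string of digits with any leading zeros possibly stripped.

```python
def split_zip(z):
    """Split a digit string into a 5-digit ZIP code and an optional +4 part."""
    if int(z) > 99999:
        # More than five digits: the last four are the +4 extension.
        zip_code = z[:-4].rjust(5, "0")
        zip4 = z[-4:]
    else:
        zip_code = z.rjust(5, "0")
        zip4 = ""
    # Both names were first bound inside the if/else and are usable here.
    return zip_code, zip4


print(split_zip("123456789"))
print(split_zip("1234"))
print(split_zip("12345678"))
```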
 
