Problem of Readability of Python


Licheng Fang

Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be

struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

Any elegant solutions?
 

Steven Bethard

Licheng said:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be

struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

Any elegant solutions?

You can use __slots__ to make objects consume less memory and have
slightly better attribute-access performance. Classes for objects that
need such performance tweaks should start like::

    class A(object):
        __slots__ = 'name', 'val'

The recipe below fills in the obvious __init__ method for such classes
so that the above is pretty much all you need to write:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502237
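As a rough illustration of the idea (this is not the recipe's code), a small base class can derive `__init__` and `__repr__` from `__slots__`. The names `SlotRecord` and `NameVal` are made up for this sketch, and it is written for Python 3.

```python
class SlotRecord:
    """Hypothetical helper: builds __init__ and __repr__ from __slots__."""
    __slots__ = ()

    def __init__(self, *args, **kwargs):
        names = self.__slots__
        if len(args) > len(names):
            raise TypeError("expected at most %d arguments" % len(names))
        # Positional arguments fill the slots in declaration order.
        for name, value in zip(names, args):
            setattr(self, name, value)
        # Keyword arguments fill slots by name; an unknown name raises
        # AttributeError because instances have no __dict__.
        for name, value in kwargs.items():
            setattr(self, name, value)

    def __repr__(self):
        fields = ", ".join(
            "%s=%r" % (name, getattr(self, name, None))
            for name in self.__slots__
        )
        return "%s(%s)" % (type(self).__name__, fields)


class NameVal(SlotRecord):
    __slots__ = ("name", "val")


a = NameVal("yadda", val=42)
print(a)
```

With this in place, each record type needs only its `__slots__` line, and `a.name` / `a.val` read like the C struct fields in the question.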


STeVe
 

Bjoern Schliessmann

Licheng said:
struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

?!

(1)
a = {}
a["name"] = ...
a["val"] = ...

(2)
NAME = 0
VAL = 1
a = [None, None]
a[NAME] = ...
a[VAL] = ...

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much
overhead if one defines empty classes as such for some very
frequently used data structures of the program?

Measure first, optimize later. How many millions of instances and/or
accesses per second do you have?

Regards,


Björn
 

Bruno Desthuilliers

Licheng Fang wrote:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be

struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

Use dicts, not lists or tuples:

a = dict(name='yadda', val=42)
print a['name']
print a['val']

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

If you do worry about overhead, then C is your friend!-)

More seriously: what do you use this 'nameval' struct for? If you
really have an overhead problem, you may want to use a real class using
__slots__ to minimize this problem, but chances are you don't need it.
 

George Sakkis

Licheng said:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be
struct nameval {
    char * name;
    int val;
} a;
a.name = ...
a.val = ...
becomes cryptic
a[0] = ...
a[1] = ...
Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?
Any elegant solutions?

You can use __slots__ to make objects consume less memory and have
slightly better attribute-access performance. Classes for objects that
need such performance tweaks should start like::

    class A(object):
        __slots__ = 'name', 'val'

The recipe below fills in the obvious __init__ method for such classes
so that the above is pretty much all you need to write:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502237

STeVe

For immutable records, you may also want to check out the named tuples
recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/500261
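For readers on a current Python, the standard library's `collections.namedtuple` offers the same kind of immutable record. The sketch below assumes Python 3; the type name `NameVal` is only illustrative.

```python
from collections import namedtuple

# A tuple subclass whose positions also have names.
NameVal = namedtuple("NameVal", ["name", "val"])

a = NameVal(name="yadda", val=42)
print(a.name, a.val)    # access by field name
print(a[0], a[1])       # still behaves as an ordinary tuple
name, val = a           # unpacking works as usual

# Records are immutable; _replace builds a modified copy instead.
b = a._replace(val=43)
print(b)
```

Because the result is still a tuple, existing code that indexes `a[0]` keeps working while new code can use `a.name`.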

George
 

Steven Bethard

George said:
Licheng said:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be
struct nameval {
    char * name;
    int val;
} a;
a.name = ...
a.val = ...
becomes cryptic
a[0] = ...
a[1] = ...
Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?
Any elegant solutions?
You can use __slots__ to make objects consume less memory and have
slightly better attribute-access performance. Classes for objects that
need such performance tweaks should start like::

    class A(object):
        __slots__ = 'name', 'val'

The recipe below fills in the obvious __init__ method for such classes
so that the above is pretty much all you need to write:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502237

For immutable records, you may also want to check out the named tuples
recipe: http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/500261

Yep, it's linked in the description of the first recipe.

STeVe
 

Brian Elmegaard

Use dicts, not lists or tuples:

a = dict(name='yadda', val=42)
print a['name']
print a['val']

I guess you will then need a list or tuple to store the dicts?

I might have made it with a list of class instances:

class a:
    def __init__(self,name,val):
        self.name=name
        self.val=val

l=list()
l.append(a('yadda',42))
print l[0].name
print l[0].val

Is the dict preferable to a list or tuple of class instances?
 

Alex Martelli

Licheng Fang said:
Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

Just measure:

$ python -mtimeit -s'class A(object):pass' -s'a=A()' 'a.zop=23'
1000000 loops, best of 3: 0.241 usec per loop

$ python -mtimeit -s'a=[None]' 'a[0]=23'
10000000 loops, best of 3: 0.156 usec per loop

So, the difference, on my 18-months-old laptop, is about 85 nanoseconds
per write-access; if you have a million such accesses in a typical run
of your program, it will slow the program down by about 85 milliseconds.
Is that "much overhead"? If your program does nothing else except those
accesses, maybe, but then why are you writing that program AT ALL?-)

And yes, you CAN save about 1/3 of those 85 nanoseconds by having
'__slots__=["zop"]' in your class A(object)... but that's the kind of
thing one normally does only to tiny parts of one's program that have
been identified by profiling as dramatic bottlenecks, to shave off the
last few nanoseconds in the very last stages of micro-optimization of a
program that's ALMOST, but not QUITE, fast enough... knowing about such
"extreme last-ditch optimization tricks" is of very doubtful value (and
I think I'm qualified to say that, since I _do_ know many of them...:).
There ARE important performance things to know about Python, but those
worth a few nanoseconds don't matter much.
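The same comparison can be run from inside a script with the `timeit` module. This is only a sketch for Python 3; the helper `best_time` and the class names `Plain` and `Slotted` are invented here, and the absolute numbers will differ from machine to machine.

```python
import timeit


class Plain(object):
    pass


class Slotted(object):
    __slots__ = ["zop"]


def best_time(stmt, namespace, number=200000, repeat=3):
    """Return the best observed time per execution of stmt, in seconds."""
    timer = timeit.Timer(stmt, globals=namespace)
    return min(timer.repeat(repeat=repeat, number=number)) / number


results = {
    "list item": best_time("a[0] = 23", {"a": [None]}),
    "plain attribute": best_time("a.zop = 23", {"a": Plain()}),
    "slot attribute": best_time("a.zop = 23", {"a": Slotted()}),
}

for label, seconds in results.items():
    print("%-16s %.1f ns per write" % (label, seconds * 1e9))
```

Multiplying the per-write difference by the number of writes in a typical run gives the total cost, which is the figure worth comparing against the program's overall run time.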


Alex
 

Paul McGuire

Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be

struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

Any elegant solutions?

"""Just use a single empty class (such as the AttributeContainer below)
and then use different instances of the class for different sets
of name/value pairs. (This type of class also goes by the name Bag,
but that name is too, um, nondescript for me.) You can see from the
example that there is no requirement for names to be shared, unshared,
common, or unique.

-- Paul"""

class AttributeContainer(object):
    pass

a = AttributeContainer()
a.name = "Lancelot"
a.favorite_color = "blue"

b = AttributeContainer()
b.name = "European swallow"
b.laden = True
b.airspeed = 20
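
In Python 3.3 and later the standard library already ships such a container as `types.SimpleNamespace`, so a sketch of the same example (assuming a recent Python 3) needs no class definition at all:

```python
from types import SimpleNamespace

# Attributes can be given at construction time...
a = SimpleNamespace(name="Lancelot", favorite_color="blue")
b = SimpleNamespace(name="European swallow", laden=True, airspeed=20)

# ...or added later, exactly as with an empty class instance.
b.origin = "unknown"

print(a)
print(b)
```

Unlike a bare empty class, `SimpleNamespace` also provides a readable `repr` and equality comparison by attribute values.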
 

John Nagle

Licheng said:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs.

Comments might help.

It's common to use tuples that way, but slightly bad form to
use lists that way.

Of course, you can create a class and use "slots" to bind
the positions at compile time, so you don't pay for a dictionary
lookup on every feature.

(Someday I need to overhaul BeautifulSoup to use "slots".
That might speed it up.)

John Nagle
 

John Machin

Licheng said:
Python is supposed to be readable, but after programming in Python for
a while I find my Python programs can be more obfuscated than their C/C++
counterparts sometimes. Part of the reason is that with
heterogeneous lists/tuples at hand, I tend to stuff many things into
the list and *assume* a structure of the list or tuple, instead of
declaring them explicitly as one will do with C structs. So, what used
to be

struct nameval {
    char * name;
    int val;
} a;

a.name = ...
a.val = ...

becomes cryptic

a[0] = ...
a[1] = ...

Python Tutorial says an empty class can be used to do this. But if
namespaces are implemented as dicts, wouldn't it incur much overhead
if one defines empty classes as such for some very frequently used
data structures of the program?

Any elegant solutions?

You can use __slots__ to make objects consume less memory and have
slightly better attribute-access performance. Classes for objects that
need such performance tweaks should start like::

    class A(object):
        __slots__ = 'name', 'val'

The recipe below fills in the obvious __init__ method for such classes
so that the above is pretty much all you need to write:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502237

If not needing/wanting __slots__, something simpler (no metaclasses!)
like the following helps the legibility/utility:

<file record.py>

class BaseRecord(object):

    def __init__(self, **kwargs):
        for k, v in kwargs.iteritems():
            setattr(self, k, v)

    def dump(self, text=''):
        print '== dumping %s instance: %s' % (self.__class__.__name__, text)
        for k, v in sorted(self.__dict__.iteritems()):
            print ' %s: %r' % (k, v)
</file record.py>


Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
... pass
...
... pass
...
== dumping A instance:
 bar: 'rab'
 foo: 1
 zot: (1, 2)
== dumping A instance: more of the same
 bar: 'xxx'
 foo: 2
 zot: (42, 666)
== dumping A instance: after poking
 bar: 'rab'
 foo: 1
 ugh: 'poked in'
 zot: (1, 2)
>>> b1 = B()
>>> b1.dump()
== dumping B instance:
>>> b1.esrever = 'esrever'[::-1]
>>> b1.dump()
== dumping B instance:
 esrever: 'reverse'
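
For readers on Python 3, a minimal port of the `BaseRecord` idea might look like the sketch below. The subclass `A` and the attribute values mirror the session output; the exact constructor call is an assumption, since the session's input lines are not shown.

```python
class BaseRecord:
    """Python 3 port of the record.py sketch: keyword args become attributes."""

    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

    def dump(self, text=""):
        print("== dumping %s instance: %s" % (self.__class__.__name__, text))
        for k, v in sorted(self.__dict__.items()):
            print("    %s: %r" % (k, v))


class A(BaseRecord):
    pass


# Assumed call; the field names and values are taken from the output above.
a1 = A(foo=1, bar="rab", zot=(1, 2))
a1.dump()

# New attributes can still be poked in after construction.
a1.ugh = "poked in"
a1.dump("after poking")
```

Each record type is a one-line subclass, and `dump()` gives a quick, sorted view of whatever fields an instance currently carries.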
HTH,
John
 

Steven D'Aprano

And yes, you CAN save about 1/3 of those 85 nanoseconds by having
'__slots__=["zop"]' in your class A(object)... but that's the kind of
thing one normally does only to tiny parts of one's program that have
been identified by profiling as dramatic bottlenecks

Seems to me that:

class Record(object):
    __slots__ = ["x", "y", "z"]


has a couple of major advantages over:

class Record(object):
    pass


aside from the micro-optimization that classes using __slots__ are faster
and smaller than classes with __dict__.

(1) The field names are explicit and self-documenting;
(2) You can't accidentally assign to a mistyped field name without Python
letting you know immediately.
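
Point (2) is easy to see in a short sketch (Python 3 syntax; `LooseRecord` is a made-up name for the empty-class variant):

```python
class Record(object):
    __slots__ = ["x", "y", "z"]


class LooseRecord(object):
    pass


r = Record()
r.x = 1.0

# A mistyped field name on the slotted class fails immediately.
try:
    r.zz = 3.0  # typo for "z"
except AttributeError as exc:
    slotted_error = exc
else:
    slotted_error = None
print("slotted:", slotted_error)

# The same typo on a plain class silently creates a new attribute.
loose = LooseRecord()
loose.zz = 3.0
print("loose:", vars(loose))
```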


Maybe it's the old Pascal programmer in me coming out, but I think
they're big advantages.
 

Alex Martelli

Steven D'Aprano said:
And yes, you CAN save about 1/3 of those 85 nanoseconds by having
'__slots__=["zop"]' in your class A(object)... but that's the kind of
thing one normally does only to tiny parts of one's program that have
been identified by profiling as dramatic bottlenecks

Seems to me that:

class Record(object):
    __slots__ = ["x", "y", "z"]


has a couple of major advantages over:

class Record(object):
    pass


aside from the micro-optimization that classes using __slots__ are faster
and smaller than classes with __dict__.

(1) The field names are explicit and self-documenting;
(2) You can't accidentally assign to a mistyped field name without Python
letting you know immediately.


Maybe it's the old Pascal programmer in me coming out, but I think
they're big advantages.

I'm also an old Pascal programmer (ask anybody who was at IBM in the
'80s who was the most active poster on the TURBO FORUM about Turbo
Pascal, and PASCALVS FORUM about Pascal/Vs...), and yet I consider these
"advantages" to be trivial in most cases compared to the loss in
flexibility, such as the inability to pickle (without bothering to code
an explicit __getstate__) and the inability to "monkey-patch" instances
on the fly -- not to mention the bother of defining a separate 'Record'
class for each and every combination of attributes you might want to put
together.

If you REALLY pine for Pascal's records, you might choose to inherit
from ctypes.Structure, which has the additional "advantages" of
specifying a C type for each field and (a real advantage;-) creating an
appropriate __init__ method.
... _fields_ = (('x',ctypes.c_float),('y',ctypes.c_float),('z',ctypes.c_float))
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float expected instead of str instance

See? You get type-checking too -- Pascal looms closer and closer!-)

And if you need an array of 1000 such Records, just use as the type
Record*1000 -- think of the savings in memory (no indirectness, no
overallocations as lists may have...).
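
A self-contained sketch of the `ctypes.Structure` approach, written for Python 3, might look like this. The assignment `r.x = "ciao"` is an assumed trigger for the type check, since the original statement is not shown in the session, and the exact error message differs between Python versions.

```python
import ctypes


class Record(ctypes.Structure):
    # Each field is a (name, C type) pair.
    _fields_ = [
        ("x", ctypes.c_float),
        ("y", ctypes.c_float),
        ("z", ctypes.c_float),
    ]


# Structure supplies an __init__ taking positional or keyword fields.
r = Record(1.0, 2.0, z=3.0)

# Assigning a value of the wrong type raises TypeError.
try:
    r.x = "ciao"
except TypeError as exc:
    type_error = exc
else:
    type_error = None
print("type check:", type_error)

# An array type of 1000 records stored contiguously.
RecordArray = Record * 1000
records = RecordArray()
records[0].x = 1.5

print(ctypes.sizeof(Record), ctypes.sizeof(records))
```

The array holds the field data inline rather than as 1000 separate Python objects, which is where the memory saving mentioned above comes from.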

If I had any real need for such things, I'd probably use a metaclass (or
class decorator) to also add a nice __repr__ function, etc...


Alex
 

Steven Bethard

Alex said:
Steven D'Aprano said:
class Record(object):
    __slots__ = ["x", "y", "z"]

has a couple of major advantages over:

class Record(object):
    pass

aside from the micro-optimization that classes using __slots__ are faster
and smaller than classes with __dict__.

(1) The field names are explicit and self-documenting;
(2) You can't accidentally assign to a mistyped field name without Python
letting you know immediately.
[snip]
If I had any real need for such things, I'd probably use a metaclass (or
class decorator) to also add a nice __repr__ function, etc...

Yep. That's what the recipe I posted [1] does. Given a class like::

    class C(Record):
        __slots__ = 'x', 'y', 'z'

it adds the most obvious __init__ and __repr__ methods. Raymond's
NamedTuple recipe [2] has a similar effect, though the API is different.

[1] http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/502237
[2] http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/500261

STeVe
 

Michele Simionato

If you REALLY pine for Pascal's records, you might choose to inherit
from ctypes.Structure, which has the additional "advantages" of
specifying a C type for each field and (a real advantage;-) creating an
appropriate __init__ method.


... _fields_ = (('x',ctypes.c_float),('y',ctypes.c_float),('z',ctypes.c_float))
...
>>> r=Record()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float expected instead of str instance

See? You get type-checking too -- Pascal looms closer and closer!-)

And if you need an array of 1000 such Records, just use as the type
Record*1000 -- think of the savings in memory (no indirectness, no
overallocations as lists may have...).

That's very cool Alex! I have just a question: suppose I want
to measure the memory allocation of a million of records made
with ctypes vs the memory allocation of equivalent
records made with __slots__, how do I measure it? Say on Linux,
Mac and Windows?
If ctypes records are more efficient than __slots__ records,
I will ask for deprecation of __slots__ (which is something
I wanted from the beginning!).


Michele Simionato
 

Bruno Desthuilliers

Brian Elmegaard wrote:
Use dicts, not lists or tuples:

a = dict(name='yadda', val=42)
print a['name']
print a['val']

I guess you will then need a list or tuple to store the dicts?

Should be a list then IMHO. But then it's the correct use of a list: a
homogeneous collection.
 

Steven D'Aprano

Steven said:
You can use __slots__ [...]

Aaaugh! Don't use __slots__!

Seriously, __slots__ are for wizards writing applications with huuuge
numbers of object instances (like, millions of instances). For an
extended thread about this, see

http://groups.google.com/group/comp.lang.python/browse_thread/thread/8775c70565fb4a65/0e25f368e23ab058

Well, I've read the thread, and I've read the thread it links to, and for
the life of me I'm still no clearer as to why __slots__ shouldn't be used
except that:

1 Aahz and Guido say __slots__ are teh suxxor;

2 rumour (?) has it that __slots__ won't make it into Python 3.0;

3 inheritance from classes using __slots__ doesn't inherit the slot-
nature of the superclass.
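
Point 3 can be checked directly. In this sketch (Python 3 syntax; all class names are illustrative) a subclass that omits `__slots__` gets a `__dict__` back, while one that declares its own `__slots__` stays restricted:

```python
class Base(object):
    __slots__ = ("x",)


class Child(Base):
    pass  # no __slots__ here, so instances get a __dict__ again


class SlottedChild(Base):
    __slots__ = ("y",)  # adds a slot and keeps instances dict-free


c = Child()
c.anything = 1  # accepted: stored in the instance __dict__

s = SlottedChild()
s.x, s.y = 1, 2
try:
    s.anything = 1
except AttributeError:
    slotted_child_rejects = True
else:
    slotted_child_rejects = False

print(hasattr(c, "__dict__"), hasattr(s, "__dict__"), slotted_child_rejects)
```

So any typo protection from `__slots__` holds only as long as every class in the hierarchy declares it.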


Point 1 is never to be lightly dismissed, but on the other hand Guido
doesn't like reduce(), and I'm allergic to "Cos I Said So" arguments.

History is full of things which were invented for one purpose being used
for something else. So, that being the case, suppose I accept that using
__slots__ is not the best way of solving the problem, and that people of
the skill and experience of Guido and Aahz will roll their eyes and
snicker at me.

But is there actually anything *harmful* that can happen if I use
__slots__?
 

Diez B. Roggisch

Steven said:
Steven said:
You can use __slots__ [...]

Aaaugh! Don't use __slots__!

Seriously, __slots__ are for wizards writing applications with huuuge
numbers of object instances (like, millions of instances). For an
extended thread about this, see

http://groups.google.com/group/comp.lang.python/browse_thread/thread/8775c70565fb4a65/0e25f368e23ab058

Well, I've read the thread, and I've read the thread it links to, and for
the life of me I'm still no clearer as to why __slots__ shouldn't be used
except that:

1 Aahz and Guido say __slots__ are teh suxxor;

2 rumour (?) has it that __slots__ won't make it into Python 3.0;

3 inheritance from classes using __slots__ doesn't inherit the slot-
nature of the superclass.


Point 1 is never to be lightly dismissed, but on the other hand Guido
doesn't like reduce(), and I'm allergic to "Cos I Said So" arguments.

History is full of things which were invented for one purpose being used
for something else. So, that being the case, suppose I accept that using
__slots__ is not the best way of solving the problem, and that people of
the skill and experience of Guido and Aahz will roll their eyes and
snicker at me.

But is there actually anything *harmful* that can happen if I use
__slots__?

Point 3 clearly is harmful. As is the fact that __slots__ gives you trouble
if you e.g. pass objects to code that tries to set arbitrary attributes on
an object. While this might be frowned upon, it can be useful in situations
where you e.g. link GUI-code/objects with data-objects: instead of creating
cumbersome, explicit mappings (as you have to in C/C++/Java) or wrappers,
just set a well-named property.

The question is: what does a slot buy you for this kind of problem? And
while arguing with "then I can't set an attribute I didn't want to be set"
is certainly possible, it ultimately leads to the darn
static-vs-dynamic-discussion. Which we might spare ourselves this time.

Diez
 
