Confounded by Python objects

boblatest

Hello group,

take a look at the code snippet below. What I want to do is initialize
two separate Channel objects and put some data in them. However,
although two different objects are in fact created (as can be seen
from the different names they spit out with the "diag()" method), the
data in the "sample" member is the same although I stick different
values in.

Thanks,
robert
(Sorry for Google Groups, but I don't have NNTP at work)

Here's the code:

#!/usr/bin/python

class Channel:
    name = ''
    sample = []

    def __init__(self, name):
        self.name = name

    def append(self, time, value):
        self.sample.append((time, value))
        self.diag()

    def diag(self):
        print (self.name, self.sample)


chA = Channel('A')
chB = Channel('B')

chA.append(1, 1.1)
chB.append(2, 2.1)
chA.append(3, 1.2)
chB.append(4, 2.2)

print 'Result:'

chA.diag()
chB.diag()

------------------------------------
and here's the output:

('A', [(1, 1.1000000000000001)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001)])
('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])
Result:
('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4, 2.2000000000000002)])

What I'd like to see, however, is 2 tuples per Channel object, like
this:
('A', [(1, 1.1000000000000001)])
('B', [(2, 2.1000000000000001)])
('A', [(1, 1.1000000000000001), (3, 1.2)])
('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])
Result:
('A', [(1, 1.1000000000000001), (3, 1.2)])
('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])
 
Fredrik Lundh

> take a look at the code snippet below. What I want to do is initialize
> two separate Channel objects and put some data in them. However,
> although two different objects are in fact created (as can be seen
> from the different names they spit out with the "diag()" method), the
> data in the "sample" member is the same although I stick different
> values in.

that's because you only have one sample object -- the one owned by
the class object.

since you're modifying that object in place (via the append method),
your changes will be shared by all instances. python never copies
attributes when it creates an instance; if you want a fresh object,
you have to create it yourself.

class Channel:

    def __init__(self, name):
        self.name = name
        self.sample = [] # create fresh container for instance

    def append(self, time, value):
        self.sample.append((time, value))
        self.diag()

    def diag(self):
        print (self.name, self.sample)

tip: if you're not 100% sure why you would want to put an attribute
on the class level, don't do it.

hope this helps!

</F>
 
alex23

> class Channel:
>     name = ''
>     sample = []
>
>     def __init__(self, name):
>         self.name = name
>
>     def append(self, time, value):
>         self.sample.append((time, value))
>         self.diag()
>
>     def diag(self):
>         print (self.name, self.sample)

Okay, the problem is you're appending to a _class_ attribute, not to
an instance attribute.

If you change your class definition to this, it should work:

class Channel:
    def __init__(self, name):
        self.name = name
        self.sample = []

That will provide each instance with its own version of the sample
attribute.

The 'self.name = name' in the __init__ for your original code binds a
new attribute to the instance, whereas 'self.sample.append(...' in the
class's append was appending to the class attribute instead.
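
A minimal sketch of the corrected class in action, written for Python 3 (the thread's code is Python 2, hence the different `print`). The `diag()` call inside `append` is left out here only to keep the output short:

```python
class Channel:
    def __init__(self, name):
        self.name = name
        self.sample = []  # a fresh list for each instance

    def append(self, time, value):
        self.sample.append((time, value))

    def diag(self):
        print((self.name, self.sample))


chA = Channel('A')
chB = Channel('B')

chA.append(1, 1.1)
chB.append(2, 2.1)
chA.append(3, 1.2)
chB.append(4, 2.2)

chA.diag()  # ('A', [(1, 1.1), (3, 1.2)])
chB.diag()  # ('B', [(2, 2.1), (4, 2.2)])
```

Each instance now holds two tuples, and `chA.sample is chB.sample` evaluates to False.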

Hope this helps.

- alex23
 
Lawrence D'Oliveiro

In message
> class Channel:
>     name = ''
>     sample = []

These are class variables, not instance variables. Take them out, and ...

>     def __init__(self, name):
>         self.name = name

... add this line to the above function

        self.sample = []
 
boblatest

> tip: if you're not 100% sure why you would want to put an attribute
> on the class level, don't do it.

The reason I did it was sort of C++ish (that's where I come from): I
somehow wanted a list of attributes on the class level. More for
readability than anything else, really.

> hope this helps!

Yup, did the trick. Thanks!
robert
 
satoru

> The reason I did it was sort of C++ish (that's where I come from): I
> somehow wanted a list of attributes on the class level. More for
> readability than anything else, really.
>
> Yup, did the trick. Thanks!
> robert

Yes, I thought your code was kind of static, so it didn't work in a
dynamic language like Python.
In Python, you don't have to say "static" to make a variable a class
variable, so the "name" and "sample" you kind of "declared" are indeed
class variables.
You may wonder why the two instances of "Channel" then have different
names. That's because you assign to name in "__init__", which creates an
instance variable that shares the name "name" with the class variable.
As for "sample", it never gets assigned to, so when you call "append" the
class variable is changed in place.
Hope my explanation helps.
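
The lookup rule described here can be checked directly. A small sketch, assuming Python 3, with the class trimmed to the two attributes under discussion; `vars()` shows what each instance actually stores:

```python
class Channel:
    name = ''      # class attribute
    sample = []    # class attribute: one list shared by all instances

    def __init__(self, name):
        self.name = name  # assignment creates an instance attribute


a = Channel('A')
b = Channel('B')

# Only "name" lives on the instance; "sample" is found on the class.
print(vars(a))                     # {'name': 'A'}
print(a.sample is b.sample)        # True
print(a.sample is Channel.sample)  # True

# In-place mutation goes through to the shared class-level list.
a.sample.append((1, 1.1))
print(b.sample)                    # [(1, 1.1)]

# Assignment rebinds on the instance only and shadows the class attribute.
a.sample = []
print(vars(a))                     # {'name': 'A', 'sample': []}
print(Channel.sample)              # [(1, 1.1)]
```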
 
Robert Latest

satoru said:
> As for "sample", it never gets assigned to, so when you call "append" the
> class variable is changed in place.
> Hope my explanation helps.

Sure does, thanks a lot.

Here's an interesting side note: After fixing my "Channel" thingy the
whole project behaved as expected. But there was an interesting hitch.
The main part revolves around another class, "Sequence", which has a
list of Channels as attribute. I was curious about the performance of my
script, because eventually this construct is supposed to handle
megabytes of data. So I wrote a simple loop that creates a new Sequence,
fills all the Channels with data, and repeats.

Interestingly, the first couple of dozen iterations went satisfactorily
quickly (took about 1 second total), but after a hundred or so times it
got really slow -- like a couple of seconds per iteration.

Playing around with the code, not really knowing what to do, I found
that in the "Sequence" class I had again erroneously declared a class-level
attribute -- rather harmlessly, just a string, that got assigned to once in each
iteration on object creation.

After I had deleted that, the loop went blindingly fast without slowing
down.

What are the mechanics behind this behavior?

robert
 
Steven D'Aprano

> Here's an interesting side note: After fixing my "Channel" thingy the
> whole project behaved as expected. But there was an interesting hitch.
> The main part revolves around another class, "Sequence", which has a
> list of Channels as attribute. I was curious about the performance of my
> script, because eventually this construct is supposed to handle
> megabytes of data. So I wrote a simple loop that creates a new Sequence,
> fills all the Channels with data, and repeats.
>
> Interestingly, the first couple of dozen iterations went satisfactorily
> quickly (took about 1 second total), but after a hundred or so times it
> got really slow -- like a couple of seconds per iteration.
>
> Playing around with the code, not really knowing what to do, I found
> that in the "Sequence" class I had again erroneously declared a
> class-level attribute -- rather harmlessly, just a string, that got
> assigned to once in each iteration on object creation.
>
> After I had deleted that, the loop went blindingly fast without slowing
> down.
>
> What are the mechanics behind this behavior?

Without actually seeing the code, it's difficult to be sure, but my guess
is that you were accidentally doing repeated string concatenation. This
can be very slow.

In general, anything that looks like this:

s = ''
for i in range(10000): # or any big number
    s = s + 'another string'

can be slow. Very slow. The preferred way is to build a list of
substrings, then put them together in one go.

L = []
for i in range(10000):
    L.append('another string')
s = ''.join(L)


It's harder to stumble across the slow behaviour these days, as Python
2.4 introduced an optimization that, under some circumstances, makes
string concatenation almost as fast as using join(). But be warned: join()
is still the recommended approach. Don't count on this optimization to
save you from slow code.

If you want to see just how slow repeated concatenation is compared to
joining, try this:

>>> import timeit
>>> t1 = timeit.Timer('for i in xrange(1000): x=x+str(i)+"a"', 'x=""')
>>> t2 = timeit.Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
>>>
>>> t1.repeat(number=30)
[0.8506159782409668, 0.80239105224609375, 0.73254203796386719]
>>> t2.repeat(number=30)
[0.052678108215332031, 0.052067995071411133, 0.052803993225097656]

Concatenation is more than ten times slower in the example above, but it
gets worse:

>>> t1.repeat(number=40)
[1.5138671398162842, 1.5060651302337646, 1.5035550594329834]
>>> t2.repeat(number=40)
[0.072292804718017578, 0.070636987686157227, 0.070624113082885742]

And even worse:

>>> t1.repeat(number=50)
[2.7190279960632324, 2.6910948753356934, 2.7089321613311768]
>>> t2.repeat(number=50)
[0.087616920471191406, 0.088094949722290039, 0.087819099426269531]
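
The measurements above rely on Python 2's `xrange`. For readers on Python 3, a rough equivalent might look like the sketch below. The function names are made up for illustration, and no timings are shown because they depend heavily on the interpreter and its version. One difference worth noting: in `t1` above, `x` is created once in the setup and keeps growing across all runs, whereas each call here starts from an empty string.

```python
import timeit


def build_by_concat(n):
    """Build the string by repeated concatenation and rebinding."""
    x = ""
    for i in range(n):
        x = x + str(i) + "a"
    return x


def build_by_join(n):
    """Build the same string with a single join over the pieces."""
    return "".join(str(i) + "a" for i in range(n))


if __name__ == "__main__":
    for func in (build_by_concat, build_by_join):
        # three runs of 30 calls each, mirroring repeat(number=30) above
        times = timeit.repeat(lambda: func(1000), number=30, repeat=3)
        print(func.__name__, times)
```

Both functions return the same string, so only the running time differs.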
 
Bruno Desthuilliers

Steven D'Aprano a écrit :
> Without actually seeing the code, it's difficult to be sure, but my guess
> is that you were accidentally doing repeated string concatenation. This
> can be very slow.
>
> In general, anything that looks like this:
>
> s = ''
> for i in range(10000): # or any big number
>     s = s + 'another string'
>
> can be slow. Very slow.

But this is way faster:

s = ''
for i in range(10000): # or any big number
    s += 'another string'


(snip)
> It's harder to stumble across the slow behaviour these days, as Python
> 2.4 introduced an optimization that, under some circumstances, makes
> string concatenation almost as fast as using join().

yeps : using augmented assignment (s += some_string) instead of
concatenation and rebinding (s = s + some_string).

> But be warned: join()
> is still the recommended approach. Don't count on this optimization to
> save you from slow code.
>
> If you want to see just how slow repeated concatenation is compared to
> joining, try this:
>
> import timeit
> t1 = timeit.Timer('for i in xrange(1000): x=x+str(i)+"a"', 'x=""')
> t2 = timeit.Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
>
> t1.repeat(number=30)
> [0.8506159782409668, 0.80239105224609375, 0.73254203796386719]
> t2.repeat(number=30)
> [0.052678108215332031, 0.052067995071411133, 0.052803993225097656]
>
> Concatenation is more than ten times slower in the example above,

Not when using augmented assignment:
>>> from timeit import Timer
>>> t1 = Timer('for i in xrange(1000): x+= str(i)+"a"', 'x=""')
>>> t2 = Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
>>> t1.repeat(number=30)
[0.07472991943359375, 0.064207077026367188, 0.064996957778930664]
>>> t2.repeat(number=30)
[0.071865081787109375, 0.061071872711181641, 0.06132817268371582]

(snip)
> And even worse:
> t1.repeat(number=50)
> [2.7190279960632324, 2.6910948753356934, 2.7089321613311768]
> t2.repeat(number=50)
> [0.087616920471191406, 0.088094949722290039, 0.087819099426269531]

Not that much worse here:
>>> t1.repeat(number=50)
[0.12305188179016113, 0.10764503479003906, 0.10605692863464355]
>>> t2.repeat(number=50)
[0.11200308799743652, 0.10315108299255371, 0.10278487205505371]
>>>

I'd still advise using the sep.join(seq) approach, but not because of
performance.
 
Steven D'Aprano

> But this is way faster:
>
> s = ''
> for i in range(10000): # or any big number
>     s += 'another string'

Actually, no, for two reasons:

(1) The optimizer works with both s = s+t and s += t, so your version is
no faster than mine.

(2) The optimization isn't part of the language. It only happens if you
are using CPython 2.4 or later, and even then it is not guaranteed.

People forget that CPython isn't the language, it's just one
implementation of the language, like Jython and IronPython. Relying on
the optimization is relying on an implementation-specific trick.

> yeps : using augmented assignment (s += some_string) instead of
> concatenation and rebinding (s = s + some_string).

Both are equally optimized.
>>> timeit.Timer('s+=t', 's,t="xy"').repeat(number=100000)
[0.027187108993530273, 0.026471138000488281, 0.027689933776855469]
>>> timeit.Timer('s=s+t', 's,t="xy"').repeat(number=100000)
[0.026300907135009766, 0.02638697624206543, 0.02637791633605957]

But here's a version without it:
[2.1038830280303955, 2.1027638912200928, 2.1031770706176758]
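
The statement that produced this last measurement is missing from the thread as archived. Purely as an illustration, and not necessarily what was run here, one pattern the CPython shortcut may not cover is prepending, where the growing string is the right-hand operand instead of the left. A Python 3 sketch with made-up function names:

```python
import timeit


def grow_by_appending(n, t="xy"):
    s = t
    for _ in range(n):
        s = s + t   # the growing string is the left operand
    return s


def grow_by_prepending(n, t="xy"):
    s = t
    for _ in range(n):
        s = t + s   # the growing string is the right operand
    return s


if __name__ == "__main__":
    for func in (grow_by_appending, grow_by_prepending):
        # one timed call each; results vary by interpreter and version
        print(func.__name__, timeit.timeit(lambda: func(10000), number=1))
```

Both functions return the same text; any difference in timing is an implementation detail, which is the point being made above.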
 
