Confounded by Python objects

boblatest · Jul 24, 2008

Hello group,

take a look at the code snippet below. What I want to do is initialize
two separate Channel objects and put some data in them. However,
although two different objects are in fact created (as can be seen
from the different names they spit out with the "diag()" method), the
data in the "sample" member is the same although I stick different
values in.

Thanks,
robert
(Sorry for Google Groups, but I don't have NNTP at work)

Here's the code:

#!/usr/bin/python

class Channel:
    name = ''
    sample = []

    def __init__(self, name):
        self.name = name

    def append(self, time, value):
        self.sample.append((time, value))
        self.diag()

    def diag(self):
        print (self.name, self.sample)

chA = Channel('A')
chB = Channel('B')

chA.append(1, 1.1)
chB.append(2, 2.1)
chA.append(3, 1.2)
chB.append(4, 2.2)

print 'Result:'

chA.diag()
chB.diag()

------------------------------------
and here's the output:

('A', [(1, 1.1000000000000001)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001)])
('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4,
2.2000000000000002)])
Result:
('A', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4,
2.2000000000000002)])
('B', [(1, 1.1000000000000001), (2, 2.1000000000000001), (3, 1.2), (4,
2.2000000000000002)])

What I'd like to see, however, is 2 tuples per Channel object, like
this:
('A', [(1, 1.1000000000000001)])
('B', [(2, 2.1000000000000001)])
('A', [(1, 1.1000000000000001), (3, 1.2)])
('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])
Result:
('A', [(1, 1.1000000000000001), (3, 1.2)])
('B', [(2, 2.1000000000000001), (4, 2.2000000000000002)])

Fredrik Lundh · Jul 24, 2008

> take a look at the code snippet below. What I want to do is initialize
> two separate Channel objects and put some data in them. However,
> although two different objects are in fact created (as can be seen
> from the different names they spit out with the "diag()" method), the
> data in the "sample" member is the same although I stick different
> values in.

that's because you only have one sample object -- the one owned by
the class object.

since you're modifying that object in place (via the append method),
your changes will be shared by all instances. python never copies
attributes when it creates an instance; if you want a fresh object,
you have to create it yourself.
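
To make the sharing visible, here is a minimal sketch written for Python 3 (the thread itself uses Python 2). The class names Shared and Separate are made up for illustration and do not appear in the original code.

```python
class Shared:
    sample = []              # one list, owned by the class object

    def __init__(self, name):
        self.name = name


class Separate:
    def __init__(self, name):
        self.name = name
        self.sample = []     # fresh list created for every instance


a, b = Shared('A'), Shared('B')
a.sample.append((1, 1.1))    # mutates the single class-level list in place
shared_is_same = a.sample is b.sample is Shared.sample

c, d = Separate('A'), Separate('B')
c.sample.append((1, 1.1))    # only touches c's own list
separate_is_same = c.sample is d.sample

print(shared_is_same, b.sample)     # True [(1, 1.1)]
print(separate_is_same, d.sample)   # False []
```

The "is" checks compare object identity, so they show directly that both Shared instances reach the very same list.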

tip: if you're not 100% sure why you would want to put an attribute
on the class level, don't do it.

class Channel:

    def __init__(self, name):
        self.name = name
        self.sample = [] # create fresh container for instance

    def append(self, time, value):
        self.sample.append((time, value))
        self.diag()

    def diag(self):
        print (self.name, self.sample)

hope this helps!

</F>

alex23 · Jul 24, 2008

> class Channel:
>     name = ''
>     sample = []
>
>     def __init__(self, name):
>         self.name = name
>
>     def append(self, time, value):
>         self.sample.append((time, value))
>         self.diag()
>
>     def diag(self):
>         print (self.name, self.sample)

Okay, the problem is you're appending to a _class_ attribute, not to
an instance attribute.

If you change your class definition to this, it should work:

class Channel:
    def __init__(self, name):
        self.name = name
        self.sample = []

That will provide each instance with its own version of the sample
attribute.

The 'self.name = name' in the __init__ of your original code binds a
new attribute to the instance, whereas 'self.sample.append(...)' in the
class's append method looks up 'sample', finds it on the class, and
appends to that class attribute instead.
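
One way to see where each attribute actually lives is to look at the namespaces with vars(). This is a small Python 3 sketch using the original class layout, trimmed to the parts that matter; the variable names instance_attrs and class_has_sample are only for illustration.

```python
class Channel:
    name = ''
    sample = []

    def __init__(self, name):
        self.name = name          # binds a new attribute on the instance


ch = Channel('A')
ch.sample.append((1, 1.1))        # lookup falls through to the class list

instance_attrs = sorted(vars(ch))             # what the instance itself holds
class_has_sample = 'sample' in vars(Channel)  # where 'sample' really lives

print(instance_attrs)        # ['name']
print(class_has_sample)      # True
print(repr(Channel.name))    # ''
print(Channel.sample)        # [(1, 1.1)]
```

The instance only ever gets a 'name' entry of its own; the appended tuple ends up in the list stored on the class.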

Hope this helps.

- alex23

Lawrence D'Oliveiro · Jul 24, 2008


> class Channel:
>     name = ''
>     sample = []

These are class variables, not instance variables. Take them out, and ...

>     def __init__(self, name):
>         self.name = name

... add this line to the above function

        self.sample = []

boblatest · Jul 24, 2008

> tip: if you're not 100% sure why you would want to put an attribute
> on the class level, don't do it.

The reason I did it was sort of C++ish (that's where I come from): I
somehow wanted a list of attributes on the class level. More for
readability than anything else, really.
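
If the appeal of class-level attributes is having them listed in one readable place, one option in current Python (3.7 or later, so not available when this thread was written) is the standard-library dataclasses module. The following is only a sketch of that alternative, not something anyone in the thread proposed; it assumes the same Channel fields as the original code.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    name: str = ''
    # default_factory calls list() once per instance, so nothing is shared
    sample: list = field(default_factory=list)

    def append(self, time, value):
        self.sample.append((time, value))


chA = Channel('A')
chB = Channel('B')
chA.append(1, 1.1)
chB.append(2, 2.1)
print(chA)   # Channel(name='A', sample=[(1, 1.1)])
print(chB)   # Channel(name='B', sample=[(2, 2.1)])
```

The attribute list still reads like a declaration at the top of the class, but each instance gets its own list.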

> hope this helps!

Yup, did the trick. Thanks!
robert

satoru · Jul 25, 2008

> The reason I did it was sort of C++ish (that's where I come from): I
> somehow wanted a list of attributes on the class level. More for
> readability than anything else, really.
>
> Yup, did the trick. Thanks!
> robert

Yes, I thought your code looked kind of static, so it didn't work in a
dynamic language like Python.
In Python, you don't have to say "static" to make a variable a class
variable, so the "name" and "sample" you kind of "declared" are indeed
class variables.
You may wonder why the two instances of "Channel" then have different
names. That's because you assign to name in "__init__", which makes it an
instance variable that shares the name "name" with the class variable.
As to "sample", it never gets assigned to, and when you say "append" the
class variable is changed in place.
Hope my explanation helps.
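
The shadowing described here can be made visible by deleting the instance attribute again. This is a short Python 3 sketch; the stripped-down Channel class and the variable names shadowed and revealed are illustrative only.

```python
class Channel:
    name = ''        # class attribute
    sample = []      # class attribute


ch = Channel()
ch.name = 'A'                 # assignment: creates an instance attribute
shadowed = ch.name            # 'A' (the instance attribute wins the lookup)
del ch.name                   # remove the instance attribute again
revealed = ch.name            # '' (the class attribute was there all along)

ch.sample.append(1)           # no assignment: the class list is modified
other = Channel()
print(repr(shadowed), repr(revealed), other.sample)   # 'A' '' [1]
```

Assignment through the instance never touches the class attribute, while an in-place method call such as append never creates an instance attribute.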

Robert Latest · Jul 26, 2008

satoru said:
> As to "sample", it never gets assigned to, and when you say "append" the
> class variable is changed in place.
> Hope my explanation helps.

Sure does, thanks a lot.

Here's an interesting side note: After fixing my "Channel" thingy the
whole project behaved as expected. But there was an interesting hitch.
The main part revolves around another class, "Sequence", which has a
list of Channels as attribute. I was curious about the performance of my
script, because eventually this construct is supposed to handle
megabytes of data. So I wrote a simple loop that creates a new Sequence,
fills all the Channels with data, and repeats.

Interestingly, the first couple of dozen iterations went satisfactorily
quickly (took about 1 second total), but after a hundred or so iterations
it got really slow -- like a couple of seconds per iteration.

Playing around with the code, not really knowing what to do, I found
that in the "Sequence" class I had again erroneously declared a class-level
attribute -- rather harmlessly, just a string, that got assigned to once in each
iteration on object creation.

After I had deleted that, the loop went blindingly fast without slowing
down.

What are the mechanics behind this behavior?

robert

Steven D'Aprano · Jul 27, 2008

> Here's an interesting side note: After fixing my "Channel" thingy the
> whole project behaved as expected. But there was an interesting hitch.
> The main part revolves around another class, "Sequence", which has a
> list of Channels as attribute. I was curious about the performance of my
> script, because eventually this construct is supposed to handle
> megabytes of data. So I wrote a simple loop that creates a new Sequence,
> fills all the Channels with data, and repeats.
>
> Interestingly, the first couple of dozen iterations went satisfactorily
> quickly (took about 1 second total), but after a hundred or so iterations
> it got really slow -- like a couple of seconds per iteration.
>
> Playing around with the code, not really knowing what to do, I found
> that in the "Sequence" class I had again erroneously declared a
> class-level attribute -- rather harmlessly, just a string, that got
> assigned to once in each iteration on object creation.
>
> After I had deleted that, the loop went blindingly fast without slowing
> down.
>
> What are the mechanics behind this behavior?

Without actually seeing the code, it's difficult to be sure, but my guess
is that you were accidentally doing repeated string concatenation. This
can be very slow.

In general, anything that looks like this:

s = ''
for i in range(10000): # or any big number
    s = s + 'another string'

can be slow. Very slow. The preferred way is to build a list of
substrings, then put them together in one go.

L = []
for i in range(10000):
    L.append('another string')
s = ''.join(L)

It's harder to stumble across the slow behaviour these days, as Python
2.4 introduced an optimization that, under some circumstances, makes
string concatenation almost as fast as using join(). But be warned: join()
is still the recommended approach. Don't count on this optimization to
save you from slow code.
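
As for why the naive loop can be slow: strings are immutable, so without the optimization every s = s + t has to build a new string and copy everything accumulated so far. The following is a rough cost model in Python 3 that counts characters copied instead of measuring time. The two helper functions are made up for illustration, and real interpreters differ in the details.

```python
def chars_copied_by_concat(pieces):
    """Characters copied if every s = s + p builds a brand-new string."""
    total = 0
    length = 0
    for p in pieces:
        length += len(p)      # the new string holds the old contents plus p
        total += length       # and all of it is copied into the new string
    return total


def chars_copied_by_join(pieces):
    """Characters copied by ''.join(pieces): each piece is copied once."""
    return sum(len(p) for p in pieces)


pieces = ['another string'] * 10000
print(chars_copied_by_concat(pieces))   # 700070000
print(chars_copied_by_join(pieces))     # 140000
```

In this model the work for concatenation grows with the square of the number of pieces, while join grows linearly.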

If you want to see just how slow repeated concatenation is compared to
joining, try this:

>>> import timeit
>>> t1 = timeit.Timer('for i in xrange(1000): x=x+str(i)+"a"', 'x=""')
>>> t2 = timeit.Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
>>> t1.repeat(number=30)
[0.8506159782409668, 0.80239105224609375, 0.73254203796386719]
>>> t2.repeat(number=30)
[0.052678108215332031, 0.052067995071411133, 0.052803993225097656]

Concatenation is more than ten times slower in the example above, but it
gets worse:

>>> t1.repeat(number=40)
[1.5138671398162842, 1.5060651302337646, 1.5035550594329834]
>>> t2.repeat(number=40)
[0.072292804718017578, 0.070636987686157227, 0.070624113082885742]

And even worse:

>>> t1.repeat(number=50)
[2.7190279960632324, 2.6910948753356934, 2.7089321613311768]
>>> t2.repeat(number=50)
[0.087616920471191406, 0.088094949722290039, 0.087819099426269531]

Bruno Desthuilliers · Jul 27, 2008

Steven D'Aprano wrote:

> Without actually seeing the code, it's difficult to be sure, but my guess
> is that you were accidentally doing repeated string concatenation. This
> can be very slow.
>
> In general, anything that looks like this:
>
> s = ''
> for i in range(10000): # or any big number
>     s = s + 'another string'
>
> can be slow. Very slow.

But this is way faster:

s = ''
for i in range(10000): # or any big number
    s += 'another string'

(snip)

> It's harder to stumble across the slow behaviour these days, as Python
> 2.4 introduced an optimization that, under some circumstances, makes
> string concatenation almost as fast as using join().

yeps: using augmented assignment (s += some_string) instead of
concatenation and rebinding (s = s + some_string).

> But be warned: join()
> is still the recommended approach. Don't count on this optimization to
> save you from slow code.
>
> If you want to see just how slow repeated concatenation is compared to
> joining, try this:
>
> >>> import timeit
> >>> t1 = timeit.Timer('for i in xrange(1000): x=x+str(i)+"a"', 'x=""')
> >>> t2 = timeit.Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
> >>> t1.repeat(number=30)
> [0.8506159782409668, 0.80239105224609375, 0.73254203796386719]
> >>> t2.repeat(number=30)
> [0.052678108215332031, 0.052067995071411133, 0.052803993225097656]
>
> Concatenation is more than ten times slower in the example above,

Not when using augmented assignment:

>>> from timeit import Timer
>>> t1 = Timer('for i in xrange(1000): x+= str(i)+"a"', 'x=""')
>>> t2 = Timer('"".join(str(i)+"a" for i in xrange(1000))', '')
>>> t1.repeat(number=30)
[0.07472991943359375, 0.064207077026367188, 0.064996957778930664]
>>> t2.repeat(number=30)
[0.071865081787109375, 0.061071872711181641, 0.06132817268371582]

(snip)

> And even worse:
>
> >>> t1.repeat(number=50)
> [2.7190279960632324, 2.6910948753356934, 2.7089321613311768]
> >>> t2.repeat(number=50)
> [0.087616920471191406, 0.088094949722290039, 0.087819099426269531]

Not that much worse here:

>>> t1.repeat(number=50)
[0.12305188179016113, 0.10764503479003906, 0.10605692863464355]
>>> t2.repeat(number=50)
[0.11200308799743652, 0.10315108299255371, 0.10278487205505371]

I'd still advise using the sep.join(seq) approach, but not because of
performance.

Steven D'Aprano · Jul 27, 2008

> But this is way faster:
>
> s = ''
> for i in range(10000): # or any big number
>     s += 'another string'

Actually, no, for two reasons:

(1) The optimizer works with both s = s+t and s += t, so your version is
no faster than mine.

(2) The optimization isn't part of the language. It only happens if you
are using CPython 2.4 or later, and even then it is not guaranteed.

People forget that CPython isn't the language, it's just one
implementation of the language, like Jython and IronPython. Relying on
the optimization is relying on an implementation-specific trick.

> yeps: using augmented assignment (s += some_string) instead of
> concatenation and rebinding (s = s + some_string).

Both are equally optimized.

>>> timeit.Timer('s+=t', 's,t="xy"').repeat(number=100000)
[0.027187108993530273, 0.026471138000488281, 0.027689933776855469]
>>> timeit.Timer('s=s+t', 's,t="xy"').repeat(number=100000)
[0.026300907135009766, 0.02638697624206543, 0.02637791633605957]

But here's a version without it:
[2.1038830280303955, 2.1027638912200928, 2.1031770706176758]
