Best way to create a copy of a list

Frank Millman

Hi all

Assume a 2-dimensional list called 'table' - conceptually think of it
as rows and columns.

Assume I want to create a temporary copy of a row called 'row',
allowing me to modify the contents of 'row' without modifying the
contents of 'table'.

I used to fall into the newbie trap of 'row = table[23]', but I have
learned my lesson by now - changing 'row' also changes 'table'.

I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
    row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of?
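
For what it's worth, there is one distinction between the two forms: form 1 binds the name to a brand-new list, while form 2 overwrites the contents of whatever list 'row' already refers to, so any other name bound to that list sees the change as well. A minimal sketch, assuming a small made-up table with index 1 standing in for 23:

```python
# Hypothetical 3x3 table; index 1 stands in for row 23 of the post.
table = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Form 1: slicing builds a new list and binds the name to it.
row1 = table[1][:]
row1[0] = 99                  # table is untouched

# Form 2: slice assignment replaces the contents of an existing list.
row2 = []
alias = row2                  # a second name for the same list object
row2[:] = table[1]
row2[0] = -1                  # table is untouched here too

print(table[1])               # [4, 5, 6]
print(alias is row2, alias)   # True [-1, 5, 6]
```

Either way 'table' is safe. The difference only matters if something else holds a reference to the old 'row'.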

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

TIA

Frank Millman
 
Rune Strand

Frank said:
Hi all

Assume a 2-dimensional list called 'table' - conceptually think of it
as rows and columns.

Assume I want to create a temporary copy of a row called 'row',
allowing me to modify the contents of 'row' without modifying the
contents of 'table'.

I used to fall into the newbie trap of 'row = table[23]', but I have
learned my lesson by now - changing 'row' also changes 'table'.

I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of.

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.


you could use list()

row = list(table[23])

The effect is the same, but it's nicer to read.
See also the copy module.
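
A rough sketch of what the copy module offers, using made-up data. For a row of plain values a shallow copy such as list(...) is enough; the deep copy only matters when the copied list itself contains mutable objects, as the whole table does.

```python
import copy

table = [[1, 2], [3, 4]]          # made-up data

shallow = copy.copy(table)        # new outer list, same inner row objects
deep = copy.deepcopy(table)       # new outer list and new inner rows

shallow[0][0] = 99                # visible through table: the row is shared
deep[1][1] = -1                   # not visible through table

row = list(table[1])              # a flat row of ints: shallow copy is enough
row[0] = 0

print(table)                      # [[99, 2], [3, 4]]
```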
 
Fredrik Lundh

Frank said:
I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of.

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

quite a bit ? maybe if you're using very short rows, and all rows
have the same length, but hardly in the general case:

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 5.35 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 4.81 usec per loop

(for constant-length rows, the "row[:]=" form saves one memory
allocation, since the target list can be reused as is. for longer rows,
other things seem to dominate)
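
For anyone repeating the comparison on a current Python 3, here is a sketch using the timeit module from a script rather than the command line. The function names and the loop count are my own choices. Note that range() no longer returns a list, so the rows are built explicitly, and that wrapping each statement in a function adds call overhead to every measurement, so the absolute numbers will not match the ones above.

```python
import timeit

# Python 3: range() is lazy, so build real lists for the rows.
# Unlike [row] * 100, this creates 100 independent row objects.
data = [list(range(100)) for _ in range(100)]
row = []
LOOPS = 10_000                 # arbitrary; raise it for steadier numbers

def slice_assign():
    row[:] = data[23]          # reuse the existing list object

def slice_copy():
    return data[23][:]         # allocate a new list each time

def list_call():
    return list(data[23])      # allocate a new list via the constructor

for func in (slice_assign, slice_copy, list_call):
    # the minimum of several repeats is less sensitive to background noise
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    print(f"{func.__name__}: {best / LOOPS * 1e6:.3f} usec per loop")
```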

</F>
 
Frank Millman

Fredrik said:
Frank said:
I have found two ways of doing it that seem to work.

1 - row = table[23][:]

2 - row = []
row[:] = table[23]

Are these effectively identical, or is there a subtle distinction which
I should be aware of.

I did some timing tests, and 2 is quite a bit faster if 'row'
pre-exists and I just measure the second statement.

quite a bit ? maybe if you're using very short rows, and all rows
have the same length, but hardly in the general case:

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 5.35 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 4.81 usec per loop

(for constant-length rows, the "row[:]=" form saves one memory
allocation, since the target list can be reused as is. for longer rows,
other things seem to dominate)

</F>

Interesting. My results are opposite.

python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
100000 loops, best of 3: 2.57 usec per loop

python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 2.89 usec per loop

For good measure, I tried Rune's suggestion -

python -mtimeit -s "data=[range(100)]*100" "row = list(data[23])"
100000 loops, best of 3: 3.69 usec per loop

For practical purposes these differences are immaterial - I do not
anticipate huge quantities of data.

If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

Thanks

Frank
 
Jorge Godoy

Frank Millman said:
Interesting. My results are opposite.

I got the same here (CPython 2.4.1):

godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100; row = []" "row[:] = data[23]"
1000000 loops, best of 3: 1.15 usec per loop
godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100" "row = data[23][:]"
100000 loops, best of 3: 1.42 usec per loop
godoy@jupiter ~ % python -mtimeit -s "data=[range(100)]*100" "row = list(data[23])"
100000 loops, best of 3: 1.93 usec per loop

If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I prefer the third option for readability. It makes it clear that I'll get a
*new* list with the 23rd row of data. Just think: how would you get the 1st
column of the 23rd row?
>>> a = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
>>> a
[[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]]
>>> a[1]
[2, 3]
>>> a[1][1]
3
>>> a[1][:]
[2, 3]

Someone might think that the "[:]" means "all columns" and the syntax to be
equivalent to "data[23]".


--
Jorge Godoy

"Quidquid latine dictum sit, altum sonatur."
- Qualquer coisa dita em latim soa profundo.
- Anything said in Latin sounds smart.
 
Alex Martelli

Frank Millman said:
If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data -- because in
Numeric, for example, it doesn't, it returns an array that *shares* data
with xxx. So, the [:] notation sometimes copies and sometimes does not,
while list(...) always copies -- if I want to ensure that a copy does
happen, then list(...) is the more obvious and readable choice.
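
Numeric is not in the standard library, so here is a sketch of the same point using the built-in memoryview as a stand-in (a substitution made for illustration, not the Numeric case itself). Slicing it with [:] yields another view onto the same buffer, whereas list(...) copies the items out.

```python
buf = bytearray(b"abc")
view = memoryview(buf)

same_data = view[:]        # slicing a memoryview gives another view, no copy
copied = list(view)        # list() always builds a new list: [97, 98, 99]

same_data[0] = ord("X")    # writes through to the underlying bytearray

print(buf)                 # bytearray(b'Xbc')
print(copied)              # [97, 98, 99]
```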


Alex
 
Paul Rubin

I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data

Heh, it wasn't obvious that list(xxx) copies data either (I thought of
it as being like a typecast), but I just checked, and it does copy.
I'll have to remember to do it like that. I do like it better than
xxx[:] which is what I'd been using because I remember seeing that the
copy module does it that way.
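
The check is easy to repeat. A minimal sketch, assuming nothing beyond the built-in list type:

```python
original = [1, 2, 3]
converted = list(original)         # already a list, but a new object is built

print(converted == original)       # True: same items
print(converted is original)       # False: distinct list objects

converted.append(4)
print(original)                    # [1, 2, 3]
```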
 
Frank Millman

Alex said:
Frank Millman said:
If they are all equivalent from a functional point of view, I lean
towards the second version. I agree with Rune that the third one is
nicer to read, but somehow the [:] syntax makes it a bit more obvious
what is going on.

I vastly prefer to call list(xxx) in order to obtain a new list with the
same items as xxx -- couldn't be more obvious than that.

You can't claim it's obvious that xxx[:] *copies* data -- because in
Numeric, for example, it doesn't, it returns an array that *shares* data
with xxx. So, the [:] notation sometimes copies and sometimes does not,
while list(...) always copies -- if I want to ensure that a copy does
happen, then list(...) is the more obvious and readable choice.


Alex

Thanks very much for the detailed explanation.

Frank
 
