Creating a List of Empty Lists


Fuzzyman

Python's internal 'pointers' system is certainly causing me a few
headaches..... When I want to copy the contents of a variable I find
it impossible to know whether I've copied the contents *or* just
created a new pointer to the original value....

For example I wanted to initialize a list of empty lists....

a=[ [], [], [], [], [] ]

I thought there has to be a *really* easy way of doing it - after a
bit of hacking I discovered that :
a = [[]]*10 appeared to work

however - using it in my program caused bizarre crashes....
I eventually discovered that (as a silly example) :
a = [[]]*10
b=-1
while b < 10:
    b += 1
    a[b] = b
print a

Produced :
[ [9], [9], [9]......

Which isn't at all what I intended...............
What is the correct, quick way of doing this (without using a loop and
appending...) ?

Fuzzyman




Duncan Booth

Fuzzyman wrote:
I eventually discovered that (as a silly example) :
a = [[]]*10
b=-1
while b < 10:
    b += 1
    a[b] = b
print a

Produced :
[ [9], [9], [9]......

Which isn't at all what I intended...............
What is the correct, quick way of doing this (without using a loop and
appending...) ?


The recommended way these days is usually:

a = [ [] for i in range(10) ]

That still has a loop and works by appending empty lists, but at least it's
just a single expression. Also you can incorporate the next stage of your
initialisation quite easily:

a = [ [b] for b in range(10) ]
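
A minimal sketch of the comprehension approach, written for Python 3 (the thread itself uses Python 2 syntax). The appended value "x" is only there to show that the inner lists are separate objects:

```python
# Each iteration evaluates [] afresh, so the inner lists are distinct objects.
a = [[] for i in range(10)]
a[0].append("x")
print(a[:3])   # [['x'], [], []]

# The next initialisation step folded into the same expression.
a = [[b] for b in range(10)]
print(a[:3])   # [[0], [1], [2]]
```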
 

Daniel Dittmar

Fuzzyman said:
Python's internal 'pointers' system is certainly causing me a few
headaches..... When I want to copy the contents of a variable I find
it impossible to know whether I've copied the contents *or* just
created a new pointer to the original value....

For example I wanted to initialize a list of empty lists....

a=[ [], [], [], [], [] ] [...]
What is the correct, quick way of doing this (without using a loop and
appending...) ?
>>> l = [ [] for i in xrange (3)]
>>> l
[[], [], []]
>>> l [0].append ('a')
>>> l
[['a'], [], []]

Daniel
 

Anton Vredegoor

Python's internal 'pointers' system is certainly causing me a few
headaches..... When I want to copy the contents of a variable I find
it impossible to know whether I've copied the contents *or* just
created a new pointer to the original value....

For example I wanted to initialize a list of empty lists....

a=[ [], [], [], [], [] ]

I thought there has to be a *really* easy way of doing it - after a
bit of hacking I discovered that :
a = [[]]*10 appeared to work

however - using it in my program caused bizarre crashes....
I eventually discovered that (as a silly example) :
a = [[]]*10
b=-1
while b < 10:
    b += 1
    a[b] = b
print a

Produced :
[ [9], [9], [9]......

Which isn't at all what I intended...............
What is the correct, quick way of doing this (without using a loop and
appending...) ?


Here it produced an IndexError. After changing "b < 10" into "b < 9"
the code produced:

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

I see some other posters already gave you the answer. I'll do
something different and give you the question :)

n = 4
agents = [[]]*n
print agents
agents[0].append('Smith')
print agents
neos = map(list,[[]]*n)
print neos
neos[0].append('Neo')
print neos

output is:

[[], [], [], []]
[['Smith'], ['Smith'], ['Smith'], ['Smith']]
[[], [], [], []]
[['Neo'], [], [], []]

The question is:

Why is "Smith" copied to all elements in the matrix?

(or is that another movie :)

Anton
 

Samuel Tardieu

Anton> Why is "Smith" copied to all elements in the matrix?

:)

The construct

[[]] * n

gives you a list with n references to the same list. When you modify
one of the elements, all the references will see the changes.
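
One way to see the sharing directly is to count distinct object identities. This is only a Python 3 sketch; the names `agents` and `neos` are borrowed from Anton's example:

```python
n = 4
agents = [[]] * n

# All n slots hold the very same object, so there is one distinct id.
print(len({id(item) for item in agents}))          # 1
print(all(item is agents[0] for item in agents))   # True

# A comprehension builds a new list per slot, so there are n distinct ids.
neos = [[] for _ in range(n)]
print(len({id(item) for item in neos}))            # 4
```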

You can use one of the three (there are more)

([[]]*n)[:]
[[] for _ in range(n)]
map (lambda _: [], range(n))

to get n different copies of [].

Sam
 

Erik Max Francis

Samuel said:
You can use one of the three (there are more)

([[]]*n)[:]

This won't work. [:] makes a shallow copy. This will make a different
containing list, but the contained lists will still be identical:
>>> s = [[]]*10
>>> s = [[]]*5
>>> t = s[:]
>>> id(s)
1076779980
>>> id(t)
1076780300
>>> map(lambda (x, y): x is y, zip(s, t))
[True, True, True, True, True]
>>> s[0].append(2)
>>> s
[[2], [2], [2], [2], [2]]
>>> t
[[2], [2], [2], [2], [2]]
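
A Python 3 sketch comparing a few ways of copying such a list. The names `shallow`, `rebuilt` and `deep` are made up for illustration. Note that copy.deepcopy keeps track of objects it has already copied, so the aliasing between the slots survives inside the copy:

```python
import copy

s = [[]] * 3
shallow = s[:]                          # new outer list, same single inner list
rebuilt = [list(inner) for inner in s]  # a fresh inner list per slot
deep = copy.deepcopy(s)                 # detached from s, but its three slots
                                        # still share one (new) inner list

s[0].append(2)
print(shallow)   # [[2], [2], [2]]
print(rebuilt)   # [[], [], []]
print(deep)      # [[], [], []]

deep[0].append(9)
print(deep)      # [[9], [9], [9]]
rebuilt[0].append(9)
print(rebuilt)   # [[9], [], []]
```

Only the rebuilt list ends up with three independent inner lists.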


Francis Avila

Fuzzyman wrote in message
Python's internal 'pointers' system is certainly causing me a few
headaches..... When I want to copy the contents of a variable I find
it impossible to know whether I've copied the contents *or* just
created a new pointer to the original value....


You don't need to divine the rules for copy vs. reference: Python NEVER
copies values, ONLY dereferences names, unless you EXPLICITLY ask for a copy
(which Python-the-language doesn't have any special machinery for; you have
to use the copy module or figure out how to copy the object yourself.)

It would help if you stopped thinking in terms of variables and pointers (as
in C) and thought instead in terms of names and objects. Python doesn't
have variables in the C sense, where the variable name stands for its value
ONLY up to the compilation stage, at which time the variable names cease to
exist. Rather, in Python, a "variable" is a name which points to an object.
This behavior is implemented as a key:value pair in a dictionary (i.e.,
__dict__), where the key is a string holding the name and the value is the
object itself. A Python "pointer" is a sort of indexed name, like C arrays;
hence the square-bracket syntax. However, even though the name is strictly
speaking unnamed (i.e., there is no corresponding string object in the
namespace dictionary), yet it is still purely a referent to a real object,
and not a real object itself.

Another way to interpret "pointer" in a Pythonic framework is to say it's a
weak-reference: i.e., a reference to an object which does not increase that
object's reference count. However, there is no clear correspondence between
C's "variable/pointer" concepts and what Python does, only analogy.

When no names point to a given object, that object is a candidate for
garbage collection.

All these concepts are illustrated in the following two lines:
>>> [[]]*2
[[], []]

This means:

- Create an empty list.
- Create a list which references that empty list.
- Create an integer '2'. (This step may be optimized away: CPython pre-creates
small integer objects and reuses them. The intern() builtin does something
comparable, but only for strings.)
- Call the __mul__ method of the one-element list, with an argument of 2.
- The __mul__ method creates a new list, and inserts references to its own
elements into this new list--twice. References/names can *only* be copied
(by definition), not pointed to.
- __mul__ then returns the new list.
- This new list is not bound to any name, and so the object is impossible to
access. It is now a candidate for garbage collection.

So, in this one line, four objects were created and destroyed, not five.

The following behavior should now make sense:
>>> L = [[]]*2
>>> L
[[], []]
>>> L[0].append(1)
>>> L
[[1], [1]]

L[0] and L[1] are two different names, but the object they both point to has
been modified. However:
>>> L[0] = 1
>>> L
[1, [1]]

Here, the name L[0] was made to point to a new object, the integer 1.

The only mutable objects you usually have to worry about are dicts and
lists.
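
The difference between mutating an object and rebinding a name can be sketched like this (Python 3; the extra name `inner` is hypothetical, added only to keep a third reference to the shared list):

```python
L = [[]] * 2
inner = L[1]          # a third reference to the shared list

L[0].append(1)        # mutation: visible through every reference
print(L, inner)       # [[1], [1]] [1]

L[0] = 1              # rebinding: only the slot L[0] changes
print(L, inner)       # [1, [1]] [1]
print(L[1] is inner)  # True
```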
 

Fuzzyman

Francis Avila said:
Fuzzyman wrote in message



You don't need to divine the rules for copy vs. reference: Python NEVER
copies values, ONLY dereferences names, unless you EXPLICITLY ask for a copy
(which Python-the-language doesn't have any special machinery for; you have
to use the copy module or figure out how to copy the object yourself.)
[snip...] Interesting discussion reluctantly snipped.......
The only mutable objects you usually have to worry about are dicts and
lists.

Right - because if I do something like :

a = 4
b = a
a = 5
print b

It prints 4... not 5.
In other words - the line b = a creates a name pointing to the object
4, rather than a name pointing to the contents of a.....

I think I see what you mean - since the object 4 is immutable......
the line a = 5 destroys the old name a and creates a new one pointing
to object 5, rather than changing what the name a is pointing to.

Since lists and dictionaries are mutable..... changing the contents
modifies the object rather than destroying the reference to the old
one and creating a new one.....

Hmmmmmm... thanks.........

Fuzzy
 

Andrew Koenig

Right - because if I do something like :

a = 4
b = a
a = 5
print b

It prints 4... not 5.
In other words - the line b = a creates a name pointing to the object
4, rather than a name pointing to the contents of a.....

There's no difference between "the object 4" and "the contents of a", so the
"rather than" makes no sense in this context.

After executing

b = a

the names "a" and "b" refer to the same object.
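
A short Python 3 sketch of that point, reusing the names from the quoted example:

```python
a = 4
b = a
print(a is b)   # True: one object, two names

a = 5           # rebinds a only; b still refers to the object 4
print(b)        # 4
print(a is b)   # False
```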
 

r.e.s.

...
It's more because the `a` in `a=` is simply a name, and
the rule for assigning to a name is to forget the previous
binding (if any) and create a new binding.

[beginner here]

Ok, I thought I understood that, but then I was reading
about some special named objects being pre-defined
(e.g. `0` through `99`), so I tried the following (using
`is`, which I understand compares the identities of the
objects whose names are given to it):
>>> a = 100
>>> a is 100
False

.... so far so good, and then experimenting:
>>> a = []
>>> a is []
False

.... which, like the result for `100`, I *thought* was
explained like this:

When the interpreter encounters the `[]` on the RHS,
there is no object named `[]`, so it creates one and
binds it to the name `a` (forgetting the old binding
for `a`). Then on the second line, when the `[]` is
encountered, again there's no object with that name,
so a new one is created and found to be in a location
different than the one just named `a`.

But something's wrong with that explanation, because
of the following:
>>> a = 'gobble'
>>> a is 'gobble'
True

Surely `'gobble'` is not the name of an already-existing
object, so I expected exactly the same result as for
`100` and `[]`. What's going on there?

(`[]` and `'gobble'` differ with respect to mutability,
but how does that explain the different result (if it
does))?

Thanks.
 

David M. Cooke

r.e.s. said:
But something's wrong with that explanation, because
of the following:
>>> a = 'gobble'
>>> a is 'gobble'
True

Surely `'gobble'` is not the name of an already-existing
object, so I expected exactly the same result as for
`100` and `[]`. What's going on there?

Actually, your second use of 'gobble' *is* an already-existing object.
Python 'interns' string constants, i.e., reuses them. Check this out:
False

Here, the constants 'gobble', 'gob', and 'ble' are interned. Computed
strings are not looked up and replaced, so 'gob' + 'ble' is not
'gobble'.

Compare with (in a new interpreter!)
True

Here, we've explicitly interned the result of 'gob' + 'ble', so it is
the same as 'gobble'.

But, don't depend on interning. Use == to compare strings.
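
For readers on Python 3, where the intern() builtin lives on as sys.intern, a hedged sketch of the same idea follows. The string is joined at run time so that no compile-time optimization gets involved, and the variable names are made up for illustration:

```python
import sys

parts = ["gob", "ble"]
computed = "".join(parts)      # built at run time, not a literal
literal = "gobble"

# Equality compares values and is always the right test for strings.
print(computed == literal)     # True

# Interning maps equal strings to one canonical object.
print(sys.intern(computed) is sys.intern(literal))  # True
```

Whether `computed is literal` holds without the explicit interning is an implementation detail, which is exactly why == is the comparison to rely on.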
 

Francis Avila

Fuzzyman wrote in message
Right - because if I do something like :

a = 4
b = a
a = 5
print b

It prints 4... not 5.
In other words - the line b = a creates a name pointing to the object
4, rather than a name pointing to the contents of a.....


Correct, but it's not so much the contents of 'a', but the thing that 'a'
points to. Thinking of names as having a "contents" is bound to lead to
confusion later. In any case, as far as 'b' is concerned, there's nothing
in the universe besides itself and the object it points to. It doesn't know
anything about the existence of 'a'. Python sees what object the right side
of the assignment operator points to, and then tells the names on the left
side, "Hey, you! Point to this object!"

Names are not objects, and names point only to objects, not to names. This
doesn't mean that names aren't *things*: they certainly are, because objects
can contain names.

This is more natural than you might think, because you do it in language.
When you say "chair", the sound points to the concept "chair". The sound
isn't a chair, nor does it contain a chair, but when someone else hears the
sound, it evokes the same "chair" concept, and not the sound itself.
However, the name is still a *thing* (you can speak it, put it in a book,
etc.), even though its existence is transparent.

I think I see what you mean - since the object 4 is immutable......
the line a = 5 destroys the old name a and creates a new one pointing
to object 5, rather than changing what the name a is pointing to.


It doesn't create or destroy the old name: it just makes it point to
something else. Names are created by '=' if the name didn't exist before,
and destroyed by 'del' or by going out of scope.
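
A small Python 3 sketch of names being removed while the object lives on. The list contents are arbitrary:

```python
a = [1, 2, 3]
b = a          # a second name for the same list
del a          # removes the name a, not the object it pointed to
print(b)       # [1, 2, 3]

try:
    a
except NameError as exc:
    # The name no longer exists, although the list is still reachable via b.
    print("a is gone:", exc)
```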

Hmmmmmm... thanks.........


No problem. I think you've got the hang of things.
 

r.e.s.

r.e.s. said:
But something's wrong with that explanation, because
of the following:
>>> a = 'gobble'
>>> a is 'gobble'
True

Surely `'gobble'` is not the name of an already-existing
object, so I expected exactly the same result as for
`100` and `[]`. What's going on there?

Actually, your second use of 'gobble' *is* an already-existing object.
Python 'interns' string constants, i.e., reuses them.
<snip>

Thanks. That explains it.
 

Robin Munn

r.e.s. said:
r.e.s. said:
But something's wrong with that explanation, because
of the following:

>>> a = 'gobble'
>>> a is 'gobble'
True

Surely `'gobble'` is not the name of an already-existing
object, so I expected exactly the same result as for
`100` and `[]`. What's going on there?

Actually, your second use of 'gobble' *is* an already-existing object.
Python 'interns' string constants, i.e., reuses them.
<snip>

Thanks. That explains it.

But take note that this behavior is not guaranteed anywhere in the language
reference. As someone else said in this thread, don't count on interning.
Sometimes it will happen, and sometimes it won't:
False

The rule of thumb is that Python interns a string if it's likely to be used
as a name (i.e., only alphanumerics and underscores). The string 'foo bar'
has a space and would be an invalid name, so it wasn't interned.
 

Robin Becker

The recommended way these days is usually:

a = [ [] for i in range(10) ]

That still has a loop and works by appending empty lists, but at least it's
just a single expression. Also you can incorporate the next stage of your
initialisation quite easily:

a = [ [b] for b in range(10) ]

I seem to remember the fastest way to do this was map(list,n*[[]]) from
a couple of earlier threads, but is that true in 2.3?
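
The question is easy to measure rather than remember. The following is only a sketch for Python 3 using the standard timeit module; the helper names and the sizes are arbitrary choices, and the timings will vary by interpreter version and machine:

```python
import timeit

n = 1000

def with_map():
    # list(...) is needed in Python 3, where map() returns a lazy iterator.
    return list(map(list, [[]] * n))

def with_comprehension():
    return [[] for _ in range(n)]

for func in (with_map, with_comprehension):
    seconds = timeit.timeit(func, number=1000)
    print(f"{func.__name__}: {seconds:.4f} s")
```

Both helpers return n independent empty lists, so the comparison is like for like.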
 

Emile van Sebille

Robin Munn:
But take note that this behavior is not guaranteed anywhere in the language
reference. As someone else said in this thread, don't count on interning.
Sometimes it will happen, and sometimes it won't:

False

The rule of thumb is that Python interns a string if it's likely to be used
as a name (i.e., only alphanumerics and underscores). The string 'foo bar'
has a space and would be an invalid name, so it wasn't interned.

And don't think you can get around it using intern():

Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
False

That apparent space requirement should really be better documented.
True

From the docs on intern():
"Interning strings is useful to gain a little performance on
dictionary lookup "

and while it continues:
"...names used in Python programs are automatically interned, and
the dictionaries used to hold module, class or instance attributes
have interned keys"

It probably should specifically state that only valid identifiers will
be intern()'d and optionally return an error or warning otherwise.



Emile van Sebille

Skip Montanaro

Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
False

Emile> That apparent space requirement should really be better documented.

The fact that the current implementation of CPython automatically interns
strings which look like identifiers is simply an efficiency consideration.
It's not part of the language definition, so doesn't bear documenting. The
correct way to compare two strings is using '==' (which is independent of
CPython's implementation details), not 'is'.

Skip
 

Francis Avila

Skip Montanaro wrote in message ...
The
correct way to compare two strings is using '==' (which is independent of
CPython's implementation details), not 'is'.


I think it's better to say that the only time you *use* 'is' is when you
*know* the object you're comparing to is a singleton (None, True, or False).

Every other time, use '=='. And don't be deceived by 'is' working
sometimes.
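
As a rough illustration of that rule in Python 3 (the function `describe` is a made-up example, not anything from the thread):

```python
def describe(value=None):
    # An identity test is appropriate for the None singleton.
    if value is None:
        return "nothing given"
    # An equality test is appropriate for ordinary values such as strings.
    if value == "foo bar":
        return "matched by value"
    return "something else"

print(describe())                          # nothing given
print(describe(" ".join(["foo", "bar"])))  # matched by value
```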
 

Emile van Sebille

Skip Montanaro:
[ quoting me ]
Python 2.3 (#46, Jul 29 2003, 18:54:32) [MSC v.1200 32 bit (Intel)] on win32
False

Emile> That apparent space requirement should really be better documented.

The fact that the current implementation of CPython automatically interns
strings which look like identifiers is simply an efficiency consideration.
It's not part of the language definition, so doesn't bear documenting. The
correct way to compare two strings is using '==' (which is independent of
CPython's implementation details), not 'is'.

Skip

OK. But does intern() intern? I (thought I) only used is to show
that it wasn't intern()'d, and as the documentation holds intern'ing
up as an optimization technique, does it _only_ apply to strings that
look like identifiers? How else might you know?

Emile van Sebille

Skip Montanaro

Emile> But does intern() intern?

Yes, I believe it does:
1

Emile> does it _only_ apply to strings that look like identifiers?

Automatic interning, yes. At the point where the system has to decide
whether to automatically intern a string it knows nothing about the higher
level context in which the string appears. The optimization was added
because the strings which are manipulated most often by the interpreter
happen to be those which represent the program's identifiers (object
attributes, variables, etc). Consequently, the decision was made some time
ago to simply intern all strings which look like identifiers (as the snippet
above shows). It's a bit like driving a tack with a sledge hammer, but it
does improve interpreter performance.

Emile> How else might you know?

It doesn't really matter. The intern() function is available to programmers
who want to consciously optimize their string handling and is properly
documented. Automatic interning is not an optimization aimed at the
programmer, but at the internals of the interpreter. It's just a side
effect of the crude optimization that if you happen to use a string literal
in your program which looks like an identifier, it's interned
automatically. Look at it as a freebie. ;-)

Skip
 
