why are some types immutable?

T

Torsten Mohr

Hi,

reading the documentation (and also from a hint from this NG)
i know now that there are some types that are not mutable.

But why is it this way?

From an overhead point of view i think it is not optimal,
for example for a large string it could be much faster to
have it changed in place, not generating a new one for
every step in a change.


Thanks for any hints,
Torsten.
 
?

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Torsten said:
reading the documentation (and also from a hint from this NG)
i know now that there are some types that are not mutable.

But why is it this way?

There are various reasons, some apply for some types, and
some for others:
- immutable objects are hashable - their hash value will not
change during their life time (if it is a container, the
contained objects also need to be hashable). Mutable
types are typically not hashable.
A type needs to be hashable to act as a dictionary key.
- immutable objects can be shared, allowing the same
reference to be used in multiple places. For mutable
objects, one often needs to make a copy of the object
before storing it.
- immutable types can often be implemented more efficiently.
For examples, the length of a tuple is fixed, as tuples
are immutable. Therefore, memory management for tuples
is simpler (they only require one memory block, instead
of two, as lists do).
- for some types, the expectation of immutability is so
common that people would complain massively if they were
mutable. This, in particular, applies to numbers - people
expect that after
x = 7
the variable x keeps the value, and that the "7" object
cannot suddenly change its value to 8.
From an overhead point of view i think it is not optimal,
for example for a large string it could be much faster to
have it changed in place, not generating a new one for
every step in a change.

For strings, the following points apply; as a net result,
the overhead is *less* than it would be if strings were
mutable:
- strings must be hashable, so they can be used as
a dictionary key. If you would modify, say,
len.func_name, then newly imported code could not
find the len function anymore.
It would be possible to solve this by introducing
an additional immutable identifier type, but then
you would need to create copies of strings (to
create identifier object). Many applications might
break which currently use strings as dictionary
keys.
- strings are currently assumed to be shareable.
You pass strings to functions which then store
the strings in global variables, object attributes,
etc. In all these places, copies would need to be
made if strings were mutable, since the caller
might chose to modify the string afterwards.
- immutable strings consume less memory than
mutable strings would, because a string is made
up of just one memory block.

Regards,
Martin
 
R

Roy Smith

Torsten Mohr said:
reading the documentation (and also from a hint from this NG)
i know now that there are some types that are not mutable.

But why is it this way?

From an overhead point of view i think it is not optimal,
for example for a large string it could be much faster to
have it changed in place, not generating a new one for
every step in a change.

There has been a huge amount written about immutable types recently. A
search of the Google news archives should turn up tons of articles.

But, in a nutshell, the biggest reason for immutable types (tuples and
strings) is that this lets they be dictionary keys. If you want to know
why dictionary keys have to be immutable, do that google search; there
was a huge thread on exactly that subject just in the path month or two.

Making strings immutable has some good points (in addition to making
them usable as dictionary keys) and bad points. On the good side, it
lets you intern strings and share memory. For example:
3587520

when the interpreter saw the string "fred" in the second assignment
statement, it looked it up and discovered that it already had a string
with the value "fred", so it just re-used the same string (which is why
they both have the same id). This saves memory (which can become
important when you have a lot of strings). It also makes string
comparison more efficient, since a test for equality can often be
satisfied by just comparing id's.

The bad side is that, as you mentioned, things like:

s = ""
for c in getCharacters():
s = s + c

become very inefficient: O(n^2). Fortunately, there is an easy idiom to
avoid that quadratic behavior. You just accumulate the characters in a
list, O(n), and then convert the list to a string with join(), also
O(n). Two O(n) passes is a heck of a lot better than a single O(n^2).

l = []
for c in getCharacters():
l.append (c)
s = "".join (l)

Other types of in-place string manipulation could use the same trick.
 
R

Roy Smith

Martin v. Löwis said:
- for some types, the expectation of immutability is so
common that people would complain massively if they were
mutable. This, in particular, applies to numbers - people
expect that after
x = 7
the variable x keeps the value, and that the "7" object
cannot suddenly change its value to 8.

Believe it or not, in some early versions of Fortran, numbers were not
immutable! I forget the exact scenario, but if you did something like:

subroutine munge (i)
i = 3
return

and then in your main program did:

j = 7
call munge (7)
write (6, 11) j
11 format ('j = ', i6)

it would print 3! The problem is that numerical constants were interned
(i.e. in the main program, there was only a single 7 stored in memory,
and both uses of 7 referred to the same memory location), and the value
passed to the subroutine was call by reference. It was almost as if the
compiler let you say "7 = 3" as an assignment statement.

Needless to say, people did indeed complain massively.
 
D

Dan Bishop

Roy said:
There has been a huge amount written about immutable types recently. A
search of the Google news archives should turn up tons of articles.

But, in a nutshell, the biggest reason for immutable types (tuples and
strings) is that this lets they be dictionary keys. If you want to know
why dictionary keys have to be immutable,

More precisely, dictionary keys can't be mutable in any way that
affects the result of the hash function or the == or != operators.
 
R

Roy Smith

"Dan Bishop said:
More precisely, dictionary keys can't be mutable in any way that
affects the result of the hash function or the == or != operators.

Yes, this is true. In fact, I spent a fair amount of time in the
referenced thread arguing exactly that point. I was hoping people would
take my hint, read the thread, and find that out :)
 
F

Fredrik Lundh

Roy said:
But, in a nutshell, the biggest reason for immutable types (tuples and
strings) is that this lets they be dictionary keys.

if you think that's the biggest reason, you haven't spent enough time working
on high-performance Python code and extensions (there's a reason why some
languages adopt a "you can only assign to a variable once" approach).

</F>
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top