id functions of ints, floats and strings

zillow10 · Apr 3, 2008

Hi all,

I've been playing around with the identity function id() for different
types of objects, and I think I understand its behaviour when it comes
to objects like lists and tuples in which case an assignment r2 = r1
(r1 refers to an existing object) creates an alias r2 that refers to
the same object as r1. In this case id(r1) == id(r2) (or, if you
like: r1 is r2). However for r1, r2 assigned as follows: r1 = [1, 2,
3] and r2 = [1, 2, 3], (r1 is r2) is False, even if r1==r2,
etc. ...this is all very well. Therefore, it seems that id(r) can be
interpreted as the address of the object that 'r' refers to.
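
The alias-versus-copy distinction described above can be sketched as follows. This is an illustration in Python 3 syntax (an assumption; the thread itself uses Python 2), and the name r3 is introduced only for the example. In CPython id() does return the object's memory address, but that is an implementation detail rather than a language guarantee.

```python
# Python 3 sketch; r1 and r2 follow the post, r3 is a name added for illustration.
r1 = [1, 2, 3]
r2 = r1          # alias: a second name for the same list object
r3 = [1, 2, 3]   # a separate list that merely compares equal

print(r1 is r2, id(r1) == id(r2))   # True True
print(r1 == r3, r1 is r3)           # True False

# Mutating through one alias is visible through the other name.
r2.append(4)
print(r1)                           # [1, 2, 3, 4]
print(r3)                           # [1, 2, 3]
```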

My observations of its behaviour when comparing ints, floats and
strings have raised some questions in my mind, though. Consider the
following examples:

#########################################################################

# (1) turns out to be true
a = 10
b = 10
print a is b

# (2) turns out to be false
f = 10.0
g = 10.0
print f is g

# behaviour when a list or tuple contains the same elements ("same"
# meaning same type and value):

# define the following function, that checks if all the elements in an
# iterable object are equal:

def areAllElementsEqual(iterable):
    return reduce(lambda x, y: x == y and x, iterable) != False

# (3) checking if ids of all list elements are the same for different
# cases:

a = 3*[1]; areAllElementsEqual([id(i) for i in a]) # True
b = [1, 1, 1]; areAllElementsEqual([id(i) for i in b]) # True
f = 3*[1.0]; areAllElementsEqual([id(i) for i in f]) # True
g = [1.0, 1.0, 1.0]; areAllElementsEqual([id(i) for i in g]) # True
g1 = [1.0, 1.0, 0.5+0.5]; areAllElementsEqual([id(i) for i in g1]) # False

# (4) two equal floats defined inside a function body behave
# differently than case (2):

def func():
    f = 10.0
    g = 10.0
    return f is g

print func() # True

######################################################

I didn't mention any examples with strings; they behaved like ints
with respect to their id properties for all the cases I tried.
While I have no particular qualms about the behaviour, I have the
following questions:

1) Which of the above behaviours are reliable? For example, does a1 =
a2 for ints and strings always imply that a1 is a2?
2) From the programmer's perspective, are ids of ints, floats and
strings of any practical significance at all (since these types are
immutable)?
3) Does the behaviour of ids for lists and tuples of the same element
(of type int, string and sometimes even float), imply that the tuple a
= (1,) takes (nearly) the same storage space as a = 10000*(1,)? (What
about a list, where elements can be changed at will?)

Would appreciate your responses...

AK

zillow10 · Apr 3, 2008

Question 1 should read "For example, does a1 == a2 for ints ..."

George Sakkis · Apr 3, 2008

zillow10 said:
1) Which of the above behaviours are reliable? For example, does a1 =
a2 for ints and strings always imply that a1 is a2?
No.

2) From the programmer's perspective, are ids of ints, floats and
string of any practical significance at all (since these types are
immutable)?
No.

3) Does the behaviour of ids for lists and tuples of the same element
(of type int, string and sometimes even float), imply that the tuple a
= (1,) takes (nearly) the same storage space as a = 10000*(1,)? (What
about a list, where elements can be changed at will?)

No.

Regards,
George

Daniel Fetchinson · Apr 3, 2008

Small integers and short strings are cached and reused and for these (
r1 == r2 ) implies ( r1 is r2 ). For longer strings or larger integers
this does not happen and so in general ( r1 == r2 ) does not imply (
r1 is r2 ). The caching and reuse is for performance gains and is an
implementation detail which should not be relied upon.
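
A rough way to observe this without relying on it is sketched below, in Python 3 syntax (an assumption; the thread uses Python 2). The variable names are illustrative. The values are built with int() at run time so that the compiler cannot merge two equal literals into one constant. Only the equality results are guaranteed; the identity results are printed without an expected value because they depend on the interpreter and version.

```python
# Python 3 sketch; all names here are illustrative, not from the thread.
small_a, small_b = int("10"), int("10")
big_a, big_b = int("10000"), int("10000")

# Equality is guaranteed by the language.
print(small_a == small_b, big_a == big_b)   # True True

# Identity is an implementation detail: an interpreter that caches small
# integers may report the same object for the first pair only.
print(small_a is small_b)
print(big_a is big_b)
```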

1) Which of the above behaviours are reliable? For example, does a1 =
a2 for ints and strings always imply that a1 is a2?

No, see above.

2) From the programmer's perspective, are ids of ints, floats and
string of any practical significance at all (since these types are
immutable)?

You can use them to check identity (as opposed to equality). Whenever
you need that, they are practically useful; if all you need is
equality, they are useless.

3) Does the behaviour of ids for lists and tuples of the same element
(of type int, string and sometimes even float), imply that the tuple a
= (1,) takes (nearly) the same storage space as a = 10000*(1,)? (What
about a list, where elements can be changed at will?)

I'm not sure about tuples, but for lists the storage space needed for
10000*[1] is roughly 10000 times that needed for [1].

Would appreciate your responses...

HTH,
Daniel

Gabriel Genellina · Apr 4, 2008

zillow10 said:
# (1) turns out to be true
a = 10
b = 10
print a is b

...only because CPython happens to cache small integers and always return
the same object. Try again with 10000. This is just an optimization, and
the actual range of cached integers, or whether they are cached at all, is
implementation (and version) dependent.
(As integers are immutable, the optimization *can* be done, but that
doesn't mean that all immutable objects are always shared).

# (2) turns out to be false
f = 10.0
g = 10.0
print f is g

Because the above optimization isn't used for floats.
The `is` operator checks object identity: whether both operands are the
very same object (*not* a copy, or being equal: the *same* object)
("identity" is a primitive concept)
The only way to guarantee that you are talking of the same object, is
using a reference to a previously created object. That is:

a = some_arbitrary_object
b = a
assert a is b

The name `b` now refers to the same object as name `a`; the assertion
holds for whatever object it is.

In other cases, like (1) and (2) above, the literals are just handy
constructors for int and float objects. You have two objects constructed
(a and b, f and g). Whether they are identical or not is not defined; they
might be the same, or not, depending on unknown factors that might include
the moon phase; both alternatives are valid Python.

# (3) checking if ids of all list elements are the same for different
# cases:

a = 3*[1]; areAllElementsEqual([id(i) for i in a]) # True
b = [1, 1, 1]; areAllElementsEqual([id(i) for i in b]) # True
f = 3*[1.0]; areAllElementsEqual([id(i) for i in f]) # True
g = [1.0, 1.0, 1.0]; areAllElementsEqual([id(i) for i in g]) # True
g1 = [1.0, 1.0, 0.5+0.5]; areAllElementsEqual([id(i) for i in g1]) # False

Again, this is implementation dependent. If you try with a different
Python version or a different implementation you may get other results -
and that doesn't mean that any of them is broken.

# (4) two equal floats defined inside a function body behave
# differently than case (2):

def func():
    f = 10.0
    g = 10.0
    return f is g

print func() # True

Another implementation detail related to co_consts. You shouldn't rely on
it.
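
The co_consts remark can be inspected directly. The sketch below uses Python 3 syntax (an assumption; the thread uses Python 2) and the name consts is introduced only for illustration. It prints the constants the compiler stored for func; if both assignments load the same stored constant, f and g end up naming one object. No expected output is shown for the last lines because the result is implementation dependent.

```python
def func():
    f = 10.0
    g = 10.0
    return f is g

# co_consts holds the constants the compiler stored for this code object.
consts = func.__code__.co_consts
print(consts)

# How many separate 10.0 entries were stored for the two assignments.
print(consts.count(10.0))

# Whether f and g named the same object when the function ran.
print(func())
```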

I didn't mention any examples with strings; they behaved like ints
with respect to their id properties for all the cases I tried.

You didn't try hard enough

py> x = "abc"
py> y = ''.join(x)
py> x == y
True
py> x is y
False

Long strings behave like big integers: they aren't cached:

py> x = "a rather long string, full of garbage. No, this isn't garbage, just nonsense text to fill space."
py> y = "a rather long string, full of garbage. No, this isn't garbage, just nonsense text to fill space."
py> x == y
True
py> x is y
False

As always: you have two statements constructing two objects. Whether they
return the same object or not, it's not defined.
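
If a single shared string object is really wanted, it has to be requested explicitly rather than assumed. The sketch below is not from the thread: it uses sys.intern from the Python 3 standard library (Python 2 had a builtin intern), and the names a and b are illustrative.

```python
import sys

x = "abc"
y = "".join(["a", "b", "c"])   # built at run time, equal to x

print(x == y)                  # True

# sys.intern returns the one canonical object for a given string value,
# so interning two equal strings yields the same object.
a = sys.intern(x)
b = sys.intern(y)
print(a is b)                  # True
```

Even so, comparing strings with == remains the normal approach; interning is mainly a performance tool.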

While I have no particular qualms about the behaviour, I have the
following questions:

1) Which of the above behaviours are reliable? For example, does a1 =
a2 for ints and strings always imply that a1 is a2?

If you mean:

a1 = something
a2 = a1
a1 is a2

then, from my comments above, you should be able to answer: yes, always,
not restricted to ints and strings.

If you mean:

a1 = someliteral
a2 = someliteral
a1 is a2

then: no, it isn't guaranteed at all, not even for small integers or
strings.

2) From the programmer's perspective, are ids of ints, floats and
string of any practical significance at all (since these types are
immutable)?

The same significance as id() of any other object... mostly, none, except
for debugging purposes.

3) Does the behaviour of ids for lists and tuples of the same element
(of type int, string and sometimes even float), imply that the tuple a
= (1,) takes (nearly) the same storage space as a = 10000*(1,)? (What
about a list, where elements can be changed at will?)

That's a different thing. A tuple maintains only references to its
elements (as any other object in Python). The memory required for a tuple
(I'm talking of CPython exclusively) is: (a small header) + n *
sizeof(pointer). So the expression 10000*(anything,) will take more space
than the singleton (anything,) because the former requires space for 10000
pointers and the latter just one.

You have to take into account the memory for the elements themselves; but
in both cases there is a *single* object referenced, so it doesn't matter.
Note that it doesn't matter whether that single element is an integer, a
string, mutable or immutable object: it's always the same object, already
existing, and creating that 10000-uple just increments its reference count
by 10000.

The situation is similar for lists, except that being mutable containers,
they're over-allocated (to have room for future expansion). So the list
[anything]*10000 has a size somewhat larger than 10000*sizeof(pointer);
its (only) element increments its reference count by 10000.
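
The storage argument above can be checked roughly with sys.getsizeof, which reports the size of the container only, not of the objects it refers to. This is a sketch in Python 3 syntax, assuming CPython; the names obj, small and big are illustrative, and the exact byte counts vary by version and platform, so none are shown.

```python
import sys

obj = object()            # any single element will do
small = (obj,)
big = 10000 * (obj,)

# Every slot in the big tuple refers to the same object.
print(len({id(item) for item in big}))   # 1

# The container itself grows with the number of slots (pointers).
print(sys.getsizeof(small), sys.getsizeof(big))
print(tuple.__itemsize__)                # bytes per slot in CPython

# A list of the same length needs comparable space for its pointers.
print(sys.getsizeof(10000 * [obj]))
```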

Steve Holden · Apr 6, 2008

In fact all you can in truth say is that

a is b --> a == b

The converse is definitely not true.

regards
Steve

Dan Bishop · Apr 6, 2008

Steve Holden said:
In fact all you can in truth say is that

a is b --> a == b

You can't even guarantee that.
False
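
The example that originally accompanied this reply appears to be missing from the archived text. The sketch below is not Dan Bishop's code; it shows two cases, in Python 3 syntax, that are consistent with his remark. The class NeverEqual is hypothetical and exists only for illustration.

```python
# A NaN is not equal to anything, including itself.
nan = float("nan")
print(nan is nan)   # True
print(nan == nan)   # False

# A user-defined __eq__ can break the implication as well.
class NeverEqual:
    def __eq__(self, other):
        return False

n = NeverEqual()
print(n is n)       # True
print(n == n)       # False
```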
