# maximum length of a list & tuple

Discussion in 'Python' started by Lupe, Apr 9, 2004.

1. ### LupeGuest

hi,

I'm finishing a small program which uses recursion for less than 100 levels.

Each time the function is called, it compares an element of a list (which is
a list too) to other elements, which all amounts to millions of
comparisons.

The program is working for short lists but not for longer ones. I think
there are two possible problems: either disk space is not enough to save
lists as the program is executed or there is a finite manageable list
lenght that is reached by the program.

hence my question:

is there a limit? which one? I'm using list.append()

I'll rewrite that part of the code and will overcome the problem, since I
only need a few elements of the list generated, but I'm still curious about

Lupe

Lupe, Apr 9, 2004

2. ### Ed SuominenGuest

I think the limit is 1000 recursion levels, but you shouldn't need to ever
get anywhere near that. Use generators to break up the iteration of your
list comparisons.

-Ed Suominen

Lupe wrote:

> hi,
>
> I'm finishing a small program which uses recursion for less than 100
> levels.
>
> Each time the function is called, it compares an element of a list (which
> is a list too) to other elements, which all amounts to millions of
> comparisons.
>
> The program is working for short lists but not for longer ones. I think
> there are two possible problems: either disk space is not enough to save
> lists as the program is executed or there is a finite manageable list
> lenght that is reached by the program.
>
> hence my question:
>
> is there a limit? which one? I'm using list.append()
>
> I'll rewrite that part of the code and will overcome the problem, since I
> only need a few elements of the list generated, but I'm still curious
>
>
> Lupe

Ed Suominen, Apr 9, 2004

3. ### LupeGuest

I just would like to know if there is any limit to a list or tuple.

Lupe, Apr 9, 2004
4. ### Larry BatesGuest

I've constructed lists with 100,000+ elements and
they have worked flawlessly.

FYI,
Larry Bates
Syscon, Inc.

"Lupe" <> wrote in message
news:c54lq3\$2pe1sf\$-berlin.de...
> hi,
>
> I'm finishing a small program which uses recursion for less than 100

levels.
>
> Each time the function is called, it compares an element of a list (which

is
> a list too) to other elements, which all amounts to millions of
> comparisons.
>
> The program is working for short lists but not for longer ones. I think
> there are two possible problems: either disk space is not enough to save
> lists as the program is executed or there is a finite manageable list
> lenght that is reached by the program.
>
> hence my question:
>
> is there a limit? which one? I'm using list.append()
>
> I'll rewrite that part of the code and will overcome the problem, since I
> only need a few elements of the list generated, but I'm still curious

>
>
> Lupe

Larry Bates, Apr 9, 2004
5. ### Josiah CarlsonGuest

> I just would like to know if there is any limit to a list or tuple.

It depends on how much memory you have.

Run the following and the last thing it prints out is your limit...

c = 1
while c < 2**32:
try:
d = [1]*c
print c
c *= 2
except:
break

- Josiah

Josiah Carlson, Apr 10, 2004
6. ### Tim RobertsGuest

Josiah Carlson <> wrote:

>> I just would like to know if there is any limit to a list or tuple.

>
>It depends on how much memory you have.
>
>Run the following and the last thing it prints out is your limit...
>
> c = 1
> while c < 2**32:
> try:
> d = [1]*c
> print c
> c *= 2
> except:
> break

Interesting experiment. I get 64M on my 384MB machine, which suggests 4
bytes per list entry. Of course, now my swap file is packed full, and all
my apps are running slowly while they page themselves back in...
--
- Tim Roberts,
Providenza & Boekelheide, Inc.

Tim Roberts, Apr 11, 2004
7. ### Josiah CarlsonGuest

>>Run the following and the last thing it prints out is your limit...
>>
>> c = 1
>> while c < 2**32:
>> try:
>> d = [1]*c
>> print c
>> c *= 2
>> except:
>> break

>
>
> Interesting experiment. I get 64M on my 384MB machine, which suggests 4
> bytes per list entry. Of course, now my swap file is packed full, and all
> my apps are running slowly while they page themselves back in...

Far more than 4 bytes per list entry. 4 bytes to store the integers
themselves, but since integers are Python objects, and lists are arrays
of pointers to list objects, there is quite a bit of other extra stuff
attached.

I can't remember exactly how much extra stuff, but it can be found by
looking through the sources. You could check the amount of memory used
before, and the memory used at peak, and divide it by your 64M to find
out how much is used. I just did, but I'll also leave it as an exercise

- Josiah

Josiah Carlson, Apr 11, 2004
8. ### Peter HansenGuest

Josiah Carlson wrote:

>>> Run the following and the last thing it prints out is your limit...
>>>
>>> c = 1
>>> while c < 2**32:
>>> try:
>>> d = [1]*c
>>> print c
>>> c *= 2
>>> except:
>>> break

>>
>>
>>
>> Interesting experiment. I get 64M on my 384MB machine, which suggests 4
>> bytes per list entry. Of course, now my swap file is packed full, and
>> all
>> my apps are running slowly while they page themselves back in...

>
>
> Far more than 4 bytes per list entry. 4 bytes to store the integers
> themselves, but since integers are Python objects, and lists are arrays
> of pointers to list objects, there is quite a bit of other extra stuff
> attached.

Actually, unless I'm mistaken that stores only one integer, and a whole
lot of references to it... which are just pointers (4 bytes each).

The technique is probably flawed in other ways, however, as it
is likely subject to memory fragmentation and other problems.

-Peter

Peter Hansen, Apr 12, 2004
9. ### Josiah CarlsonGuest

> Actually, unless I'm mistaken that stores only one integer, and a whole
> lot of references to it... which are just pointers (4 bytes each).
>
> The technique is probably flawed in other ways, however, as it
> is likely subject to memory fragmentation and other problems.

You can't just store the integer. How would you differentiate between
an integer in a list and a pointer? Answer: you must use PyIntObjects.
Use the source.

According to intobject.c, intobject.h and object.h, the structure of
every intobject (after the macros have been expanded) is as follows:

typedef struct {
int ob_refcnt;
_typeobject *ob_type;
long ob_ival;
} PyIntObject;

The size of each of those on a 32 bit machine is 4 bytes, plus the
pointer from the list (another 4 bytes), which leaves us with a total of
16 bytes per integer object in the list.

Checking the memory usage of Python before and after the creation of 1
million integer list, I get a /very/ consistant ~16 bytes per. That
would be 12 bytes for the object, and 4 bytes for each pointer in the
intobject.c and/or intobject.h, but yeah. Generally 16 bytes per integer.

As for memory fragmentation, considering how memory paging and
segmentation works, that's all underlying stuff that the OS deals with.
I could go on as to why it doesn't matter in this case (considering
the large mallocs necessary for the creation of large lists), but I'll
leave that as an exercise to the reader.

Again, from the source, 12 bytes for the object, 4 bytes for the pointer
in the list, total of 16 bytes per intobject in a list.

- Josiah

Josiah Carlson, Apr 12, 2004
10. ### Peter HansenGuest

Josiah Carlson wrote:

>> Actually, unless I'm mistaken that stores only one integer, and a whole
>> lot of references to it... which are just pointers (4 bytes each).
>>
>> The technique is probably flawed in other ways, however, as it
>> is likely subject to memory fragmentation and other problems.

>
> You can't just store the integer. How would you differentiate between
> an integer in a list and a pointer? Answer: you must use PyIntObjects.
> Use the source.

Python does not recognize anything called "pointers" at the language
level, only internally.

What I was saying is that the only PyIntObject created was one with
the ob_ival of 1. Then a list containing one pointer to it was
created. Then it was replicated "c" times. Only one integer.

Please refute that claim rather than just arguing a bunch of other
points, none of which I dispute but all of which I claim are irrelevant
to the question at hand.

> Checking the memory usage of Python before and after the creation of 1
> million integer list, I get a /very/ consistant ~16 bytes per. That
> would be 12 bytes for the object, and 4 bytes for each pointer in the
> list.

Please post the expression you are using, for comparison.

> I could go on as to why it doesn't matter in this case (considering the
> large mallocs necessary for the creation of large lists), but I'll leave
> that as an exercise to the reader.

Please go on about it. I'd like to see where you discuss what happens
to the previously memory that was held in earlier lists, as it iterates
through the loop creating exponentially larger lists.

> Again, from the source, 12 bytes for the object, 4 bytes for the pointer
> in the list, total of 16 bytes per intobject in a list.

Nice. So each entry in the list is 4 bytes (as I indicated). And if
they all point to a single instance of the integer 1, how many extra
bytes do you think that allocates?

-Peter

Peter Hansen, Apr 12, 2004
11. ### Peter HansenGuest

Peter Hansen wrote:

> Josiah Carlson wrote:
>
>> Checking the memory usage of Python before and after the creation of 1
>> million integer list, I get a /very/ consistant ~16 bytes per. That
>> would be 12 bytes for the object, and 4 bytes for each pointer in the
>> list.

>
> Please post the expression you are using, for comparison.

By the way, in case it wasn't clear, we have a disagreement (and perhaps
a misunderstanding) only about the meaning of the original code posted.
You believe it is allocating many integers. I believe it allocates
only a single one. Perhaps one of us is wrong. I hadn't read the code
in great detail, and still haven't, and perhaps that's my mistake.
Depending on your reply, I'll actually go back and look at it closely
and perhaps execute it myself and see the result (in terms of the list
produced, not in terms of the memory consumption). Maybe you will
do the same and we'll identify which of us is mistaken...

-Peter

Peter Hansen, Apr 12, 2004
12. ### Josiah CarlsonGuest

>> You can't just store the integer. How would you differentiate between
>> an integer in a list and a pointer? Answer: you must use
>> PyIntObjects. Use the source.

>
>
> Python does not recognize anything called "pointers" at the language
> level, only internally.
>
> What I was saying is that the only PyIntObject created was one with
> the ob_ival of 1. Then a list containing one pointer to it was
> created. Then it was replicated "c" times. Only one integer.

Yeah, I was thinking of the case in generating a sequence of integers
via range. In that case, range(N) produces 12-byte intobjects and 4
byte pointers. In the case of c*[1), indeed you are right, one object
gets created, and some c pointers to that object are generated for the list.

Seems I got off topic and we are both right.

- Josiah

Josiah Carlson, Apr 12, 2004
13. ### Bengt RichterGuest

On Mon, 12 Apr 2004 07:22:26 -0400, Peter Hansen <> wrote:

>Peter Hansen wrote:
>
>> Josiah Carlson wrote:
>>
>>> Checking the memory usage of Python before and after the creation of 1
>>> million integer list, I get a /very/ consistant ~16 bytes per. That
>>> would be 12 bytes for the object, and 4 bytes for each pointer in the
>>> list.

>>
>> Please post the expression you are using, for comparison.

>
>By the way, in case it wasn't clear, we have a disagreement (and perhaps
>a misunderstanding) only about the meaning of the original code posted.
>You believe it is allocating many integers. I believe it allocates
>only a single one. Perhaps one of us is wrong. I hadn't read the code
>in great detail, and still haven't, and perhaps that's my mistake.
>Depending on your reply, I'll actually go back and look at it closely
>and perhaps execute it myself and see the result (in terms of the list
>produced, not in terms of the memory consumption). Maybe you will
>do the same and we'll identify which of us is mistaken...
>
>>> dict([(id(x),x) for x in [1]*100])

{7946208: 1}
>>> dict([(id(x),x) for x in range(5)])

{7939088: 0, 7946208: 1, 7943184: 4, 7945200: 2, 7944192: 3}
>>> dict([(id(x),x) for x in range(5)*10])

{7939088: 0, 7946208: 1, 7943184: 4, 7945200: 2, 7944192: 3}

>>> for v in -6,-5,0,1,99,100:

... print dict([(id(x),x) for x in [v for i in xrange(5)]])
...
{8053508: -6}
{7936720: -5}
{7939088: 0}
{7946208: 1}
{8042496: 99}
{8166456: 100}
>>> for v in -6,-5,0,1,99,100:

... print dict([(id(x),x) for x in [v*1 for i in xrange(5)]])
...
{8166384: -6, 8166444: -6, 8166492: -6, 8166360: -6, 8166468: -6}
{7936720: -5}
{7939088: 0}
{7946208: 1}
{8042496: 99}
{8166360: 100, 8166468: 100, 8166480: 100, 8166348: 100, 8166492: 100}

Regards,
Bengt Richter

Bengt Richter, Apr 12, 2004
14. ### LupeGuest

thank you everyone! eom

....

Lupe, Apr 13, 2004
15. ### SteveGuest

Lupe wrote:

> I just would like to know if there is any limit to a list or tuple.

Yes. Since the universe only contains something like
10**85 particles of matter, the upper limit to a list
is roughly 2**(10**85) bits.

Sorry, I couldn't resist :-D

You can store as many things in a list as you have
memory for. I suppose there is probably some
fundamental limit due to 32 bit addressing issues, but
are you sure you are really storing so many items you
are hitting that limit? I'd guess you aren't.

For example, the other day I was playing around with
Python, and just for a lark I created a list with
300,000,000 instances of None. It took about five
minutes, but it worked

Going back to your original problem:

> Each time the function is called, it compares an
> element of a list (which is a list too) to other
> elements, which all amounts to millions of
> comparisons.

Are you sure there isn't a more efficient algorithm for
what you are trying to do?

> The program is working for short lists but not for
> longer ones.

What do you mean it is working? It doesn't finish? It
is too slow? Or it gives the wrong answer?

If it is giving a wrong answer, there is a bug in your
code. If it is running too slowly, chances are there is
a better algorithm that will fix it.

--
Steven D'Aprano

Steve, Apr 14, 2004
16. ### SteveGuest

Lupe wrote:

> I just would like to know if there is any limit to a list or tuple.

Yes. Since the universe only contains something like
10**85 particles of matter, the upper limit to a list
is roughly 2**(10**85) bits.

Sorry, I couldn't resist :-D

You can store as many things in a list as you have
memory for. I suppose there is probably some
fundamental limit due to 32 bit addressing issues, but
are you sure you are really storing so many items you
are hitting that limit? I'd guess you aren't.

For example, the other day I was playing around with
Python, and just for a lark I created a list with
300,000,000 instances of None. It took about five
minutes, but it worked

Going back to your original problem:

> Each time the function is called, it compares an
> element of a list (which is a list too) to other
> elements, which all amounts to millions of
> comparisons.

Are you sure there isn't a more efficient algorithm for
what you are trying to do?

> The program is working for short lists but not for
> longer ones.

What do you mean it is working? It doesn't finish? It
is too slow? Or it gives the wrong answer?

If it is giving a wrong answer, there is a bug in your
code. If it is running too slowly, chances are there is
a better algorithm that will fix it.

--
Steven D'Aprano

Steve, Apr 14, 2004