maximum length of a list & tuple

Discussion in 'Python' started by Lupe, Apr 9, 2004.

  1. Lupe

    Lupe Guest

    hi,

    I'm finishing a small program which uses recursion for less than 100 levels.

    Each time the function is called, it compares an element of a list (which is
    a list too) to other elements, which all amounts to millions of
    comparisons.

    The program is working for short lists but not for longer ones. I think
    there are two possible problems: either disk space is not enough to save
    lists as the program is executed or there is a finite manageable list
    length that is reached by the program.

    hence my question:

    is there a limit? which one? I'm using list.append()
    what about for tuples?

    I'll rewrite that part of the code and will overcome the problem, since I
    only need a few elements of the list generated, but I'm still curious about
    the answers

    Thanks in advance

    Lupe
     
    Lupe, Apr 9, 2004
    #1

  2. Lupe

    Ed Suominen Guest

    I think the limit is 1000 recursion levels, but you shouldn't need to ever
    get anywhere near that. Use generators to break up the iteration of your
    list comparisons.
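
    A rough sketch of that approach (assuming CPython 2.x of that era; the
    sample data and the comparison are invented for illustration): the
    recursion limit can be checked, and the comparisons streamed from a
    generator instead of recursing.

    import sys

    print sys.getrecursionlimit()    # CPython's default is 1000

    def pairwise_comparisons(items):
        # Generator: yields one comparison at a time instead of recursing
        # or building every pair in memory up front.
        for i in xrange(len(items)):
            for j in xrange(i + 1, len(items)):
                yield i, j, items[i] == items[j]

    data = [[1, 2], [1, 2], [3, 4]]
    for i, j, equal in pairwise_comparisons(data):
        print i, j, equal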

    -Ed Suominen

    Lupe wrote:

    > hi,
    >
    > I'm finishing a small program which uses recursion for less than 100
    > levels.
    >
    > Each time the function is called, it compares an element of a list (which
    > is a list too) to other elements, which all amounts to millions of
    > comparisons.
    >
    > The program is working for short lists but not for longer ones. I think
    > there are two possible problems: either disk space is not enough to save
    > lists as the program is executed or there is a finite manageable list
    > length that is reached by the program.
    >
    > hence my question:
    >
    > is there a limit? which one? I'm using list.append()
    > what about for tuples?
    >
    > I'll rewrite that part of the code and will overcome the problem, since I
    > only need a few elements of the list generated, but I'm still curious
    > about the answers
    >
    > Thanks in advance
    >
    > Lupe
     
    Ed Suominen, Apr 9, 2004
    #2

  3. Lupe

    Lupe Guest

    I just would like to know if there is any limit to a list or tuple.
     
    Lupe, Apr 9, 2004
    #3
  4. Lupe

    Larry Bates Guest

    I've constructed lists with 100,000+ elements and
    they have worked flawlessly.

    FYI,
    Larry Bates
    Syscon, Inc.

    "Lupe" <> wrote in message
    news:c54lq3$2pe1sf$-berlin.de...
    > hi,
    >
    > I'm finishing a small program which uses recursion for less than 100
    > levels.
    >
    > Each time the function is called, it compares an element of a list (which
    > is a list too) to other elements, which all amounts to millions of
    > comparisons.
    >
    > The program is working for short lists but not for longer ones. I think
    > there are two possible problems: either disk space is not enough to save
    > lists as the program is executed or there is a finite manageable list
    > length that is reached by the program.
    >
    > hence my question:
    >
    > is there a limit? which one? I'm using list.append()
    > what about for tuples?
    >
    > I'll rewrite that part of the code and will overcome the problem, since I
    > only need a few elements of the list generated, but I'm still curious
    > about the answers
    >
    > Thanks in advance
    >
    > Lupe
     
    Larry Bates, Apr 9, 2004
    #4
  5. > I just would like to know if there is any limit to a list or tuple.

    It depends on how much memory you have.

    Run the following and the last thing it prints out is your limit...

    c = 1
    while c < 2**32:
        try:
            d = [1]*c
            print c
            c *= 2
        except:
            break


    - Josiah
     
    Josiah Carlson, Apr 10, 2004
    #5
  6. Lupe

    Tim Roberts Guest

    Josiah Carlson <> wrote:

    >> I just would like to know if there is any limit to a list or tuple.

    >
    >It depends on how much memory you have.
    >
    >Run the following and the last thing it prints out is your limit...
    >
    > c = 1
    > while c < 2**32:
    >     try:
    >         d = [1]*c
    >         print c
    >         c *= 2
    >     except:
    >         break


    Interesting experiment. I get 64M on my 384MB machine, which suggests 4
    bytes per list entry. Of course, now my swap file is packed full, and all
    my apps are running slowly while they page themselves back in...
    --
    - Tim Roberts,
    Providenza & Boekelheide, Inc.
     
    Tim Roberts, Apr 11, 2004
    #6
  7. >>Run the following and the last thing it prints out is your limit...
    >>
    >> c = 1
    >> while c < 2**32:
    >>     try:
    >>         d = [1]*c
    >>         print c
    >>         c *= 2
    >>     except:
    >>         break

    >
    >
    > Interesting experiment. I get 64M on my 384MB machine, which suggests 4
    > bytes per list entry. Of course, now my swap file is packed full, and all
    > my apps are running slowly while they page themselves back in...


    Far more than 4 bytes per list entry. 4 bytes to store the integers
    themselves, but since integers are Python objects, and lists are arrays
    of pointers to objects, there is quite a bit of other extra stuff
    attached.

    I can't remember exactly how much extra stuff, but it can be found by
    looking through the sources. You could check the amount of memory used
    before, and the memory used at peak, and divide it by your 64M to find
    out how much is used. I just did, but I'll also leave it as an exercise
    to the reader.
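
    One way to run that check, as a sketch only (this assumes a Linux box,
    where /proc/self/status reports the resident set size; the resident_kb
    helper and the element count are ad-hoc choices for illustration):

    def resident_kb():
        # Resident set size of this process in kilobytes (Linux only).
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1])
        return 0

    before = resident_kb()
    d = [1] * (2**20)                 # ~a million slots, all pointing at one int
    after = resident_kb()
    print (after - before) * 1024.0 / len(d), "bytes per list entry"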

    - Josiah
     
    Josiah Carlson, Apr 11, 2004
    #7
  8. Lupe

    Peter Hansen Guest

    Josiah Carlson wrote:

    >>> Run the following and the last thing it prints out is your limit...
    >>>
    >>> c = 1
    >>> while c < 2**32:
    >>>     try:
    >>>         d = [1]*c
    >>>         print c
    >>>         c *= 2
    >>>     except:
    >>>         break

    >>
    >>
    >>
    >> Interesting experiment. I get 64M on my 384MB machine, which suggests 4
    >> bytes per list entry. Of course, now my swap file is packed full, and
    >> all
    >> my apps are running slowly while they page themselves back in...

    >
    >
    > Far more than 4 bytes per list entry. 4 bytes to store the integers
    > themselves, but since integers are Python objects, and lists are arrays
    > of pointers to objects, there is quite a bit of other extra stuff
    > attached.


    Actually, unless I'm mistaken that stores only one integer, and a whole
    lot of references to it... which are just pointers (4 bytes each).

    The technique is probably flawed in other ways, however, as it
    is likely subject to memory fragmentation and other problems.

    -Peter
     
    Peter Hansen, Apr 12, 2004
    #8
  9. > Actually, unless I'm mistaken that stores only one integer, and a whole
    > lot of references to it... which are just pointers (4 bytes each).
    >
    > The technique is probably flawed in other ways, however, as it
    > is likely subject to memory fragmentation and other problems.


    You can't just store the integer. How would you differentiate between
    an integer in a list and a pointer? Answer: you must use PyIntObjects.
    Use the source.

    According to intobject.c, intobject.h and object.h, the structure of
    every intobject (after the macros have been expanded) is as follows:

    typedef struct {
        int ob_refcnt;
        struct _typeobject *ob_type;
        long ob_ival;
    } PyIntObject;

    Each of those three fields is 4 bytes on a 32-bit machine, so the object
    itself is 12 bytes; add the pointer from the list (another 4 bytes) and
    we have a total of 16 bytes per integer object in the list.

    Checking the memory usage of Python before and after the creation of 1
    million integer list, I get a /very/ consistent ~16 bytes per. That
    would be 12 bytes for the object, and 4 bytes for each pointer in the
    list. There is some additional overhead that you'd notice in
    intobject.c and/or intobject.h, but yeah. Generally 16 bytes per integer.
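
    A sketch of that before-and-after measurement (again assuming a Linux
    box, where /proc/self/status reports resident memory; range(10**6) gives
    each slot its own intobject, and the resident_kb helper is just an
    ad-hoc illustration):

    def resident_kb():
        # Resident set size of this process in kilobytes (Linux only).
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1])
        return 0

    before = resident_kb()
    d = range(10**6)                  # one million distinct intobjects
    after = resident_kb()
    print (after - before) * 1024.0 / len(d), "bytes per integer"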

    As for memory fragmentation, considering how memory paging and
    segmentation works, that's all underlying stuff that the OS deals with.
    I could go on as to why it doesn't matter in this case (considering
    the large mallocs necessary for the creation of large lists), but I'll
    leave that as an exercise to the reader.


    Again, from the source, 12 bytes for the object, 4 bytes for the pointer
    in the list, total of 16 bytes per intobject in a list.

    - Josiah
     
    Josiah Carlson, Apr 12, 2004
    #9
  10. Lupe

    Peter Hansen Guest

    Josiah Carlson wrote:

    >> Actually, unless I'm mistaken that stores only one integer, and a whole
    >> lot of references to it... which are just pointers (4 bytes each).
    >>
    >> The technique is probably flawed in other ways, however, as it
    >> is likely subject to memory fragmentation and other problems.

    >
    > You can't just store the integer. How would you differentiate between
    > an integer in a list and a pointer? Answer: you must use PyIntObjects.
    > Use the source.


    Python does not recognize anything called "pointers" at the language
    level, only internally.

    What I was saying is that the only PyIntObject created was one with
    the ob_ival of 1. Then a list containing one pointer to it was
    created. Then it was replicated "c" times. Only one integer.
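
    That is easy to check directly (a sketch; the list length is arbitrary):

    # Collect the ids of everything the list refers to; a single key in
    # the dict means a single shared intobject.
    d = [1] * 100000
    print len(dict([(id(x), 1) for x in d]))    # prints 1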

    Please refute that claim rather than just arguing a bunch of other
    points, none of which I dispute but all of which I claim are irrelevant
    to the question at hand.

    > Checking the memory usage of Python before and after the creation of 1
    > million integer list, I get a /very/ consistent ~16 bytes per. That
    > would be 12 bytes for the object, and 4 bytes for each pointer in the
    > list.


    Please post the expression you are using, for comparison.

    > I could go on as to why it doesn't matter in this case (considering the
    > large mallocs necessary for the creation of large lists), but I'll leave
    > that as an exercise to the reader.


    Please go on about it. I'd like to see where you discuss what happens
    to the memory previously held by the earlier lists, as the loop iterates
    and creates exponentially larger lists.

    > Again, from the source, 12 bytes for the object, 4 bytes for the pointer
    > in the list, total of 16 bytes per intobject in a list.


    Nice. So each entry in the list is 4 bytes (as I indicated). And if
    they all point to a single instance of the integer 1, how many extra
    bytes do you think that allocates?

    -Peter
     
    Peter Hansen, Apr 12, 2004
    #10
  11. Lupe

    Peter Hansen Guest

    Peter Hansen wrote:

    > Josiah Carlson wrote:
    >
    >> Checking the memory usage of Python before and after the creation of 1
    >> million integer list, I get a /very/ consistent ~16 bytes per. That
    >> would be 12 bytes for the object, and 4 bytes for each pointer in the
    >> list.

    >
    > Please post the expression you are using, for comparison.


    By the way, in case it wasn't clear, we have a disagreement (and perhaps
    a misunderstanding) only about the meaning of the original code posted.
    You believe it is allocating many integers. I believe it allocates
    only a single one. Perhaps one of us is wrong. I hadn't read the code
    in great detail, and still haven't, and perhaps that's my mistake.
    Depending on your reply, I'll actually go back and look at it closely
    and perhaps execute it myself and see the result (in terms of the list
    produced, not in terms of the memory consumption). Maybe you will
    do the same and we'll identify which of us is mistaken...

    -Peter
     
    Peter Hansen, Apr 12, 2004
    #11
  12. >> You can't just store the integer. How would you differentiate between
    >> an integer in a list and a pointer? Answer: you must use
    >> PyIntObjects. Use the source.

    >
    >
    > Python does not recognize anything called "pointers" at the language
    > level, only internally.
    >
    > What I was saying is that the only PyIntObject created was one with
    > the ob_ival of 1. Then a list containing one pointer to it was
    > created. Then it was replicated "c" times. Only one integer.


    Yeah, I was thinking of the case of generating a sequence of integers
    via range. In that case, range(N) produces 12-byte intobjects and 4-byte
    pointers. In the case of [1]*c, indeed you are right: one object gets
    created, and c pointers to that object are generated for the list.

    Seems I got off topic and we are both right.

    - Josiah
     
    Josiah Carlson, Apr 12, 2004
    #12
  13. On Mon, 12 Apr 2004 07:22:26 -0400, Peter Hansen <> wrote:

    >Peter Hansen wrote:
    >
    >> Josiah Carlson wrote:
    >>
    >>> Checking the memory usage of Python before and after the creation of 1
    >>> million integer list, I get a /very/ consistent ~16 bytes per. That
    >>> would be 12 bytes for the object, and 4 bytes for each pointer in the
    >>> list.

    >>
    >> Please post the expression you are using, for comparison.

    >
    >By the way, in case it wasn't clear, we have a disagreement (and perhaps
    >a misunderstanding) only about the meaning of the original code posted.
    >You believe it is allocating many integers. I believe it allocates
    >only a single one. Perhaps one of us is wrong. I hadn't read the code
    >in great detail, and still haven't, and perhaps that's my mistake.
    >Depending on your reply, I'll actually go back and look at it closely
    >and perhaps execute it myself and see the result (in terms of the list
    >produced, not in terms of the memory consumption). Maybe you will
    >do the same and we'll identify which of us is mistaken...
    >
    >>> dict([(id(x),x) for x in [1]*100])
    {7946208: 1}
    >>> dict([(id(x),x) for x in range(5)])
    {7939088: 0, 7946208: 1, 7943184: 4, 7945200: 2, 7944192: 3}
    >>> dict([(id(x),x) for x in range(5)*10])
    {7939088: 0, 7946208: 1, 7943184: 4, 7945200: 2, 7944192: 3}
    >>> for v in -6,-5,0,1,99,100:
    ...     print dict([(id(x),x) for x in [v for i in xrange(5)]])
    ...
    {8053508: -6}
    {7936720: -5}
    {7939088: 0}
    {7946208: 1}
    {8042496: 99}
    {8166456: 100}
    >>> for v in -6,-5,0,1,99,100:
    ...     print dict([(id(x),x) for x in [v*1 for i in xrange(5)]])
    ...
    {8166384: -6, 8166444: -6, 8166492: -6, 8166360: -6, 8166468: -6}
    {7936720: -5}
    {7939088: 0}
    {7946208: 1}
    {8042496: 99}
    {8166360: 100, 8166468: 100, 8166480: 100, 8166348: 100, 8166492: 100}

    Regards,
    Bengt Richter
     
    Bengt Richter, Apr 12, 2004
    #13
  14. Lupe

    Lupe Guest

    thank you everyone! eom

    ....
     
    Lupe, Apr 13, 2004
    #14
  15. Lupe

    Steve Guest

    Lupe wrote:

    > I just would like to know if there is any limit to a list or tuple.


    Yes. Since the universe only contains something like
    10**85 particles of matter, the upper limit to a list
    is roughly 2**(10**85) bits.

    Sorry, I couldn't resist :-D

    You can store as many things in a list as you have
    memory for. I suppose there is probably some
    fundamental limit due to 32 bit addressing issues, but
    are you sure you are really storing so many items you
    are hitting that limit? I'd guess you aren't.
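
    For what it's worth, a sketch of where that 32-bit ceiling sits
    (sys.maxint is the largest index a list can hold on such a build, and
    each slot needs at least a 4-byte pointer):

    import sys

    # Theoretical ceiling on list/tuple length for a 32-bit build, and the
    # memory the slot pointers alone would need at 4 bytes apiece.
    print sys.maxint                          # 2147483647 on 32-bit builds
    print sys.maxint * 4.0 / 1024**3, "GB for the pointers alone"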

    For example, the other day I was playing around with
    Python, and just for a lark I created a list with
    300,000,000 instances of None. It took about five
    minutes, but it worked :)

    Going back to your original problem:

    > Each time the function is called, it compares an
    > element of a list (which is a list too) to other
    > elements, which all amounts to millions of
    > comparisons.


    Are you sure there isn't a more efficient algorithm for
    what you are trying to do?

    > The program is working for short lists but not for
    > longer ones.


    What do you mean it is not working? It doesn't finish? It
    is too slow? Or it gives the wrong answer?

    If it is giving a wrong answer, there is a bug in your
    code. If it is running too slowly, chances are there is
    a better algorithm that will fix it.

    --
    Steven D'Aprano
     
    Steve, Apr 14, 2004
    #15
