How does Python implement "long integer"?

Discussion in 'Python' started by Pedram, Jul 5, 2009.

  1. Pedram

    Pedram Guest

    Hello,
    I'm reading about the implementation of long ints in Python. I downloaded
    the CPython source code and plan to read longobject.c, but where
    should I start reading this file? I mean, which function is the
    first?
    Can anyone help?
    Thanks

    Pedram
    Pedram, Jul 5, 2009
    #1

  2. On Jul 5, 8:38 am, Pedram <> wrote:
    > Hello,
    > I'm reading about implementation of long ints in Python. I downloaded
    > the source code of CPython and will read the longobject.c, but from
    > where I should start reading this file? I mean which function is the
    > first?


    I don't really understand the question: what do you mean by 'first'?
    It might help if you tell us what your aims are.

    In any case, you probably also want to look at the Include/
    longintrepr.h and Include/longobject.h files.

    Mark
    Mark Dickinson, Jul 5, 2009
    #2

  3. Pedram

    Pedram Guest

    On Jul 5, 1:57 pm, Mark Dickinson <> wrote:
    > On Jul 5, 8:38 am, Pedram <> wrote:
    >
    > > Hello,
    > > I'm reading about implementation of long ints in Python. I downloaded
    > > the source code of CPython and will read the longobject.c, but from
    > > where I should start reading this file? I mean which function is the
    > > first?

    >
    > I don't really understand the question:  what do you mean by 'first'?
    > It might help if you tell us what your aims are.
    >
    > In any case, you probably also want to look at the Include/
    > longintrepr.h and Include/longobject.h files.
    >
    > Mark


    Thanks for reply,
    Sorry I can't explain too clear! I'm not English ;)
    But I want to understand the implementation of the long int object in
    Python. How does Python allocate memory for it, and how does it
    implement operations on this object?
    I'm reading the source code (longobject.c and, as you said,
    longintrepr.h and longobject.h), but if you can help me, I would really
    appreciate it.

    Pedram
    Pedram, Jul 5, 2009
    #3
  4. On Jul 5, 1:09 pm, Pedram <> wrote:
    > Thanks for reply,
    > Sorry I can't explain too clear! I'm not English ;)


    That's shocking. Everyone should be English. :)

    > But I want to understand the implementation of long int object in
    > Python. How Python allocates memory and how it implements operations
    > for this object?


    I'd pick one operation (e.g., addition), and trace through the
    relevant functions in longobject.c. Look at the long_as_number
    table to see where to get started.

    In the case of addition, that table shows that the nb_add slot is
    given by long_add. long_add does any necessary type conversions
    (CONVERT_BINOP) and then calls either x_sub or x_add to do the real
    work.

    x_add calls _PyLong_New to allocate space for a new PyLongObject, then
    does the usual digit-by-digit-with-carry addition. Finally, it
    normalizes the result (removes any unnecessary leading zero digits)
    and returns.
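The carry loop described above can be modelled in a few lines of Python. This is only a sketch, assuming 15-bit digits stored least significant first; the helper names (to_digits, from_digits, x_add_sketch) are made up for illustration and are not CPython functions.

```python
SHIFT = 15            # assumed bits per digit
MASK = (1 << SHIFT) - 1


def to_digits(n):
    """Split a non-negative int into base-2**15 digits, least significant first."""
    digits = []
    while n:
        digits.append(n & MASK)
        n >>= SHIFT
    return digits


def from_digits(digits):
    """Rebuild the int from its digit list (inverse of to_digits)."""
    return sum(d << (SHIFT * i) for i, d in enumerate(digits))


def x_add_sketch(a, b):
    """Add two magnitudes given as digit lists (a model of the idea, not a copy of x_add)."""
    if len(a) < len(b):
        a, b = b, a                      # make a the longer operand
    result = [0] * (len(a) + 1)          # one extra slot for a possible final carry
    carry = 0
    for i in range(len(a)):
        carry += a[i] + (b[i] if i < len(b) else 0)
        result[i] = carry & MASK         # low 15 bits become the digit
        carry >>= SHIFT                  # the rest carries into the next digit
    result[len(a)] = carry
    while result and result[-1] == 0:    # normalize: strip high-order zero digits
        result.pop()
    return result
```

For instance, adding 1 to 32767 overflows the lowest digit, so the result needs two digits: a zero followed by a one.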

    As far as memory allocation goes: almost all operations call
    _PyLong_New at some point. (Except in py3k, where it's a bit more
    complicated because small integers are cached.)

    If you have more specific questions I'll have a go at answering them.

    Mark
    Mark Dickinson, Jul 5, 2009
    #4
  5. Pedram

    Pedram Guest

    On Jul 5, 5:04 pm, Mark Dickinson <> wrote:

    > That's shocking.  Everyone should be English. :)


    Yes, I'm trying :)

    > I'd pick one operation (e.g., addition), and trace through the
    > relevant functions in longobject.c.  Look at the long_as_number
    > table to see where to get started.
    >
    > In the case of addition, that table shows that the nb_add slot is
    > given by long_add.  long_add does any necessary type conversions
    > (CONVERT_BINOP) and then calls either x_sub or x_add to do the real
    > work.
    > x_add calls _PyLong_New to allocate space for a new PyLongObject, then
    > does the usual digit-by-digit-with-carry addition.  Finally, it
    > normalizes
    > the result (removes any unnecessary zeros) and returns.
    >
    > As far as memory allocation goes: almost all operations call
    > _PyLong_New at some point.  (Except in py3k, where it's a bit more
    > complicated because small integers are cached.)


    Oh, I didn't see long_as_number before. I'm reading it. That was very
    helpful, thanks.

    > If you have more specific questions I'll have a go at answering them.
    >
    > Mark


    Thank you a million.
    I will write your name in my "Specially thanks to" section of my
    article (In font size 72) ;)

    Pedram
    Pedram, Jul 5, 2009
    #5
  6. On Sun, Jul 5, 2009 at 04:57, Mark Dickinson<> wrote:
    > On Jul 5, 8:38 am, Pedram <> wrote:
    >> Hello,
    >> I'm reading about implementation of long ints in Python. I downloaded
    >> the source code of CPython and will read the longobject.c, but from
    >> where I should start reading this file? I mean which function is the
    >> first?

    >
    > I don't really understand the question:  what do you mean by 'first'?
    > It might help if you tell us what your aims are.


    I think he means the entry point; the problem is that libraries have many.


    --
    Pablo Torres N.
    Pablo Torres N., Jul 5, 2009
    #6
  7. Pedram

    Pedram Guest

    Hello again,
    This time I have a simple C question!
    As you know, _PyLong_New returns the result of PyObject_NEW_VAR. I
    found PyObject_NEW_VAR in objimpl.h header file. But I can't
    understand the last line :( Here's the code:

    #define PyObject_NEW_VAR(type, typeobj, n) \
    ( (type *) PyObject_InitVar( \
          (PyVarObject *) PyObject_MALLOC(_PyObject_VAR_SIZE((typeobj),(n)) ),\
          (typeobj), (n)) )

    I know this will replace PyObject_NEW_VAR(type, typeobj, n)
    everywhere in the code, but I can't understand the last line, which
    is just 'typeobj' and 'n'! What do they do? Do they play any part in
    the allocation process?
    Pedram, Jul 5, 2009
    #7
  8. Aahz

    Aahz Guest

    In article <>,
    Pedram <> wrote:
    >
    >This time I have a simple C question!
    >As you know, _PyLong_New returns the result of PyObject_NEW_VAR. I
    >found PyObject_NEW_VAR in objimpl.h header file. But I can't
    >understand the last line :( Here's the code:
    >
    >#define PyObject_NEW_VAR(type, typeobj, n) \
    >( (type *) PyObject_InitVar( \
    > (PyVarObject *) PyObject_MALLOC(_PyObject_VAR_SIZE((typeobj),
    >(n)) ),\
    > (typeobj), (n)) )
    >
    >I know this will replace the PyObject_New_VAR(type, typeobj, n)
    >everywhere in the code and but I can't understand the last line, which
    >is just 'typeobj' and 'n'! What do they do? Are they make any sense in
    >allocation process?


    Look in the code to find out what PyObject_InitVar() does -- and, more
    importantly, what its signature is. The clue you're missing is the
    trailing backslash on the third line, but that should not be required if
    you're using an editor that shows you matching parentheses.
    --
    Aahz () <*> http://www.pythoncraft.com/

    "as long as we like the same operating system, things are cool." --piranha
    Aahz, Jul 5, 2009
    #8
  9. Pedram

    Pedram Guest

    On Jul 5, 8:12 pm, (Aahz) wrote:
    > In article <..com>,
    >
    >
    >
    > Pedram  <> wrote:
    >
    > >This time I have a simple C question!
    > >As you know, _PyLong_New returns the result of PyObject_NEW_VAR. I
    > >found PyObject_NEW_VAR in objimpl.h header file. But I can't
    > >understand the last line :( Here's the code:

    >
    > >#define PyObject_NEW_VAR(type, typeobj, n) \
    > >( (type *) PyObject_InitVar( \
    > >      (PyVarObject *) PyObject_MALLOC(_PyObject_VAR_SIZE((typeobj),
    > >(n)) ),\
    > >      (typeobj), (n)) )

    >
    > >I know this will replace the PyObject_New_VAR(type, typeobj, n)
    > >everywhere in the code and but I can't understand the last line, which
    > >is just 'typeobj' and 'n'! What do they do? Are they make any sense in
    > >allocation process?

    >
    > Look in the code to find out what PyObject_InitVar() does -- and, more
    > importantly, what its signature is.  The clue you're missing is the
    > trailing backslash on the third line, but that should not be required if
    > you're using an editor that shows you matching parentheses.
    > --
    > Aahz ()           <*>        http://www.pythoncraft.com/
    >
    > "as long as we like the same operating system, things are cool." --piranha


    No, they wrapped the 3rd line!

    I'll show you the code in picture below:
    http://lh3.ggpht.com/_35nHfALLgC4/SlDVMEl6oOI/AAAAAAAAAKg/vPWA1gttvHM/s640/Screenshot.png

    As you can see the PyObject_MALLOC has nothing to do with typeobj and
    n in line 4.
    Pedram, Jul 5, 2009
    #9
  10. Pedram

    Pedram Guest

    On Jul 5, 8:32 pm, Pedram <> wrote:
    > On Jul 5, 8:12 pm, (Aahz) wrote:
    >
    >
    >
    > > In article <>,

    >
    > > Pedram  <> wrote:

    >
    > > >This time I have a simple C question!
    > > >As you know, _PyLong_New returns the result of PyObject_NEW_VAR. I
    > > >found PyObject_NEW_VAR in objimpl.h header file. But I can't
    > > >understand the last line :( Here's the code:

    >
    > > >#define PyObject_NEW_VAR(type, typeobj, n) \
    > > >( (type *) PyObject_InitVar( \
    > > >      (PyVarObject *) PyObject_MALLOC(_PyObject_VAR_SIZE((typeobj),
    > > >(n)) ),\
    > > >      (typeobj), (n)) )

    >
    > > >I know this will replace the PyObject_New_VAR(type, typeobj, n)
    > > >everywhere in the code and but I can't understand the last line, which
    > > >is just 'typeobj' and 'n'! What do they do? Are they make any sense in
    > > >allocation process?

    >
    > > Look in the code to find out what PyObject_InitVar() does -- and, more
    > > importantly, what its signature is.  The clue you're missing is the
    > > trailing backslash on the third line, but that should not be required if
    > > you're using an editor that shows you matching parentheses.
    > > --
    > > Aahz ()           <*>        http://www.pythoncraft.com/

    >
    > > "as long as we like the same operating system, things are cool." --piranha

    >
    > No, they wrapped the 3rd line!
    >
    > I'll show you the code in picture below:http://lh3.ggpht.com/_35nHfALLgC4/SlDVMEl6oOI/AAAAAAAAAKg/vPWA1gttvHM...
    >
    > As you can see the PyObject_MALLOC has nothing to do with typeobj and
    > n in line 4.


    Oooooh! What a mistake! I got it! They're PyObject_InitVar
    parameters.
    Sorry and thanks!
    Pedram, Jul 5, 2009
    #10
  11. Pedram

    Pedram Guest

    OK, fine, I read longobject.c at last! :)
    I found that longobject is a structure like this:

    struct _longobject {
        struct _object *_ob_next;
        struct _object *_ob_prev;
        Py_ssize_t ob_refcnt;
        struct _typeobject *ob_type;
        digit ob_digit[1];
    };

    And a digit is a 15-item array of C's unsigned short integers.
    Am I right, or did I miss something? Is this structure the same in
    all environments (Linux, Windows, mobiles, etc.)?
    Pedram, Jul 6, 2009
    #11
  12. On Jul 6, 1:24 pm, Pedram <> wrote:
    > OK, fine, I read longobject.c at last! :)
    > I found that longobject is a structure like this:
    >
    > struct _longobject {
    >     struct _object *_ob_next;
    >     struct _object *_ob_prev;


    For current CPython, these two fields are only present in debug
    builds; for a normal build they won't exist.

    >     Py_ssize_t ob_refcnt;
    >     struct _typeobject *ob_type;


    You're missing an important field here (see the definition of
    PyObject_VAR_HEAD):

    Py_ssize_t ob_size; /* Number of items in variable part */

    For the current implementation of Python longs, the absolute value of
    this field gives the number of digits in the long; the sign gives the
    sign of the long (0L is represented with zero digits).

    >     digit ob_digit[1];


    Right. This is an example of the so-called 'struct hack' in C; it
    looks as though there's just a single digit, but what's intended here
    is that there's an array of digits tacked onto the end of the struct;
    for any given PyLongObject, the size of this array is determined at
    runtime. (C99 allows you to write this as simply ob_digit[], but not
    all compilers support this yet.)
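The variable-length tail can be observed from Python itself. The following is a rough sketch, assuming a CPython 3.1+ interpreter where sys.int_info is available and sys.getsizeof reports the header plus the digit array; digit_growth is a hypothetical helper name.

```python
import sys


def digit_growth(ndigits):
    """Bytes added when an int grows from ndigits to ndigits + 1 digits."""
    bits = sys.int_info.bits_per_digit
    smaller = 1 << (bits * (ndigits - 1))   # needs exactly ndigits digits
    larger = 1 << (bits * ndigits)          # needs exactly ndigits + 1 digits
    return sys.getsizeof(larger) - sys.getsizeof(smaller)


# On CPython each extra digit should cost sizeof_digit bytes.
print(digit_growth(4), sys.int_info.sizeof_digit)
```

Other implementations (PyPy, Jython) may store integers differently, so the numbers are only meaningful for CPython.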

    > }


    > And a digit is a 15-item array of C's unsigned short integers.


    No: a digit is a single unsigned short, which is used to store 15 bits
    of the Python long. Python longs are stored in sign-magnitude format,
    in base 2**15. So each of the base 2**15 'digits' is an integer in
    the range [0, 32768), i.e. 0 through 32767. The unsigned short type is
    used to store those digits.
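To make the sign-magnitude layout concrete, here is a small Python model of the two fields discussed in this post. It is a sketch under the assumption of 15-bit digits stored least significant first; long_repr_sketch is a hypothetical name, and the real fields live in the C struct, not in a tuple.

```python
SHIFT = 15                      # assumed bits per digit
MASK = (1 << SHIFT) - 1


def long_repr_sketch(n):
    """Return (ob_size, digits) for n, modelled on the description above."""
    magnitude = abs(n)
    digits = []
    while magnitude:
        digits.append(magnitude & MASK)   # each digit lies in 0..32767
        magnitude >>= SHIFT
    # abs(ob_size) is the digit count; its sign is the sign of n
    ob_size = -len(digits) if n < 0 else len(digits)
    return ob_size, digits


print(long_repr_sketch(0))        # zero has no digits at all
print(long_repr_sketch(-32768))   # negative sign lives in ob_size only
```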

    Exception: for Python 2.7+ or Python 3.1+, on 64-bit machines, Python
    longs are stored in base 2**30 instead of base 2**15, using a 32-bit
    unsigned integer type in place of unsigned short.
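Which base a given interpreter uses can be checked at runtime. This short sketch assumes Python 3.1 or later, where sys.int_info exposes the digit width (Python 2.7 offers the same data as sys.long_info).

```python
import sys

info = sys.int_info
# bits_per_digit is 15 or 30 depending on how the interpreter was built
print("bits per digit:", info.bits_per_digit)
# sizeof_digit is the size in bytes of the C type holding one digit
print("bytes per digit:", info.sizeof_digit)
```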

    > Is this structure is constant in
    > all environments (Linux, Windows, Mobiles, etc.)?


    I think it would be dangerous to rely on this struct staying constant,
    even just for CPython. It's entirely possible that the representation
    of Python longs could change in Python 2.8 or 3.2. You should use the
    public, documented C-API whenever possible.

    Mark
    Mark Dickinson, Jul 6, 2009
    #12
  13. Pedram

    Pedram Guest

    Hello Mr. Dickinson. Glad to see you again :)

    On Jul 6, 5:46 pm, Mark Dickinson <> wrote:
    > On Jul 6, 1:24 pm, Pedram <> wrote:
    >
    > > OK, fine, I read longobject.c at last! :)
    > > I found that longobject is a structure like this:

    >
    > > struct _longobject {
    > >     struct _object *_ob_next;
    > >     struct _object *_ob_prev;

    >
    > For current CPython, these two fields are only present in debug
    > builds;  for a normal build they won't exist.


    I couldn't understand the difference between them. What are debug
    build and normal build themselves? And You mean in debug build
    PyLongObject is a doubly-linked-list but in normal build it is just an
    array (Or if not how it'll store in this mode)?

    > >     Py_ssize_t ob_refcnt;
    > >     struct _typeobject *ob_type;

    >
    > You're missing an important field here (see the definition of
    > PyObject_VAR_HEAD):
    >
    >     Py_ssize_t ob_size; /* Number of items in variable part */
    >
    > For the current implementation of Python longs, the absolute value of
    > this field gives the number of digits in the long;  the sign gives the
    > sign of the long (0L is represented with zero digits).


    Oh, you're right. I missed that. Thanks :)

    > >     digit ob_digit[1];

    >
    > Right.  This is an example of the so-called 'struct hack' in C; it
    > looks as though there's just a single digit, but what's intended here
    > is that there's an array of digits tacked onto the end of the struct;
    > for any given PyLongObject, the size of this array is determined at
    > runtime.  (C99 allows you to write this as simply ob_digit[], but not
    > all compilers support this yet.)


    WOW! I didn't know anything about 'struct hacks'! I read about them
    and they were very wonderful. Thanks for your point. :)

    > > }
    > > And a digit is a 15-item array of C's unsigned short integers.

    >
    > No: a digit is a single unsigned short, which is used to store 15 bits
    > of the Python long.  Python longs are stored in sign-magnitude format,
    > in base 2**15.  So each of the base 2**15 'digits' is an integer in
    > the range [0, 32768), i.e. 0 through 32767.  The unsigned short type is
    > used to store those digits.
    >
    > Exception: for Python 2.7+ or Python 3.1+, on 64-bit machines, Python
    > longs are stored in base 2**30 instead of base 2**15, using a 32-bit
    > unsigned integer type in place of unsigned short.
    >
    > > Is this structure is constant in
    > > all environments (Linux, Windows, Mobiles, etc.)?

    >
    > I think it would be dangerous to rely on this struct staying constant,
    > even just for CPython.  It's entirely possible that the representation
    > of Python longs could change in Python 2.8 or 3.2.  You should use the
    > public, documented C-API whenever possible.
    >
    > Mark


    Thank you a lot Mark :)
    Pedram, Jul 6, 2009
    #13
  14. Mark Dickinson wrote:
    > On Jul 5, 1:09 pm, Pedram <> wrote:
    >> Thanks for reply,
    >> Sorry I can't explain too clear! I'm not English ;)

    >
    > That's shocking. Everyone should be English. :)
    >

    Mark, out you go!
    Bruno Desthuilliers, Jul 6, 2009
    #14
  15. Eric Wong

    Eric Wong Guest

    Pedram wrote:

    > Hello Mr. Dickinson. Glad to see you again :)
    >
    > On Jul 6, 5:46 pm, Mark Dickinson <> wrote:
    >> On Jul 6, 1:24 pm, Pedram <> wrote:
    >>
    >> > OK, fine, I read longobject.c at last! :)
    >> > I found that longobject is a structure like this:

    >>
    >> > struct _longobject {
    >> > struct _object *_ob_next;
    >> > struct _object *_ob_prev;

    >>
    >> For current CPython, these two fields are only present in debug
    >> builds; for a normal build they won't exist.

    >
    > I couldn't understand the difference between them. What are debug
    > build and normal build themselves? And You mean in debug build
    > PyLongObject is a doubly-linked-list but in normal build it is just an
    > array (Or if not how it'll store in this mode)?
    >

    The Py_TRACE_REFS macro is used to distinguish the code for a debug
    build from a normal build; that is to say, the code compiled in the two
    builds is actually *different*. In a debug build, not only PyLongObject
    but all objects are linked into a doubly-linked list, which can make
    debugging less painful. But in a normal build, objects are
    separate! After an object is created, it is never moved, so we
    can and should refer to an object only by its address (pointer).
    There's no one big container like a list or an array for objects.
    Eric Wong, Jul 7, 2009
    #15
  16. On Jul 6, 4:13 pm, Pedram <> wrote:
    > On Jul 6, 5:46 pm, Mark Dickinson <> wrote:
    > > On Jul 6, 1:24 pm, Pedram <> wrote:

    >
    > > > OK, fine, I read longobject.c at last! :)
    > > > I found that longobject is a structure like this:

    >
    > > > struct _longobject {
    > > >     struct _object *_ob_next;
    > > >     struct _object *_ob_prev;

    >
    > > For current CPython, these two fields are only present in debug
    > > builds;  for a normal build they won't exist.

    >
    > I couldn't understand the difference between them. What are debug
    > build and normal build themselves? And You mean in debug build
    > PyLongObject is a doubly-linked-list but in normal build it is just an
    > array (Or if not how it'll store in this mode)?


    No: a PyLongObject is stored the same way (ob_size giving sign and
    number of digits, ob_digit giving the digits themselves) whether or
    not a debug build is in use.

    A debug build does various things (extra checks, extra information) to
    make it easier to track down problems. On Unix-like systems, you can
    get a debug build by configuring with the --with-pydebug flag.

    The _ob_next and _ob_prev fields have nothing particularly to do with
    Python longs; for a debug build, these two fields are added to *all*
    Python objects, and provide a doubly-linked list that links all 'live'
    Python objects together. I'm not really sure what, if anything, the
    extra information is used for within Python---it might be used by some
    external tools, I guess.
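One way to tell from inside Python which kind of build is running is to look for functions that only special builds provide. The sketch below relies on two facts as far as I know them: sys.gettotalrefcount exists only in debug builds, and sys.getobjects exists only when the interpreter was compiled with Py_TRACE_REFS (it is the function that reports the tracked live objects). build_flavour is a hypothetical helper name.

```python
import sys


def build_flavour():
    """Best-effort guess at how this interpreter was built."""
    return {
        # only defined when the interpreter is a debug build
        "debug": hasattr(sys, "gettotalrefcount"),
        # only defined when built with Py_TRACE_REFS
        "trace_refs": hasattr(sys, "getobjects"),
    }


print(build_flavour())
```

On an ordinary release build both entries are expected to be False.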

    Have you looked at the C-API documentation?

    http://docs.python.org/c-api/index.html

    _ob_next and _ob_prev are described here:

    http://docs.python.org/c-api/typeobj.html#_ob_next

    (These docs are for Python 2.6; I'm not sure what version you're
    working with.)

    Mark
    Mark Dickinson, Jul 7, 2009
    #16
  17. Pedram

    Pedram Guest

    On Jul 7, 1:10 pm, Mark Dickinson <> wrote:
    > On Jul 6, 4:13 pm, Pedram <> wrote:
    >
    >
    >
    > > On Jul 6, 5:46 pm, Mark Dickinson <> wrote:
    > > > On Jul 6, 1:24 pm, Pedram <> wrote:

    >
    > > > > OK, fine, I read longobject.c at last! :)
    > > > > I found that longobject is a structure like this:

    >
    > > > > struct _longobject {
    > > > >     struct _object *_ob_next;
    > > > >     struct _object *_ob_prev;

    >
    > > > For current CPython, these two fields are only present in debug
    > > > builds;  for a normal build they won't exist.

    >
    > > I couldn't understand the difference between them. What are debug
    > > build and normal build themselves? And You mean in debug build
    > > PyLongObject is a doubly-linked-list but in normal build it is just an
    > > array (Or if not how it'll store in this mode)?

    >
    > No:  a PyLongObject is stored the same way (ob_size giving sign and
    > number of digits, ob_digit giving the digits themselves) whether or
    > not a debug build is in use.
    >
    > A debug build does various things (extra checks, extra information) to
    > make it easier to track down problems.  On Unix-like systems, you can
    > get a debug build by configuring with the --with-pydebug flag.
    >
    > The _ob_next and _ob_prev fields have nothing particularly to do with
    > Python longs; for a debug build, these two fields are added to *all*
    > Python objects, and provide a doubly-linked list that links all 'live'
    > Python objects together.  I'm not really sure what, if anything, the
    > extra information is used for within Python---it might be used by some
    > external tools, I guess.
    >
    > Have you looked at the C-API documentation?
    >
    > http://docs.python.org/c-api/index.html
    >
    > _ob_next and _ob_prev are described here:
    >
    > http://docs.python.org/c-api/typeobj.html#_ob_next
    >
    > (These docs are for Python 2.6;  I'm not sure what version you're
    > working with.)
    >
    > Mark


    It seems there's an island named Python!
    Thanks for the links, I'm reading them now.
    Pedram, Jul 7, 2009
    #17
