CPython loading modules into memory

Discussion in 'Python' started by sjbrown, Feb 11, 2009.

  1. sjbrown

    sjbrown Guest

    Can someone describe the details of how Python loads modules into
    memory? I assume once the .py file is compiled to .pyc that it is
    mmap'ed in. But that assumption is very naive. Maybe it uses an
    anonymous mapping? Maybe it does other special magic? This is all
    very alien to me, so if someone could explain it in terms that a
    person who never usually worries about memory could understand, that
    would be much appreciated.

    Follow up: is this process different if the modules are loaded from a
    zipfile?

    If there is a link that covers this info, that'd be great too.
     
    sjbrown, Feb 11, 2009
    #1

  2. Robert Kern

    Robert Kern Guest

    On 2009-02-11 15:30, sjbrown wrote:
    > Can someone describe the details of how Python loads modules into
    > memory? I assume once the .py file is compiled to .pyc that it is
    > mmap'ed in. But that assumption is very naive. Maybe it uses an
    > anonymous mapping? Maybe it does other special magic? This is all
    > very alien to me, so if someone could explain it in terms that a
    > person who never usually worries about memory could understand, that
    > would be much appreciated.


    Python will read the .py file and compile it into bytecode. This bytecode will be
    written out to a .pyc file as a caching mechanism; if the .pyc exists and the
    source timestamp recorded in it matches that of the .py file, the .py file will
    not be read or compiled. The bytecode will simply be read from the .pyc file.

    The .pyc file is not really a map of a memory structure, per se. It is simply a
    disk representation of the bytecode. This bytecode is *executed* by the Python
    VM to populate a module's namespace (which is just a dict) in memory. I believe
    it is then discarded.
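
    The compile-then-execute step described above can be sketched in a few
    lines. This is only an illustration, not the code path the import system
    itself uses; the module name "demo_module" and the source text are made up
    for the example.

```python
import types

# Hypothetical source text standing in for the contents of a .py file.
source = "x = 40 + 2\ndef greet():\n    return 'hello'\n"

# Step 1: compile the source text into a code object (the bytecode).
code = compile(source, "demo_module.py", "exec")

# Step 2: create an empty module; its namespace is an ordinary dict.
module = types.ModuleType("demo_module")

# Step 3: execute the bytecode with the module's dict as its namespace.
exec(code, module.__dict__)

print(type(code).__name__)             # code
print(module.x)                        # 42
print(module.greet())                  # hello
print(isinstance(vars(module), dict))  # True
```

    After the exec() call, everything the module defines lives in the dict;
    the top-level code object is no longer needed to use the module.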

    > Follow up: is this process different if the modules are loaded from a
    > zipfile?


    Only in that the .py or .pyc will be extracted from the zipfile instead of being
    read directly from disk.

    --
    Robert Kern

    "I have come to believe that the whole world is an enigma, a harmless enigma
    that is made terrible by our own mad attempt to interpret it as though it had
    an underlying truth."
    -- Umberto Eco
     
    Robert Kern, Feb 11, 2009
    #2

  3. Martin v. Löwis

    > Can someone describe the details of how Python loads modules into
    > memory? I assume once the .py file is compiled to .pyc that it is
    > mmap'ed in. But that assumption is very naive. Maybe it uses an
    > anonymous mapping? Maybe it does other special magic? This is all
    > very alien to me, so if someone could explain it in terms that a
    > person who never usually worries about memory could understand, that
    > would be much appreciated.


    There is no magic whatsoever. Python opens a sequential file descriptor
    for the .pyc file, and then reads it in small chunks, "unmarshalling"
    it (indeed, the marshal module is used to restore Python objects).

    The marshal format is an object serialization in a type-value encoding
    (sometimes type-length-value), with type codes for:
    - None, True, False
    - 32-bit ints, 64-bit ints (unmarshalled into int/long)
    - floats, complex
    - arbitrary-sized longs
    - strings, unicode
    - tuples (length + marshal data of values)
    - lists
    - dicts
    - code objects
    - a few others

    Result of unmarshalling is typically a code object.
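
    A small sketch of the marshal format from the Python side, assuming a
    current Python 3 interpreter (the thread itself concerns Python 2.5, but
    the marshal module's dumps/loads interface is the same idea). The sample
    values are arbitrary.

```python
import marshal
import types

# Singletons are encoded as a single type-code byte.
print(marshal.dumps(None))   # b'N'
print(marshal.dumps(True))   # b'T'
print(marshal.dumps(False))  # b'F'

# Containers and scalars round-trip through the same format.
data = (1, 2.5, "text", [None, True], {"key": b"bytes"})
restored = marshal.loads(marshal.dumps(data))

# A code object marshals too; this is what a .pyc stores after its header.
code = compile("answer = 6 * 7", "<demo>", "exec")
code_again = marshal.loads(marshal.dumps(code))

# Executing the restored code object populates a namespace dict.
namespace = {}
exec(code_again, namespace)

print(restored == data)                        # True
print(isinstance(code_again, types.CodeType))  # True
print(namespace["answer"])                     # 42
```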

    > Follow up: is this process different if the modules are loaded from a
    > zipfile?


    No; it uncompresses into memory, and then unmarshals from there
    (compressed block for compressed block).
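
    Importing from a zipfile can be tried out with a throwaway archive. The
    sketch below assumes a current Python 3 interpreter with zlib available;
    the module name "zipped_demo_module" and the archive name "bundle.zip" are
    invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Hypothetical module name and contents, used only for this demonstration.
MODULE_NAME = "zipped_demo_module"
SOURCE = "VALUE = 'loaded from zip'\n"

tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, "bundle.zip")

# Store the source compressed, so the importer has to inflate it in memory.
with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr(MODULE_NAME + ".py", SOURCE)

# A zip archive placed on sys.path is handled by the zipimport machinery.
sys.path.insert(0, archive)
importlib.invalidate_caches()
module = importlib.import_module(MODULE_NAME)

print(module.VALUE)
print(type(module.__loader__).__name__)
print(module.__file__)
```

    The printed __file__ points inside the archive, and nothing is extracted
    to disk along the way.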

    > If there is a link that covers this info, that'd be great too.


    See the description of the marshal module.

    HTH,
    Martin
     
    Martin v. Löwis, Feb 11, 2009
    #3
  4. sjbrown

    sjbrown Guest

    On Feb 11, 2:00 pm, "Martin v. Löwis" wrote:
    > > Can someone describe the details of how Python loads modules into
    > > memory?  I assume once the .py file is compiled to .pyc that it is
    > > mmap'ed in.  But that assumption is very naive.  Maybe it uses an
    > > anonymous mapping?  Maybe it does other special magic?  This is all
    > > very alien to me, so if someone could explain it in terms that a
    > > person who never usually worries about memory could understand, that
    > > would be much appreciated.

    >
    > There is no magic whatsoever. Python opens a sequential file descriptor
    > for the .pyc file, and then reads it in small chunks, "unmarshalling"
    > it (indeed, the marshal module is used to restore Python objects).
    >
    > The marshal format is an object serialization in a type-value encoding
    > (sometimes type-length-value), with type codes for:
    > - None, True, False
    > - 32-bit ints, 64-bit ints (unmarshalled into int/long)
    > - floats, complex
    > - arbitrary-sized longs
    > - strings, unicode
    > - tuples (length + marshal data of values)
    > - lists
    > - dicts
    > - code objects
    > - a few others
    >
    > Result of unmarshalling is typically a code object.
    >
    > > Follow up: is this process different if the modules are loaded from a
    > > zipfile?

    >
    > No; it uncompresses into memory, and then unmarshals from there (
    > compressed block for compressed block)
    >
    > > If there is a link that covers this info, that'd be great too.

    >
    > See the description of the marshal module.
    >
    > HTH,
    > Martin



    Thanks for the answers. For my own edification, and in case anyone is
    interested, I confirmed this by looking at import.c and marshal.c in
    the Python 2.5.4 source. It looks like the actual reading of the file is
    done by the marshal.c function PyMarshal_ReadLastObjectFromFile. The
    file is read sequentially using a small buffer on the heap.
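
    The same thing can be checked from Python without reading the C source.
    This is a rough sketch that assumes CPython 3.7 or later, where a .pyc
    begins with a 16-byte header; the Python 2.5 discussed here used an 8-byte
    header (magic number plus timestamp). The file names are made up.

```python
import marshal
import os
import py_compile
import tempfile
import types

# Assumption: CPython 3.7+, whose .pyc header is 16 bytes long.
HEADER_SIZE = 16

tmpdir = tempfile.mkdtemp()
src_path = os.path.join(tmpdir, "demo.py")
pyc_path = os.path.join(tmpdir, "demo.pyc")

with open(src_path, "w") as f:
    f.write("total = sum(range(5))\n")

# Compile the source and write the cache file to an explicit location.
py_compile.compile(src_path, cfile=pyc_path, doraise=True)

# Plain sequential read of the file: no mmap involved.
with open(pyc_path, "rb") as f:
    header = f.read(HEADER_SIZE)
    code = marshal.loads(f.read())

# Running the unmarshalled code object fills in a namespace dict.
namespace = {}
exec(code, namespace)

print(isinstance(code, types.CodeType))  # True
print(namespace["total"])                # 10
```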

    -sjbrown
     
    sjbrown, Feb 12, 2009
    #4
