mmap 2GB allocation limit on Win XP, 32-bits, Python 2.5.4

Discussion in 'Python' started by Slaunger, Jul 24, 2009.

  1. Slaunger

    Slaunger Guest

    OS: Win XP SP3, 32 bit
    Python 2.5.4

    Hi, I have run into some problems with allocating numpy.memmaps
    exceeding an accumulated size of about 2 GB. I have found out that
    the real problem relates to numpy.memmap using mmap.mmap.

    I've written a small test program to illustrate it:

    import itertools
    import mmap
    import os

    files = []
    mmaps = []
    file_names = []
    mmap_cap = 0
    bytes_per_mmap = 100 * 1024 ** 2
    try:
        for i in itertools.count(1):
            file_name = "d:/%d.tst" % i
            file_names.append(file_name)
            f = open(file_name, "w+b")
            files.append(f)
            mm = mmap.mmap(f.fileno(), bytes_per_mmap)
            mmaps.append(mm)
            mmap_cap += bytes_per_mmap
            print "Created %d writeable mmaps containing %d MB" % (
                i, mmap_cap / (1024 ** 2))
    # Clean up
    finally:
        print "Removing mmaps..."
        for mm, f, file_name in zip(mmaps, files, file_names):
            mm.close()
            f.close()
            os.remove(file_name)
        print "Done..."


    which creates this output

    Created 1 writeable mmaps containing 100 MB
    Created 2 writeable mmaps containing 200 MB
    .....
    Created 17 writeable mmaps containing 1700 MB
    Created 18 writeable mmaps containing 1800 MB
    Removing mmaps...
    Done...
    Traceback (most recent call last):
    File "C:\svn-sandbox\research\scipy\scipy\src\com\terma\kha
    \mmaptest.py", line 16, in <module>
    mm = mmap.mmap(f.fileno(), bytes_per_mmap)
    WindowsError: [Error 8] Not enough storage is available to process
    this command

    There is more than 25 GB of free space on drive d: at this stage.

    Is it a bug or a "feature" of the 32 bit OS?

    I am surprised about it as I have not found any notes about these
    kinds of limitations in the documentation.

    I am in dire need of these large memmaps for my task, and it is not an
    option to change OS due to other constraints in the system.

    Is there anything I can do about it?

    Best wishes,
    Kim
     
    Slaunger, Jul 24, 2009
    #1

  2. Slaunger schrieb:
    > [original message and test program snipped]
    >
    > There is more than 25 GB of free space on drive d: at this stage.
    >
    > Is it a bug or a "feature" of the 32 bit OS?


    It's a limitation, yes. That's what 64-bit OSes are for.

    > I am surprised about it as I have not found any notes about these
    > kinds of limitations in the documentation.
    >
    > I am in dire need of these large memmaps for my task, and it is not an
    > option to change OS due to other constraints in the system.
    >
    > Is there anything I can do about it?


    Only by partitioning the data yourself and accessing the partitions
    one at a time, like in the good old days of DOS programming.
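    The partitioning idea can be sketched roughly like this: instead of
    mapping a whole multi-GB file, map one window at a time using mmap's
    offset argument. (The offset keyword only exists from Python 2.6 on,
    so on 2.5 the window would have to be emulated with seek/read; the
    function name here is purely illustrative.)

```python
import mmap


def read_window(path, offset, length):
    """Map only a `length`-byte window of a big file instead of the
    whole thing, keeping the address-space footprint small.

    `offset` must be a multiple of mmap.ALLOCATIONGRANULARITY; the
    `offset` keyword is available from Python 2.6 onwards.
    """
    f = open(path, "rb")
    try:
        mm = mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ,
                       offset=offset)
        try:
            # Copy the window out; the mapping itself is released
            # immediately, so only one window occupies address space.
            return mm[:length]
        finally:
            mm.close()
    finally:
        f.close()
```

    Each call reserves only `length` bytes of address space rather than
    the full file size, so many windows can be visited sequentially
    without ever approaching the 2 GB wall.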

    Diez
     
    Diez B. Roggisch, Jul 24, 2009
    #2

  3. >>>>> Slaunger <> (S) wrote:

    >S> OS: Win XP SP3, 32 bit
    >S> Python 2.5.4


    >S> Hi I have run into some problems with allocating numpy.memmaps
    >S> exceeding and accumulated size of about 2 GB. I have found out that
    >S> the real problem relates to numpy.memmap using mmap.mmap


    On Windows XP the virtual address space of a process is limited to 2 GB
    unless the /3GB switch is used in the Boot.ini file.
    http://www.microsoft.com/whdc/system/platform/server/PAE/PAEmem.mspx
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Jul 24, 2009
    #3
  4. Slaunger

    Dave Angel Guest

    Slaunger wrote:
    > [original message and test program snipped]
    >
    > There is more than 25 GB of free space on drive d: at this stage.
    >
    > Is it a bug or a "feature" of the 32 bit OS?
    >
    > Is there anything I can do about it?
    >

    It's not a question of how much disk space there is, but how much
    virtual space 32 bits can address. 2**32 is about 4 gig, and Windows XP
    reserves about half of that for system use. Presumably a 64 bit OS
    would have a much larger limit.

    Years ago I worked on a Sun Sparc system which had much more limited
    shared memory access, due to hardware limitations. So 2 GB seems pretty
    good to me.

    There is supposed to be a way to tell the Windows OS to only use 1 gb of
    virtual space, leaving 3gb for application use. But there are some
    limitations, and I don't recall what they are. I believe it has to be
    done globally (probably in Boot.ini), rather than per process. And some
    things didn't work in that configuration.
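    A quick back-of-the-envelope check of the numbers above (assuming the
    default XP 2 GiB user / 2 GiB kernel split):

```python
# 32-bit pointers can address 2**32 bytes; XP reserves half for the kernel.
total_address_space = 2 ** 32
user_space = total_address_space // 2      # 2 GiB by default

# Theoretical number of 100 MiB mappings that fit in the user half.
mmaps = user_space // (100 * 1024 ** 2)
print(mmaps)  # 20 in theory; the interpreter, DLLs and heap eat a couple,
              # consistent with the traceback failing after 18 mappings
```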

    DaveA
     
    Dave Angel, Jul 24, 2009
    #4
  5. Slaunger

    Dave Angel Guest

    (forwarding this message, as the reply was off-list)
    Kim Hansen wrote:
    > 2009/7/24 Dave Angel <>:
    >
    >> [Dave's explanation of the 32-bit address-space limit snipped]

    > Hi Dave,
    >
    > In the related post I did on the numpy discussions:
    >
    > http://article.gmane.org/gmane.comp.python.numeric.general/31748
    >
    > another user was kind enough to run my test program on both 32 bit and
    > 64 bit machines. On the 64 bit machine, there was no such limit, very
    > much in line with what you wrote. Adding the /3GB option in boot.ini
    > did not increase the available memory either. Apparently, Python
    > needs to have been compiled in a way which makes it possible to take
    > advantage of that switch, and that is either not the case or I did
    > something else wrong.
    >
    > I acknowledge the explanation concerning the address space available.
    > Being an ignorant of the inner details of the implementation of mmap,
    > it seems like somewhat an "implementation detail" to me that such an
    > address wall is hit. There may be some good arguments from a
    > programming point of view and it may be a relatively high limit as
    > compared to other systems but it is certainly at the low side for my
    > application: I work with data files typically 200 GB in size
    > consisting of datapackets each having a fixed size frame and a
    > variable size payload. To handle these large files, I generate an
    > "index" file consisting of just the frames (which has all the metadata
    > I need for finding the payloads I am interested in) and "pointers" to
    > where in the large data file each payload begins. This index file can
    > be up to 1 GB in size and at times I need to have access to two of
    > those at the same time (and then I hit the address wall). I would
    > really really like to be able to access these index files in a
    > read-only manner as an array of records on a file for which I use
    > numpy.memmap (which wraps mmap.mmap) such that I can pick a single
    > element, extract, e.g., every thousandth value of a specific field in
    > the record using the convenient indexing available in Python/numpy.
    > Now it seems like I have to resort to making my own encapsulation
    > layer, which seeks to the relevant place in the file, reads sections
    > as bytestrings into recarrays, etc. Well, I must just get on with
    > it...
    >
    > I think it would be worthwhile specifying this 32 bit OS limitation in
    > the documentation of mmap.mmap, as I doubt I am the only one being
    > surprised about this address space limitation.
    >
    > Cheers,
    > Kim
    >
    >
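
    The seek-and-read encapsulation layer Kim describes could be sketched
    along these lines (a rough sketch only: the index-record layout below
    is made up, since the real frame format is not shown in the thread):

```python
import struct

# Hypothetical fixed-size index record: a frame id (uint32) plus the
# byte offset of its payload in the big data file (uint64). The real
# record layout would be substituted here.
RECORD = struct.Struct("<IQ")


def read_records(f, start, count, step=1):
    """Read every `step`-th record from an open index file by seeking,
    so no large address-space reservation is needed -- unlike mapping
    the whole 1 GB index with mmap."""
    out = []
    for i in range(start, start + count * step, step):
        f.seek(i * RECORD.size)
        data = f.read(RECORD.size)
        if len(data) < RECORD.size:
            break  # ran off the end of the index
        out.append(RECORD.unpack(data))
    return out
```

    Picking "every thousandth value" then becomes `read_records(f, 0, n,
    step=1000)`, at the cost of one seek per record instead of numpy's
    fancy indexing over a memmap.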

    I agree that some description of system limitations should be included
    in a system-specific document. There probably is one, I haven't looked
    recently. But I don't think it belongs in mmap documentation.

    Perhaps you still don't recognize what the limit is. 32 bits can only
    address 4 gigabytes of things as first-class addresses. So roughly the
    same limit that's on mmap is also on list, dict, bytearray, or anything
    else. If you had 20 lists taking 100 meg each, you would fill up
    memory. If you had 10 of them, you might have enough room for a 1gb
    mmap area. And your code takes up some of that space, as well as the
    Python interpreter, the standard library, and all the data structures
    that are normally ignored by the application developer.

    BTW, there is one difference between mmap and most of the other
    allocations. Most data is allocated out of the swapfile, while mmap is
    allocated from the specified file (unless you use -1 for fileno).
    Consequently, if the swapfile is already clogged with all the other
    running applications, you can still take your 1.8gb or whatever of your
    virtual space, when much less than that might be available for other
    kinds of allocations.

    Executables and dlls are also (mostly) mapped into memory just the same
    as mmap. So they tend not to take up much space from the swapfile. In
    fact, with planning, a DLL needn't take up any swapfile space (well, a
    few K is always needed, realistically). But that's a linking issue for
    compiled languages.

    DaveA
     
    Dave Angel, Jul 27, 2009
    #5
  6. Slaunger

    Slaunger Guest

    On 27 Jul., 13:21, Dave Angel <> wrote:
    > (forwarding this message, as the reply was off-list)
    > [Kim's forwarded reply and Dave's explanation snipped]


    I do understand the 2 GB address space limitation. However, I think I
    have found a solution to my original numpy.memmap problem (which spun
    off into this problem), and that is PyTables, where I can address 2^64
    bytes of data on a 32 bit machine using HDF5 files, thus circumventing
    the "implementation detail" of the intermediate 2^32 memory address
    limit in the numpy.memmap/mmap.mmap implementation.

    http://www.pytables.org/moin

    I just watched the first tutorial video, and that seems like just what
    I am after (if it works as well in practice as it appears to do).

    http://showmedo.com/videos/video?name=1780000&fromSeriesID=178

    Cheers,
    Kim
     
    Slaunger, Jul 27, 2009
    #6
