Sequential Object Store

Discussion in 'Python' started by GZ, Aug 7, 2010.

  1. GZ

    GZ Guest

    Hi All,

    I need to store a large number of large objects in a file and then
    access them sequentially. I am talking about a few thousand objects,
    each a few hundred kilobytes in size, for a total file size of a few
    gigabytes. I tried shelve, but it is not good at accessing the data
    sequentially. In essence, shelve.keys() takes forever.

    I am wondering if there is a module that can persist a stream of
    objects without having to load everything into memory. (For this
    reason, I think Pickle is out, too, because it needs everything to be
    in memory.)

    Thanks,
    GZ
     
    GZ, Aug 7, 2010
    #1

  2. Alex Willmer

    Alex Willmer Guest

    On Aug 7, 5:26 pm, GZ wrote:
    > I am wondering if there is a module that can persist a stream of
    > objects without having to load everything into memory. (For this
    > reason, I think Pickle is out, too, because it needs everything to be
    > in memory.)


    From the pickle docs it looks like you could do something like:

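    try:
        import cPickle as pickle  # faster C implementation on Python 2
    except ImportError:
        import pickle

    file_obj = open('whatever', 'wb')
    p = pickle.Pickler(file_obj)

    for x in stream_of_objects:
        p.dump(x)
        # Reset the memo so the pickler does not keep references
        # to every object it has already written.
        p.memo.clear()

    del p
    file_obj.close()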

    Then later:

    file_obj = open('whatever', 'rb')
    p = pickle.Unpickler(file_obj)

    while True:
        try:
            x = p.load()
            do_something_with(x)
        except EOFError:
            # load() raises EOFError once the file is exhausted
            break

    file_obj.close()

    Your loading loop could be wrapped in a generator function, so only
    one object should be held in memory at once.
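
    The generator idea above might look roughly like the following. This
    is only a sketch, assuming Python 3 and the standard pickle module;
    the names dump_stream and load_stream are made up for illustration
    and do not come from the pickle docs. It uses clear_memo() where the
    snippet above uses p.memo.clear().

```python
import pickle


def dump_stream(path, objects):
    """Write each object as its own pickle record, one after another."""
    with open(path, "wb") as file_obj:
        pickler = pickle.Pickler(file_obj, protocol=pickle.HIGHEST_PROTOCOL)
        for obj in objects:
            pickler.dump(obj)
            # Drop the memo so the pickler does not hold on to
            # every object it has already written.
            pickler.clear_memo()


def load_stream(path):
    """Yield the stored objects one at a time, in the order written."""
    with open(path, "rb") as file_obj:
        unpickler = pickle.Unpickler(file_obj)
        while True:
            try:
                obj = unpickler.load()
            except EOFError:
                # No more records in the file.
                return
            yield obj
```

    With that in place, the reading side becomes a plain loop such as
    "for x in load_stream('whatever'): do_something_with(x)". Keeping
    the load() call alone inside the try block means an EOFError raised
    by your own processing code is not mistaken for the end of the file.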
     
    Alex Willmer, Aug 8, 2010
    #2

  3. GZ

    GZ Guest

    Hi Alex,

    On Aug 7, 6:54 pm, Alex Willmer wrote:
    > On Aug 7, 5:26 pm, GZ wrote:
    >
    > > I am wondering if there is a module that can persist a stream of
    > > objects without having to load everything into memory. (For this
    > > reason, I think Pickle is out, too, because it needs everything to be
    > > in memory.)

    >
    > From the pickle docs it looks like you could do something like:
    >
    > try:
    >     import cPickle as pickle
    > except ImportError:
    >     import pickle
    >
    > file_obj = open('whatever', 'wb')
    > p = pickle.Pickler(file_obj)
    >
    > for x in stream_of_objects:
    >     p.dump(x)
    >     p.memo.clear()
    >
    > del p
    > file_obj.close()
    >
    > then later
    >
    > file_obj = open('whatever', 'rb')
    > p = pickle.Unpickler(file_obj)
    >
    > while True:
    >     try:
    >         x = p.load()
    >         do_something_with(x)
    >     except EOFError:
    >         break
    >
    > Your loading loop could be wrapped in a generator function, so only
    > one object should be held in memory at once.


    This totally works!

    Thanks!
     
    GZ, Aug 9, 2010
    #3
