Segmenting a pickle stream without unpickling

Discussion in 'Python' started by Boris Borcic, May 19, 2006.

  1. Boris Borcic

    Boris Borcic Guest

    Assuming that the items of my_stream share no content (they are
    dumps of db cursor fetches), is there a simple way to do the
    equivalent of

    def pickles(my_stream) :
        from cPickle import load,dumps
        while 1 :
            yield dumps(load(my_stream))

    without the overhead associated with unpickling objects
    just to pickle them again ?

    TIA, Boris Borcic
     
    Boris Borcic, May 19, 2006
    #1

  2. Paul Rubin

    Paul Rubin Guest

    Boris Borcic writes:
    > def pickles(my_stream) :
    >     from cPickle import load,dumps
    >     while 1 :
    >         yield dumps(load(my_stream))
    >
    > without the overhead associated with unpickling objects
    > just to pickle them again ?


    I think you'd have to write something special. The unpickler parses
    as it goes along, and all the dispatch actions build up objects.
    You'd have to write a set of actions that just read past the
    representations. I think there's no way to know where an object ends
    without parsing it, including parsing any objects nested inside it.
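
A minimal sketch of why the end of a pickle cannot be found by scanning for its terminator. It assumes Python 3 and the standard pickle and pickletools modules; the helper name end_of_first_pickle is made up for this illustration and is not part of any library. The STOP opcode is the byte ".", but the same byte can also occur inside a string payload, so only an opcode-level walk finds the real boundary.

```python
import io
import pickle
import pickletools


def end_of_first_pickle(data):
    """Return the offset just past the STOP opcode of the first pickle in data."""
    stream = io.BytesIO(data)
    # genops() decodes opcodes one by one without building any objects,
    # and stops iterating right after it has read the STOP opcode.
    for _opcode, _arg, _pos in pickletools.genops(stream):
        pass
    return stream.tell()


# Two pickles back to back; the first one carries a "." in its payload.
first = pickle.dumps("a.b", protocol=1)
second = pickle.dumps([3, 4], protocol=1)
data = first + second

naive_end = data.index(b".") + 1         # lands inside the string payload
parsed_end = end_of_first_pickle(data)   # lands on the real boundary
```

Here parsed_end equals len(first), while naive_end falls short of it, so cutting at the first "." byte would yield a truncated pickle.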
     
    Paul Rubin, May 19, 2006
    #2

  3. Tim Peters

    Tim Peters Guest

    [Boris Borcic]
    > Assuming that the items of my_stream share no content (they are
    > dumps of db cursor fetches), is there a simple way to do the
    > equivalent of
    >
    > def pickles(my_stream) :
    >     from cPickle import load,dumps
    >     while 1 :
    >         yield dumps(load(my_stream))
    >
    > without the overhead associated with unpickling objects
    > just to pickle them again ?


    cPickle (but not pickle.py) Unpickler objects have a barely documented
    noload() method. This "acts like" load(), except that it doesn't import
    modules or construct objects of user-defined classes. The return
    value of noload() is undocumented and usually useless. ZODB uses it a
    lot ;-)

    Anyway, that can go much faster than load(), and works even if the
    classes and modules referenced by pickles aren't available in the
    unpickling environment. It doesn't return the individual pickle
    strings, but they're easy to get at by paying attention to the file
    position between noload() calls. For example,

    import cPickle as pickle
    import os

    # Build a pickle file with 4 pickles.

    PICKLEFILE = "temp.pck"

    class C:
        pass

    f = open(PICKLEFILE, "wb")
    p = pickle.Pickler(f, 1)

    p.dump(2)
    p.dump([3, 4])
    p.dump(C())
    p.dump("all done")

    f.close()

    # Now use noload() to extract the 4 pickle
    # strings in that file.

    f = open(PICKLEFILE, "rb")
    limit = os.path.getsize(PICKLEFILE)
    u = pickle.Unpickler(f)
    pickles = []
    pos = 0
    while pos < limit:
        u.noload()
        thispos = f.tell()
        f.seek(pos)
        pickles.append(f.read(thispos - pos))
        pos = thispos

    from pprint import pprint
    pprint(pickles)

    That prints a list containing the 4 pickle strings:

    ['K\x02.',
     ']q\x01(K\x03K\x04e.',
     '(c__main__\nC\nq\x02o}q\x03b.',
     'U\x08all doneq\x04.']

    You could do much the same by calling pickletools.dis() and ignoring
    its output, but that's likely to be slower.
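
The pickletools route can be sketched as follows, for readers on Python 3, where there is no cPickle module. This is an illustration under stated assumptions, not the method used above: it relies on pickletools.genops() instead of noload(), and the function name iter_pickle_strings and its limit parameter are invented for the example. genops() is pure Python, so it is likely to be slower than a C-level scan. As in the original question, the items are assumed to share no content.

```python
import io
import pickle
import pickletools


def iter_pickle_strings(stream, limit):
    """Yield each raw pickle found in a seekable binary stream.

    No objects are constructed; opcodes are only decoded. `limit` is the
    offset at which the stream ends (for a file, its size in bytes).
    """
    pos = stream.tell()
    while pos < limit:
        # Walk the opcodes of one pickle; iteration ends after STOP.
        for _ in pickletools.genops(stream):
            pass
        end = stream.tell()
        # Go back and read the bytes that made up this pickle.
        stream.seek(pos)
        yield stream.read(end - pos)
        pos = end


# Build an in-memory stream holding three independent pickles.
buf = io.BytesIO()
for obj in (2, [3, 4], "all done"):
    pickle.dump(obj, buf, protocol=1)
raw = buf.getvalue()

pieces = list(iter_pickle_strings(io.BytesIO(raw), len(raw)))
```

Each element of pieces is a standalone pickle that can be passed to pickle.loads() on its own, and joining the pieces reproduces the original stream.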
     
    Tim Peters, May 19, 2006
    #3
