a pickle's pickle

Discussion in 'Python' started by temposs@gmail.com, Aug 2, 2005.

  1. Guest

    I'm trying to pickle a class instance, and while I get no errors,
    almost none of the instance gets pickled, and I don't know why.
    Here's the pickled output:

    (i__main__
    TrainingMatrix
    p0
    (dp1
    S'matrixWords'
    p2
    I4714
    sS'numWords'
    p3
    I4714
    sS'totalWordsProcessed'
    p4
    I46735
    sS'numContexts'
    p5
    I7664
    sS'estimator'
    p6
    (dp7
    sb.

    --End of output

    The class TrainingMatrix has no embedded classes and none of its
    methods have embedded methods. An instance of this class running in my
    program for about 10 minutes can build up on the order of 100MB in
    resident memory, but the output seems to be the same regardless of the
    data set size. The output seems to not even capture all of the member
    variables in the class. Here is the class code, abridged:

    class TrainingMatrix:
        matrix = []
        estimator = {}
        wordInfo = {}
        contextInfo = {}
        totalWordsProcessed = 0
        numWords = 0
        numContexts = 0
        matrixWords = 0

        def AddWordInfo(self,newWordInfo,newCapScheme):
            ...
        #End AddWordInfo

        def AddNewWord(self,newCapScheme):
            ...
        #End AddNewCapScheme

        def AddContext(self,newContext):
            ...
        #End AddContext

        def AddInstance(self,word,context):
            ...
        #End AddInstance

        def UpdateMatrix(self,wordIndex,contextIndex,isLowerCase):
            ...
        #End UpdateMatrix

        def PrintMatrix(self):
            ...
        #End PrintMatrix

        def EstimateLowerCase(self):
            ...
        #End GetNumWords

        def GetWordInfo(self,wordToFind):
            ...
        #End GetWordInfo

        def GetContext(self,wordList,direction):
            ...
        #End GetContext

        def GetBestCapScheme(self,wordInfo,precedeContext,followContext):
            ...
        #End GetBestCapScheme

        def IsLowerCase(self,word):
            ...
        #End IsLowerCase

    #End TrainingMatrix
    ###################

    And here is the pickling code:

    try:
        trainDB = open(trainDBString,"r+")
    except IOError:
        trainDB = open(trainDBString,"w")
        trainDB.close()
        trainDB = open(trainDBString,"r+")
    #End try
    ....
    try:
        trainerString = trainDB.read()
        trainer = loads(trainerString)

    except EOFError:
        trainer = TrainingMatrix()
    #End try
    ....
    trainerString = dumps(trainer)
    trainDB.write(trainerString)
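
Separately from the missing data, the file handling above can be simplified. Reading the whole file and then writing through the same "r+" handle without seeking back can leave the new pickle appended after the old one, and loads() only decodes the first pickle it finds. A minimal Python 3 sketch of a load-or-create pattern follows; the helper names load_trainer and save_trainer, the stub class body, and the temporary path are assumptions for illustration and are not from the thread.

```python
import os
import pickle
import tempfile


class TrainingMatrix:
    """Stub standing in for the real class; only one attribute is shown."""

    def __init__(self):
        self.matrix = []


def load_trainer(path):
    """Return the pickled trainer at path, or a fresh one if none exists."""
    try:
        # Pickle data is bytes, so the file is opened in binary mode.
        with open(path, "rb") as f:
            return pickle.load(f)
    except (FileNotFoundError, EOFError):
        # Missing file or empty file: start from scratch.
        return TrainingMatrix()


def save_trainer(trainer, path):
    """Overwrite path with a single pickle of trainer."""
    with open(path, "wb") as f:
        pickle.dump(trainer, f)


# Hypothetical location for the database file.
path = os.path.join(tempfile.mkdtemp(), "train.db")

trainer = load_trainer(path)      # no file yet, so a new instance
trainer.matrix.append([1, 0])
save_trainer(trainer, path)

reloaded = load_trainer(path)
print(reloaded.matrix)
```

Opening with "wb" truncates the file on every save, so it never holds more than one pickle.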

    I've also tried a simple shelve implementation but got results similar
    to this, which is why I recoded to pickle, since it's lower level. Any
    help is appreciated :)

    -Andrew
    , Aug 2, 2005
    #1

  2. Magnus Lycka Guest

    wrote:

    > class TrainingMatrix:
    > matrix = []
    > estimator = {}
    > wordInfo = {}
    > contextInfo = {}
    > totalWordsProcessed = 0
    > numWords = 0
    > numContexts = 0
    > matrixWords = 0


    Is there some confusion between the scope of the class
    object and the scopes of the instance objects perhaps?

    Are you aware of this distinction? See below:

    >>> class X:
    ...     m=0
    ...
    >>> X.m
    0
    >>> x=X()
    >>> x.m
    0
    >>> x.m += 5
    >>> x.m
    5
    >>> X.m
    0
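
The session above uses an integer, where `x.m += 5` rebinds the name on the instance. A list or dict behaves differently: an in-place call such as append() mutates the object stored on the class and never creates an instance attribute. That is likely why the integer counters show up in the pickle output while matrix, wordInfo, and contextInfo do not. A minimal sketch, using a hypothetical class Shared that is not from the thread:

```python
class Shared:
    count = 0    # immutable class attribute
    items = []   # mutable class attribute, shared by every instance


a = Shared()
a.count += 1            # rebinding: creates an instance attribute on a
a.items.append("word")  # in-place mutation: changes the class's list

print(a.__dict__)       # {'count': 1}
print(Shared.count)     # 0
print(Shared.items)     # ['word']
print(Shared().items)   # ['word']
```

Only what is in `a.__dict__` belongs to the instance, and by default that dictionary is what pickle saves.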

    Are you pickling the class object or an instance?

    If you are pickling the class: Why? Is the data
    really in the class object?

    If you are pickling an instance:
    Is the data in the class object?
    Is the data in another instance object?
    Magnus Lycka, Aug 2, 2005
    #2

  3. Guest

    I intended to pickle the class instance I call trainer...from my code,
    also in the first post:

    try:
        trainerString = trainDB.read()
        trainer = loads(trainerString)

    except EOFError:
        trainer = TrainingMatrix()
    ....
    trainerString = dumps(trainer)
    ....

    So basically trainer always gets an existing pickled TrainingMatrix
    (the class) object if there is a file to read from; otherwise it just
    makes a new instance. Either way, the instance trainer is pickled at
    the end. Maybe I'm missing something...
    , Aug 2, 2005
    #3
  4. Benji York Guest

    wrote:
    > So basically trainer always gets an existing TrainingMatrix(the class)
    > pickled object if there is a file to read from, otherwise it just makes
    > a new instance. Either way, the instance trainer is pickled at the end.


    Right, but the data you're interested in is contained in the class, not
    the instance. You need to move the mutable class attributes into the
    instance. Like so:

    class TrainingMatrix:

        totalWordsProcessed = 0
        numWords = 0
        numContexts = 0
        matrixWords = 0

        def __init__(self):
            self.matrix = []
            self.estimator = {}
            self.wordInfo = {}
            self.contextInfo = {}
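
The effect of this change can be checked with a small round trip. The sketch below is written for Python 3 and is not from the thread; the classes ClassLevel and InstanceLevel and the helper round_trip are hypothetical stand-ins for the before and after versions of TrainingMatrix.

```python
import pickle


class ClassLevel:
    matrix = []                  # container lives on the class

    def add(self, row):
        self.matrix.append(row)  # mutates ClassLevel.matrix in place


class InstanceLevel:
    def __init__(self):
        self.matrix = []         # container lives on the instance

    def add(self, row):
        self.matrix.append(row)


def round_trip(obj):
    """Pickle obj to bytes and load it straight back."""
    return pickle.loads(pickle.dumps(obj))


old = ClassLevel()
old.add([1, 2, 3])
new = InstanceLevel()
new.add([1, 2, 3])

print(vars(old))   # {}  nothing on the instance for pickle to save
print(vars(new))   # {'matrix': [[1, 2, 3]]}

old_copy = round_trip(old)
new_copy = round_trip(new)

# Simulate a fresh interpreter, where the class attribute starts out empty.
ClassLevel.matrix = []

print(old_copy.matrix)   # []
print(new_copy.matrix)   # [[1, 2, 3]]
```

Within a single process the class-level version can appear to work, because the unpickled object still reads the list from the class; the data is lost only once the program restarts.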
    --
    Benji York
    Benji York, Aug 2, 2005
    #4
  5. Guest

    Benji,
    Thanks so much, you have saved the day ^_^
    , Aug 2, 2005
    #5
