temposs
I'm trying to pickle a class instance, and while I get no errors,
almost none of the instance gets pickled, and I don't know
why. Here's the pickled output:
(i__main__
TrainingMatrix
p0
(dp1
S'matrixWords'
p2
I4714
sS'numWords'
p3
I4714
sS'totalWordsProcessed'
p4
I46735
sS'numContexts'
p5
I7664
sS'estimator'
p6
(dp7
sb.
--End of output
The class TrainingMatrix has no nested classes, and none of its
methods have nested functions. An instance of this class running in my
program for about 10 minutes can build up on the order of 100MB in
resident memory, but the pickled output stays the same regardless of the
data set size. The output does not even capture all of the member
variables in the class. Here is the class code, abridged:
class TrainingMatrix:
    matrix = []
    estimator = {}
    wordInfo = {}
    contextInfo = {}
    totalWordsProcessed = 0
    numWords = 0
    numContexts = 0
    matrixWords = 0

    def AddWordInfo(self,newWordInfo,newCapScheme):
        ...
    #End AddWordInfo

    def AddNewWord(self,newCapScheme):
        ...
    #End AddNewCapScheme

    def AddContext(self,newContext):
        ...
    #End AddContext

    def AddInstance(self,word,context):
        ...
    #End AddInstance

    def UpdateMatrix(self,wordIndex,contextIndex,isLowerCase):
        ...
    #End UpdateMatrix

    def PrintMatrix(self):
        ...
    #End PrintMatrix

    def EstimateLowerCase(self):
        ...
    #End GetNumWords

    def GetWordInfo(self,wordToFind):
        ...
    #End GetWordInfo

    def GetContext(self,wordList,direction):
        ...
    #End GetContext

    def GetBestCapScheme(self,wordInfo,precedeContext,followContext):
        ...
    #End GetBestCapScheme

    def IsLowerCase(self,word):
        ...
    #End IsLowerCase
#End TrainingMatrix
###################
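For what it's worth, a stripped-down sketch of the same pattern appears to reproduce the symptom. The names Demo, Fixed and add_word are hypothetical stand-ins, and the sketch assumes the abridged methods only mutate the containers in place (e.g. with append) while rebinding the counters (e.g. with +=); the elided method bodies may well differ. By default pickle saves an instance's __dict__, so the sketch inspects and round-trips that dictionary directly:

```python
import pickle


class Demo(object):
    """Hypothetical stand-in for TrainingMatrix with class-level attributes."""
    matrix = []      # class attribute, shared by every instance
    numWords = 0     # class attribute

    def add_word(self, word):
        self.matrix.append(word)  # mutates the class-level list in place
        self.numWords += 1        # rebinding creates an instance attribute


class Fixed(object):
    """Same idea, but the attributes are created per instance in __init__."""

    def __init__(self):
        self.matrix = []
        self.numWords = 0

    def add_word(self, word):
        self.matrix.append(word)
        self.numWords += 1


d = Demo()
d.add_word("hello")
f = Fixed()
f.add_word("hello")

# vars(obj) is the instance __dict__, which is what default pickling stores.
demo_state = pickle.loads(pickle.dumps(vars(d)))
fixed_state = pickle.loads(pickle.dumps(vars(f)))

print(sorted(demo_state))   # the list is missing from the instance state
print(sorted(fixed_state))  # both attributes are present
```

In the first class the list lives only on the class object, so it never reaches the instance dictionary, which would match the pickled output above containing the integer counters but not matrix, wordInfo or contextInfo.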
And here is the pickling code:
try:
    trainDB = open(trainDBString,"r+")
except IOError:
    trainDB = open(trainDBString,"w")
    trainDB.close()
    trainDB = open(trainDBString,"r+")
#End try
....
try:
    trainerString = trainDB.read()
    trainer = loads(trainerString)
except EOFError:
    trainer = TrainingMatrix()
#End try
....
trainerString = dumps(trainer)
trainDB.write(trainerString)
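One thing that may be worth checking in the code above: after read() has consumed the file in "r+" mode, the following write() would most likely land after the old data rather than replace it, so a later loads() could keep returning the first pickle in the file. A minimal sketch of an alternative, assuming it is acceptable to open the file separately for reading and writing; the helper names load_or_create and save are hypothetical and not part of the original program:

```python
import os
import pickle


def load_or_create(path, factory):
    """Return the object pickled at path, or factory() if the file is missing or empty."""
    if os.path.exists(path) and os.path.getsize(path) > 0:
        with open(path, "rb") as f:   # binary mode for pickle data
            return pickle.load(f)
    return factory()


def save(path, obj):
    """Overwrite path with a fresh pickle of obj."""
    with open(path, "wb") as f:       # "wb" truncates, so old data is replaced
        pickle.dump(obj, f)
```

With these, the flow would look like trainer = load_or_create(trainDBString, TrainingMatrix) at startup and save(trainDBString, trainer) at the end.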
I've also tried a simple shelve implementation but got similar
results, which is why I recoded to pickle, since it's lower level. Any
help is appreciated.
-Andrew