Please explain collections.defaultdict(lambda: 1)


metaperl.com

I'm reading http://norvig.com/spell-correct.html

and do not understand the expression listed in the subject which is
part of this function:

import collections

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model


Per http://docs.python.org/lib/defaultdict-examples.html

It seems that there is a default factory which initializes each key to
1. So by the end of train(), each member of the dictionary model will
have value >= 1

But why wouldn't he set the value to zero and then increment it each
time a "feature" (actually a word) is encountered? It seems that each
model value would be 1 more than it should be.
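To make the behaviour being asked about concrete, here is a small runnable illustration (the tiny word list is made up for the example, not from the article):

```python
import collections

def train(features):
    # Each new key starts at 1 (the lambda's return value), so a word
    # seen n times in the training text ends up with count n + 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

model = train(["spam", "spam", "eggs"])
print(model["spam"])   # 3: seen twice, plus the default of 1
print(model["eggs"])   # 2: seen once, plus the default of 1
print(model["novel"])  # 1: never seen; the lookup itself creates the entry
```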
 

Paul McGuire

> I'm reading http://norvig.com/spell-correct.html
>
> and do not understand the expression listed in the subject which is
> part of this function:
>
> def train(features):
>     model = collections.defaultdict(lambda: 1)
>     for f in features:
>         model[f] += 1
>     return model
>
> Per http://docs.python.org/lib/defaultdict-examples.html
>
> It seems that there is a default factory which initializes each key to
> 1. So by the end of train(), each member of the dictionary model will
> have value >= 1
>
> But why wouldn't he set the value to zero and then increment it each
> time a "feature" (actually a word) is encountered? It seems that each
> model value would be 1 more than it should be.

The explanation is a little further down on that same page, in the
discussion of "novel" words and avoiding the probability of them being
0 just because they have not yet been seen in the training text.
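A sketch of the contrast Paul is pointing at, comparing a zero default with a default of 1 (the training sentence is invented for the example; this is not Norvig's actual code):

```python
import collections

corpus = "the cat sat on the mat".split()

# What a zero default would give: unseen words score 0.
zero_model = collections.defaultdict(int)
for word in corpus:
    zero_model[word] += 1

# What Norvig's default of 1 gives: unseen words score 1.
one_model = collections.defaultdict(lambda: 1)
for word in corpus:
    one_model[word] += 1

print(zero_model["zebra"])  # 0 -> a novel word could never be chosen
print(one_model["zebra"])   # 1 -> a novel word is unlikely, not impossible
```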

-- Paul
 

Duncan Booth

metaperl.com said:
> Per http://docs.python.org/lib/defaultdict-examples.html
>
> It seems that there is a default factory which initializes each key to
> 1. So by the end of train(), each member of the dictionary model will
> have value >= 1
>
> But why wouldn't he set the value to zero and then increment it each
> time a "feature" (actually a word) is encountered? It seems that each
> model value would be 1 more than it should be.

The author explains his reasoning in the article: he wants to treat novel
words (i.e. those which did not appear in the training corpus) as having
been seen once.
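In other words, treating novel words as seen once is a crude form of smoothing: when the most likely candidate is picked by count, a novel word still carries a usable nonzero score (the candidate words below are illustrative, not taken from the article):

```python
import collections

# Train counts with the default of 1, as in the article's train().
model = collections.defaultdict(lambda: 1)
for word in "the quick brown fox the lazy dog the".split():
    model[word] += 1

# Choosing between a trained word and a novel one: the trained word
# wins, but the novel word is still comparable rather than zeroed out.
candidates = ["the", "teh"]  # "teh" never appeared in training
best = max(candidates, key=lambda w: model[w])
print(best)          # "the" (count 4: seen 3 times, plus the default)
print(model["teh"])  # 1
```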
 
