I was checking out Peter Norvig’s code on how to write simple spell checkers. At the beginning, he uses this code to insert words into a dictionary.
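import collections
def train(features):
    # Every unseen key starts at 1, so each word's count ends up as occurrences + 1.
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model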
What is the difference between a Python dict and the one that was used here? In addition, what is the lambda for? I checked the API documentation here and it says that defaultdict is actually derived from dict but how does one decide which one to use?
The difference is that a defaultdict will “default” a value if that key has not been set yet. If you didn’t use a defaultdict you’d have to check to see if that key exists, and if it doesn’t, set it to what you want.
The lambda is defining a factory for the default value. That function gets called whenever it needs a default value. You could hypothetically have a more complicated default function.
(from help(type(collections.defaultdict())))
{}.setdefault is similar in nature, but takes in a value instead of a factory function. It’s used to set the value if it doesn’t already exist, which is a bit different, though.
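To make the comparison concrete, here is a minimal sketch that counts the same words three ways: with a plain dict and a manual key check, with a defaultdict, and with setdefault. The word list and the variable names (words, plain, auto, via_setdefault) are made up for illustration and are not from Norvig's code.

```python
import collections

words = ["the", "cat", "the"]  # illustrative input, not from the original code

# 1) Plain dict: check for the key by hand before incrementing.
plain = {}
for w in words:
    if w not in plain:
        plain[w] = 1  # same starting value that lambda: 1 supplies
    plain[w] += 1

# 2) defaultdict: the factory is called only when a missing key is looked up.
auto = collections.defaultdict(lambda: 1)
for w in words:
    auto[w] += 1

# 3) dict.setdefault: takes the default value itself, not a factory.
via_setdefault = {}
for w in words:
    via_setdefault[w] = via_setdefault.setdefault(w, 1) + 1

print(plain)           # {'the': 3, 'cat': 2}
print(dict(auto))      # {'the': 3, 'cat': 2}
print(via_setdefault)  # {'the': 3, 'cat': 2}

# Looking up a missing key in a defaultdict stores the default as a side effect.
print(auto["dog"])     # 1
print("dog" in auto)   # True
print("dog" in plain)  # False
```

All three give the same counts. The defaultdict version is the shortest, but merely reading a missing key adds it to the dictionary, so a plain dict may be the safer choice when lookups of unknown keys should not change the data.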