I’ve got a function which parses a sentence by building up a chart. But Python holds on to whatever memory was allocated during that function call. That is, I do
best = translate(sentence, grammar)
and somehow my memory goes up and stays up. Here is the function:
from string import join
from heapq import nsmallest, heappush
from collections import defaultdict
MAX_TRANSLATIONS=4 # or choose something else
def translate(f, g):
    words = f.split()
    chart = {}
    for col in range(len(words)):
        for row in reversed(range(0,col+1)):
            # get rules for this subspan
            rules = g[join(words[row:col+1], ' ')]
            # ensure there's at least one rule on the diagonal
            if not rules and row==col:
                rules=[(0.0, join(words[row:col+1]))]
            # pick up rules below & to the left
            for k in range(row,col):
                if (row,k) and (k+1,col) in chart:
                    for (w1, e1) in chart[row, k]:
                        for (w2, e2) in chart[k+1,col]:
                            heappush(rules, (w1+w2, e1+' '+e2))
            # add all rules to chart
            chart[row,col] = nsmallest(MAX_TRANSLATIONS, rules)
    (w, best) = chart[0, len(words)-1][0]
    return best
g = defaultdict(list)
g['cela'] = [(8.28, 'this'), (11.21, 'it'), (11.57, 'that'), (15.26, 'this ,')]
g['est'] = [(2.69, 'is'), (10.21, 'is ,'), (11.15, 'has'), (11.28, ', is')]
g['difficile'] = [(2.01, 'difficult'), (10.08, 'hard'), (10.19, 'difficult ,'), (10.57, 'a difficult')]
sentence = "cela est difficile"
best = translate(sentence, g)
I’m using Python 2.7 on OS X.
Within the function, you set rules to an element of grammar; rules then refers to that element, which is a list. You then add items to rules with heappush, which (as lists are mutable) means grammar holds on to the pushed values via that list. If you don’t want this to happen, use copy when assigning rules or deepcopy on the grammar at the start of translate. Note that even if you copy the list to rules, the grammar will record an empty list every time you retrieve an element for a missing key.
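A minimal sketch of that fix, under a few assumptions that are not in the original post: the lookup uses g.get(span, []) so that the defaultdict does not store an empty list for unknown spans, list(...) makes the shallow copy, and ' '.join(...) replaces string.join so the sketch also runs on Python 3. The variable names span and before are introduced here only for illustration.

```python
from heapq import nsmallest, heappush
from collections import defaultdict

MAX_TRANSLATIONS = 4  # or choose something else


def translate(f, g):
    words = f.split()
    chart = {}
    for col in range(len(words)):
        for row in reversed(range(0, col + 1)):
            span = ' '.join(words[row:col + 1])
            # .get() bypasses defaultdict's default factory, so no empty list
            # is recorded for a missing key; list() gives a private copy, so
            # later pushes no longer touch the grammar's own lists.
            rules = list(g.get(span, []))
            # ensure there's at least one rule on the diagonal
            if not rules and row == col:
                rules = [(0.0, span)]
            # pick up rules below & to the left
            for k in range(row, col):
                if (row, k) in chart and (k + 1, col) in chart:
                    for (w1, e1) in chart[row, k]:
                        for (w2, e2) in chart[k + 1, col]:
                            heappush(rules, (w1 + w2, e1 + ' ' + e2))
            # add all rules to chart
            chart[row, col] = nsmallest(MAX_TRANSLATIONS, rules)
    (w, best) = chart[0, len(words) - 1][0]
    return best


g = defaultdict(list)
g['cela'] = [(8.28, 'this'), (11.21, 'it'), (11.57, 'that'), (15.26, 'this ,')]
g['est'] = [(2.69, 'is'), (10.21, 'is ,'), (11.15, 'has'), (11.28, ', is')]
g['difficile'] = [(2.01, 'difficult'), (10.08, 'hard'),
                  (10.19, 'difficult ,'), (10.57, 'a difficult')]

# Snapshot of the grammar, used only to check that translate() leaves it alone.
before = dict((key, list(value)) for key, value in g.items())
best = translate("cela est difficile", g)
grammar_unchanged = (dict(g) == before)
```

With this version the grammar should contain the same three entries after the call as before it, so repeated calls to translate no longer grow g. A shallow copy is enough here because the rules are immutable tuples; deepcopy would only matter if the list elements themselves were mutated.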