I’m writing an application where Tags are linkable and there’s a need to retrieve the entire chain of linked Tags. Self-reference is not allowed. Running the following code ends up with some very strange results:
class Tag(object):
    def __init__(self, name):
        self.name = name
        self.links = []

    def __repr__(self):
        return "<Tag {0}>".format(self.name)

    def link(self, tag):
        self.links.append(tag)

def tag_chain(tag, known=[]):
    chain = []
    if tag not in known:
        known.append(tag)
    print "Known: {0}".format(known)
    for link in tag.links:
        if link in known:
            continue
        else:
            known.append(link)
            chain.append(link)
            chain.extend(tag_chain(link, known))
    return chain

a = Tag("a")
b = Tag("b")
c = Tag("c")
a.link(b)
b.link(c)
c.link(a)

o = tag_chain(a)
print "Result:", o
print "------------------"
o = tag_chain(a)
print "Result:", o
Results:
Known: [<Tag a>]
Known: [<Tag a>, <Tag b>]
Known: [<Tag a>, <Tag b>, <Tag c>]
Result: [<Tag b>, <Tag c>]
------------------
Known: [<Tag a>, <Tag b>, <Tag c>]
Result: []
So, somehow, I’ve accidentally created a closure. As far as I can see, known should have gone out of scope and died off once the function call completed.
If I change the definition of tag_chain() to not set a default value, and pass a new list explicitly on each call, the problem goes away:
...
def tag_chain(tag, known):
...
o = tag_chain(a, [])
print "Result:", o
print "------------------"
o = tag_chain(a, [])
print "Result:", o
Why is this?
This is a common mistake in Python:
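known=[] doesn’t mean "if known is not supplied, make it an empty list". The default value is evaluated only once, when the def statement is executed, so known is bound to a single "anonymous" list that is stored with the function. Each time known falls back to its default, it is that same list, including everything earlier calls appended to it. That is why the second call to tag_chain(a) starts with all three tags already known and returns an empty chain.

The typical pattern to do what you intended is to use None as the default and create a new list inside the function when known is None, which correctly initializes known to an empty list if it is not provided.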
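As a minimal sketch of that pattern applied to the code from the question (the Tag class is copied from the question, the print statement is dropped, and the names first and second are only for illustration):

```python
class Tag(object):
    def __init__(self, name):
        self.name = name
        self.links = []

    def __repr__(self):
        return "<Tag {0}>".format(self.name)

    def link(self, tag):
        self.links.append(tag)


def tag_chain(tag, known=None):
    # None is only a sentinel; a fresh list is created on every call
    # that does not pass known explicitly.
    if known is None:
        known = []
    chain = []
    if tag not in known:
        known.append(tag)
    for link in tag.links:
        if link in known:
            continue
        known.append(link)
        chain.append(link)
        # The recursive call passes known along, so the list is shared
        # within one traversal but not between separate top-level calls.
        chain.extend(tag_chain(link, known))
    return chain


a, b, c = Tag("a"), Tag("b"), Tag("c")
a.link(b)
b.link(c)
c.link(a)

first = tag_chain(a)   # [<Tag b>, <Tag c>]
second = tag_chain(a)  # [<Tag b>, <Tag c>] again, nothing carried over
```

With the None default, both calls return the same chain, because nothing mutable is stored in the function's defaults anymore.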