Possible Duplicate:
Flatten (an irregular) list of lists in Python
I’m trying to use the NLTK library in Python, and more specifically the WordNet corpus, to extract all the words in a broad semantic category like ‘animal’. I’ve managed to write a function that goes down through all the categories and extracts the words in them, but what I end up with is a huge jumble of lists within lists. The lists aren’t of any predictable length or depth; they look like this:
['pet', 'pest', 'mate', 'young', 'stunt', 'giant', ['hen', 'dam', 'filly'], ['head', 'stray', 'dog', ['puppy', 'toy', 'spitz', 'pooch', 'doggy', 'cur', 'mutt', 'pug', 'corgi', ['Peke'], ['chow'], ['feist', 'fice'], ['hound', ['Lhasa', 'cairn']], ['boxer', 'husky']], ['tabby', 'tabby', 'queen', 'Manx', 'tom', 'kitty', 'puss', 'pussy', ['gib']]]]
What I want is to be able to grab each of those strings out of that nested structure and return a single, unnested list. Any advice?
In general, when you have to deal with arbitrary levels of nesting, a recursive solution is a good fit: lists within lists, parsing HTML (tags within tags), working with filesystems (directories within directories), and so on.
I haven’t tested this code extensively, but I believe it should do what you want:
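A minimal sketch of such a recursive function, assuming the nested data contains only lists and strings. The name `flatten` and the shortened `words` sample are illustrative, not taken from NLTK:

```python
def flatten(nested):
    """Recursively collect every non-list item from a nested list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            # A sublist: flatten it first, then append its items in order.
            flat.extend(flatten(item))
        else:
            # A plain item (here, a string): keep it as-is.
            flat.append(item)
    return flat


# Shortened version of the list from the question (illustrative data).
words = ['pet', ['hen', 'dam'], ['dog', ['puppy', ['Peke'], ['hound', ['Lhasa', 'cairn']]]]]
print(flatten(words))
# ['pet', 'hen', 'dam', 'dog', 'puppy', 'Peke', 'hound', 'Lhasa', 'cairn']
```

The check is `isinstance(item, list)` rather than a general "is it iterable?" test, because strings are iterable too and would otherwise be split into characters.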
In general, recursion is very easy to think about and the solutions tend to be very elegant (like above), but for really, really deeply nested things (think thousands of levels deep) you can run into problems like stack overflow. In Python this shows up as a RecursionError once the recursion limit (1000 frames by default) is exceeded.
Generally this isn’t a problem, but I believe a recursive function can always be converted to a loop, typically by keeping track of the pending work on an explicit stack (it just doesn’t look as nice).
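As a rough sketch of that conversion, the flattening can be written as a loop that keeps its own stack of iterators instead of relying on the call stack. The name `flatten_iterative` is hypothetical, and the function again assumes only lists and strings:

```python
def flatten_iterative(nested):
    """Flatten a nested list with a loop and an explicit stack."""
    flat = []
    # One iterator per nesting level currently being walked.
    stack = [iter(nested)]
    while stack:
        for item in stack[-1]:
            if isinstance(item, list):
                # Descend: walk the sublist next, then resume this level.
                stack.append(iter(item))
                break
            flat.append(item)
        else:
            # The current level is exhausted: go back up one level.
            stack.pop()
    return flat


# Build a list nested 5000 levels deep, well past the default recursion limit.
deep = ['leaf']
for _ in range(5000):
    deep = [deep]

print(flatten_iterative(deep))
# ['leaf']
```

Because each iterator remembers its position, the order of the items is preserved, and the nesting depth is limited by available memory rather than by the recursion limit.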