While working through the awesome book “Programming Collective Intelligence”, by Toby Segaran, I’ve encountered some techniques in index assignments I’m not entirely familiar with.
Take this for example:
createkey='_'.join(sorted([str(wi) for wi in wordids]))
or:
normalizedscores = dict([(u,float(l)/maxscore) for (u,l) in linkscores.items()])
All the nested tuples in the indexes have me a bit confused. What is actually being assigned to these variables? I assumed obviously the .join one comes out as a string, but what about the latter? If someone could explain the mechanics of these loops I’d really appreciate it. I assume these are pretty common techniques, but being new to Python, I suppose to ask is a moment’s shame. Thanks!
[str(wi) for wi in wordids]
is a list comprehension. Assigning it to a name is the same as creating an empty list and then, in an explicit for loop over wordids, appending str(wi) to it for each wi.
So
createkey='_'.join(sorted([str(wi) for wi in wordids]))
creates a list of strings from each item in wordids, then sorts that list and joins it into one big string, using _ as a separator.
As agf rightly noted, you can also use a generator expression, which looks just like a list comprehension but with parentheses instead of brackets. This avoids constructing a list if you don’t need it later (except for iterating over it). And if you already have parentheses there, as in this case with sorted(...), you can simply remove the brackets.
However, in this special case you won’t get a performance benefit (in fact, it’ll be about 10% slower; I timed it) because sorted() needs to build a list anyway, but it looks a bit nicer:
createkey='_'.join(sorted(str(wi) for wi in wordids))
The second example,
normalizedscores = dict([(u,float(l)/maxscore) for (u,l) in linkscores.items()])
iterates through the items of the dictionary linkscores, where each item is a key/value pair. It creates a list of (key, l/maxscore) tuples and then turns that list back into a dictionary.
However, since Python 2.7, you could also use dict comprehensions:
normalizedscores = {u: float(l)/maxscore for (u,l) in linkscores.items()}
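To make the mechanics concrete, here is a minimal sketch that unrolls both expressions from the question into explicit loops and checks that they give the same results as the one-liners. The sample values for wordids and linkscores are made up for illustration (the book's real data comes from its search index), and maxscore is assumed here to be the largest link score.

```python
# Hypothetical sample inputs, chosen only for illustration.
wordids = [12, 3, 7]
linkscores = {"a.html": 2, "b.html": 4}
maxscore = max(linkscores.values())  # assumption: normalize by the largest score

# The list comprehension [str(wi) for wi in wordids], written as a loop.
strings = []
for wi in wordids:
    strings.append(str(wi))
createkey = '_'.join(sorted(strings))

# Same result as the original one-liner and the generator-expression variant.
assert createkey == '_'.join(sorted([str(wi) for wi in wordids]))
assert createkey == '_'.join(sorted(str(wi) for wi in wordids))

# dict([...]) over a list of (key, value) tuples, written as a loop.
normalizedscores = {}
for (u, l) in linkscores.items():
    normalizedscores[u] = float(l) / maxscore

# Same result as the dict(...) call and the dict comprehension (Python 2.7+).
assert normalizedscores == dict([(u, float(l) / maxscore) for (u, l) in linkscores.items()])
assert normalizedscores == {u: float(l) / maxscore for (u, l) in linkscores.items()}

print(createkey)         # 12_3_7
print(normalizedscores)
```

Note that the ids are sorted after being converted to strings, so the order is lexicographic: '12' sorts before '3'.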
Here’s some timing data:
Python 3.2.2
Python 2.7.2
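If you want to run this kind of comparison on your own interpreter, a minimal sketch with the standard timeit module could look like the following. The sample wordids and the repetition count are assumptions, not necessarily the setup used for the timings in this answer, and the numbers will vary by machine and Python version.

```python
import timeit

# Hypothetical sample data: 1000 integer word ids.
setup = "wordids = list(range(1000))"

# The two variants being compared, as source strings for timeit.
list_version = "'_'.join(sorted([str(wi) for wi in wordids]))"
gen_version = "'_'.join(sorted(str(wi) for wi in wordids))"

# Run each statement 200 times and report the total elapsed seconds.
list_time = timeit.timeit(list_version, setup=setup, number=200)
gen_time = timeit.timeit(gen_version, setup=setup, number=200)

print("list comprehension:   %.4f s" % list_time)
print("generator expression: %.4f s" % gen_time)
```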