I have data like
[2, 2, 2, 2, 2, 3, 13, 113]
which I then want to sort into separate lists by keys generated by myself. In fact I want to generate all possible lists.
Some examples:
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [0, 0, 1, 2, 1, 3, 3, 1]
sublists: [2, 2], [2, 2, 113], [2], [3, 13]
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [0, 1, 0, 0, 0, 1, 1, 0]
sublists: [2, 2, 2, 2, 113], [2, 3, 13]
values: [2, 2, 2, 2, 2, 3, 13, 113]
keys: [2, 3, 0, 0, 4, 4, 1, 3]
sublists: [2, 2], [13], [2], [2, 113], [2, 3]
All possible keys are generated by
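import itertools

def generate_keys(prime_factors):
    key_size = len(prime_factors) - 1
    key_values = [str(i) for i in range(key_size)]
    # Note: combinations_with_replacement yields only non-decreasing tuples,
    # e.g. ('0', '0', '1') but never ('1', '0', '0').
    return list(itertools.combinations_with_replacement(
        key_values, len(prime_factors)))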
Then I thought I could use the keys to shift the values into the sublists. That’s the part I’m stuck on. I thought itertools.groupby would be my solution but upon further investigation I see no way to use my custom lists as keys for groupby.
How do I split my big list into smaller sublists using these keys? There may even be a way to do this without using keys. Either way, I don’t know how to do it, and the other Stack Overflow questions I’ve looked at have been in the ballpark but not exactly this question.
This does what you want:
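The code is not shown at this point, so here is a minimal sketch of the approach the explanation describes: a defaultdict of lists filled from zip(keys, values), then read back in sorted key order. The function name split_by_keys is a placeholder chosen for illustration and may not match the original.

```python
import collections


def split_by_keys(values, keys):
    # Hypothetical name: group each value under the key at the same position.
    answer = collections.defaultdict(list)
    for k, v in zip(keys, values):
        answer[k].append(v)
    # Return the groups ordered by their key.
    return [answer[k] for k in sorted(answer)]


values = [2, 2, 2, 2, 2, 3, 13, 113]
keys = [0, 0, 1, 2, 1, 3, 3, 1]
print(split_by_keys(values, keys))
# [[2, 2], [2, 2, 113], [2], [3, 13]]
```

Run against the first example from the question, this gives the sublists listed there.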
Explanation:
collections.defaultdict is a handy dict-like class that lets you define what should happen when a key doesn’t exist in the dictionary that you’re trying to manipulate. For example, in my code, I have answer[k].append(v). We know that append is a list method, so we know that answer[k] should be a list. However, if I were using a conventional dict and tried to append to the value of a non-existent key, I would have gotten a KeyError.

Appending to a missing key was only made possible because I defined answer = collections.defaultdict(list). If I had defined answer = collections.defaultdict(int), I would have gotten a different error, one that would tell me that int objects don’t have an append method.

zip, on the other hand, takes two lists (well, actually, it takes any number of iterables), let’s call them list1 and list2, and produces tuples in which the ith tuple contains two objects. The first is list1[i] and the second is list2[i]. In Python 2 the result is a list; in Python 3 it is an iterator, so wrap it in list() if you need a list. If list1 and list2 are of unequal length, zip stops at the shorter one, so len(list(zip(list1, list2))) is min(len(list1), len(list2)).

Once I’ve zipped keys and values, I want to create a dict that maps a value from keys to a list of values from values. This is why I used a defaultdict, so that I wouldn’t have to check for the existence of a key in it before I appended to its value. If I had used a conventional dict, I would have had to check whether each key was already present and create an empty list for it before appending.

Now that you have a dict (or a dict-like object) that maps values from keys to lists of ints that share the same key, all you need to do is get the lists which are the values of answer in sorted order, sorted by the keys of answer. sorted(answer) gives me a list of all of answer’s keys in sorted order.

Once I have this list of sorted keys, all I have to do is get their values, which are lists of ints, and put all those lists into one big list and return that big list.
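As a rough illustration of the points in the explanation, the sketch below shows the KeyError from a plain dict, the existence check a plain dict would need, and zip stopping at the shorter input. The names group_with_plain_dict, list1 and list2 are made up for this example.

```python
# A plain dict raises KeyError when you append to a key that is not there yet.
plain = {}
try:
    plain[0].append(2)
except KeyError as exc:
    print("KeyError:", exc)  # KeyError: 0


def group_with_plain_dict(values, keys):
    # Illustrative helper: same grouping, but with an explicit existence check.
    answer = {}
    for k, v in zip(keys, values):
        if k not in answer:
            answer[k] = []
        answer[k].append(v)
    # sorted(answer) iterates over the keys in ascending order.
    return [answer[k] for k in sorted(answer)]


print(group_with_plain_dict([2, 2, 3, 13], [1, 0, 1, 0]))
# [[2, 13], [2, 3]]

# zip stops as soon as the shorter input runs out.
list1 = [1, 2, 3]
list2 = ["a", "b"]
print(list(zip(list1, list2)))
# [(1, 'a'), (2, 'b')]
```

The explicit check is what defaultdict(list) saves you from writing; dict.setdefault(k, []).append(v) is another way to get the same effect with a plain dict.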
… annnnnd Done! Hope that helps