I’m reading in a file and pulling in data that includes some strings and some numbers, in Python. I’m storing this information as lists of lists, like this:
dataList = [
['blah', 2, 3, 4],
['blahs', 6, 7, 8],
['blaher', 10, 11, 12],
]
I want to keep dataList sorted by the second element of each sublist, i.e. dataList[i][1].
I thought I could use insort or bisect from the bisect module when adding each item, but I cannot figure out how to make it compare on the second element of the sublist.
Any thoughts here? Until now I was just appending data to the end and then doing a linear search to find things later on. But with a few tens of thousands of sublists and around 100k lookups, that takes a while.
This sorts the list in place, by the second element in each item.
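A minimal sketch of such an in-place sort, assuming the key is the element at index 1 of each sublist. `operator.itemgetter(1)` is one way to express that key; a `lambda row: row[1]` would work equally well. The sample rows are deliberately out of order for illustration.

```python
from operator import itemgetter

# Sample rows, shuffled so the sort has something to do.
dataList = [
    ['blaher', 10, 11, 12],
    ['blah', 2, 3, 4],
    ['blahs', 6, 7, 8],
]

# Sort in place by the element at index 1 of each sublist.
dataList.sort(key=itemgetter(1))
```

After this call the rows are ordered by their second element, so `dataList[0]` is the `'blah'` row.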
As has been pointed out in the comments, it is much more efficient to sort just once, at the end, than to keep the list sorted on every insert. Python’s built-in sort method has been heavily optimised. In my tests, the built-in sort was consistently around 3.7 times faster than the heap method suggested in the other answer, across various list sizes (I tested up to 600000 items).
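Once the list has been sorted, lookups no longer need a linear scan. One possible approach, sketched here under the assumption that you search by the same second element, is to keep a parallel list of keys and binary-search it with `bisect.bisect_left`. The helper names `build_index` and `find_row` are made up for this example.

```python
import bisect

def build_index(rows):
    """Hypothetical helper: sort rows in place by index 1, return the parallel key list."""
    rows.sort(key=lambda row: row[1])
    return [row[1] for row in rows]

def find_row(rows, keys, target):
    """Hypothetical helper: return the first row whose key equals target, else None."""
    i = bisect.bisect_left(keys, target)  # binary search, O(log n)
    if i < len(keys) and keys[i] == target:
        return rows[i]
    return None

data = [['blaher', 10, 11, 12], ['blah', 2, 3, 4], ['blahs', 6, 7, 8]]
keys = build_index(data)           # sort once, after loading everything
match = find_row(data, keys, 6)    # the 'blahs' row
missing = find_row(data, keys, 5)  # None, since no row has 5 at index 1
```

The key list has to be rebuilt (or updated in step) whenever rows are added, which is another reason to load everything first and sort once.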