I'm asking because I need a function that is disgustingly fast at checking whether a word is in a dictionary list. I'm considering leaving the dictionary as one large string and running a regex against it instead. This needs to be absurdly fast, so I just need a basic overview of how Python checks whether a string is in a list of strings, and whether that check is beyond-reasonably fast.
If you want a blazingly fast membership test, then a list is the wrong data structure. Take a look at the implementation of list_contains in listobject.c, line 437. It iterates over the list in order, comparing the item with each element in turn. The later the item appears in the list, the longer it takes to find, and if the item is missing, the whole list must be scanned.

Use a set instead. Sets are implemented internally as a hash table, so looking up an object involves computing its hash and then scanning a few table entries (usually just one). For the particular case of looking up a string, see set_lookkey_string in setobject.c, line 156.
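To see the difference for yourself, here is a minimal sketch that times the same membership test against a list and a set. The word list is synthetic (the names `words_list`, `words_set`, `in_list` and `in_set` are made up for illustration, not taken from your code), and the exact timings will depend on your machine and Python version.

```python
import timeit

# Hypothetical stand-in for a real dictionary: 100,000 distinct strings.
words_list = [f"word{i}" for i in range(100_000)]
words_set = set(words_list)  # built once, then reused for every lookup


def in_list(word):
    # Linear scan: compares word against each element in turn.
    return word in words_list


def in_set(word):
    # Hash lookup: hashes word, then probes a few table entries.
    return word in words_set


# Worst case for the list: the word is absent, so every element is checked.
missing = "not-a-word"

list_time = timeit.timeit(lambda: in_list(missing), number=100)
set_time = timeit.timeit(lambda: in_set(missing), number=100)

print(f"list: {list_time:.6f} s for 100 lookups")
print(f"set:  {set_time:.6f} s for 100 lookups")
```

Building the set costs one pass over the words, so do it once up front and keep it around rather than rebuilding it for each lookup.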