In Python one can very easily check whether a value is contained in a container by using the in operator. I was wondering why anyone would ever use the in operator on a list, though, when it’s much more efficient to first transform the list to a set. That is, why write:
if x in [1,2,3]:
as opposed to
if x in set([1,2,3]):
Looking at the time complexity, the first one is O(n) while the second one is superior at O(1). Is the only reason to use the first one that it’s more readable and shorter to write? Or is there a special case in which it’s more practical to use? Why did the Python devs not implement the first one by first translating it to the second one? Would this not grant both of them O(1) complexity?
if x in set([1,2,3]):
is not faster than
if x in [1,2,3]:
Converting a list to a set requires iterating over the list, and is thus at least O(n) time.* In practice it takes a lot longer than searching for an item, since it involves hashing and then inserting every item.
Using a set is efficient when the list is converted once and the resulting set is then checked multiple times. Indeed, trying this by searching for 500 in the list range(1000) indicates that the tradeoff occurs once you are checking at least 3 times.
Tests with list sizes ranging from 500 to 50000 give roughly the same result.
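A comparison along these lines can be sketched with timeit. This is not the original benchmark, only a minimal reconstruction under assumptions: the helper names search_list, search_set and compare are made up for illustration, and the break-even point it reports will vary with machine and Python version.

```python
import timeit

# Assumed setup mirroring the text: look for 500 in a 1000-element list.
data = list(range(1000))
target = 500


def search_list(checks):
    # Scan the list directly for every membership test: O(n) per check.
    return [target in data for _ in range(checks)]


def search_set(checks):
    # Pay the O(n) conversion once, then do hashed lookups.
    lookup = set(data)
    return [target in lookup for _ in range(checks)]


def compare(max_checks=5, repeats=2000):
    # Time both strategies for 1..max_checks membership tests.
    results = {}
    for checks in range(1, max_checks + 1):
        t_list = timeit.timeit(lambda: search_list(checks), number=repeats)
        t_set = timeit.timeit(lambda: search_set(checks), number=repeats)
        results[checks] = (t_list, t_set)
    return results


if __name__ == "__main__":
    for checks, (t_list, t_set) in compare().items():
        print(f"{checks} checks: list {t_list:.4f}s, set {t_set:.4f}s")
```

The number of checks at which the set column drops below the list column is the tradeoff point on your machine.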
* Indeed, in the strict worst-case sense, inserting into a hash table (and, for that matter, checking a value) is not O(1) time: when many keys collide, it degrades to linear O(n) time. That would make the set([1,2,3]) operation O(n^2) in the worst case rather than O(n). However, in practice, with reasonably sized lists and a good implementation, you can basically always assume insertion and lookup in a hash table to be O(1) operations.
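To see the collision effect from the footnote, one can force every key into the same hash bucket and count equality comparisons while building a set. This is a deliberately contrived sketch: CollidingKey and count_comparisons are hypothetical names, and no real key type hashes this badly.

```python
class CollidingKey:
    """Hypothetical key type whose instances all share one hash value."""

    eq_calls = 0  # class-wide counter of equality comparisons

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        # Every instance collides with every other instance.
        return 0

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value


def count_comparisons(n):
    # Build a set of n distinct but colliding keys and report how many
    # equality comparisons the construction needed.
    CollidingKey.eq_calls = 0
    keys = [CollidingKey(i) for i in range(n)]
    set(keys)
    return CollidingKey.eq_calls


if __name__ == "__main__":
    for n in (100, 200, 400):
        print(n, count_comparisons(n))
```

With well-distributed hashes such as those of small integers, almost no comparisons are needed, which is why the O(1) assumption holds in practice.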