On this page, I see something interesting:
Note that there is a fast-path for dicts that (in practice) only deal with str keys; this doesn’t affect the algorithmic complexity, but it can significantly affect the constant factors: how quickly a typical program finishes.
So what exactly does it mean?
Does it mean that using strings as keys is always faster?
If yes, why?
Update:
Thanks for the suggestions about optimization! But I’m actually more interested in the plain truth than in whether or when we should optimize.
Update 2:
Thanks for the great answers! I’ll quote the content from the link provided by @DaveWebb here:
“
…
ma_lookup is initially set to the lookdict_string function (renamed to lookdict_unicode in 3.0), which assumes that both the keys in the dictionary and the key being searched for are standard PyStringObject’s. It is then able to make a couple of optimizations, such as mitigating various error checks, since string-to-string comparison never raises exceptions. There is also no need for rich object comparisons either, which means we avoid calling PyObject_RichCompareBool, and always use _PyString_Eq directly.
…
”
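To make the quoted mechanism a bit more concrete, here is a toy sketch in pure Python. It is not CPython's implementation: the class ToyDict, its method names, the fixed table size, and the linear probing are all assumptions made for illustration. The only thing it tries to show is the idea of a lookup function that starts out as a str-only fast path and is swapped for a generic one the first time a non-str key shows up.

```python
import operator


class ToyDict:
    """Hypothetical toy table illustrating a swappable lookup function."""

    def __init__(self, size=8):
        self._slots = [None] * size      # each slot: None or (hash, key, value)
        self._lookup = self._lookup_str  # start on the str-only fast path

    def _find(self, key, h, same):
        # Linear probing over a fixed-size table (a simplification).
        n = len(self._slots)
        for i in range(n):
            idx = (h + i) % n
            slot = self._slots[idx]
            if slot is None:
                return idx  # empty slot: key is absent, usable for insertion
            if slot[0] == h and (slot[1] is key or same(slot[1], key)):
                return idx  # slot that already holds this key
        raise RuntimeError("toy table is full")

    def _lookup_str(self, key, h):
        if type(key) is not str:
            # First non-str key seen: switch to the generic path for good.
            self._lookup = self._lookup_generic
            return self._lookup_generic(key, h)
        # Every stored key is an exact str here, so a plain string
        # comparison is enough and cannot run user-defined code.
        return self._find(key, h, str.__eq__)

    def _lookup_generic(self, key, h):
        # Generic equality may call an arbitrary __eq__, which can raise.
        return self._find(key, h, operator.eq)

    def __setitem__(self, key, value):
        h = hash(key)
        self._slots[self._lookup(key, h)] = (h, key, value)

    def __getitem__(self, key):
        h = hash(key)
        slot = self._slots[self._lookup(key, h)]
        if slot is None:
            raise KeyError(key)
        return slot[2]
```

In this sketch the switch is one-way: once a non-str key has been seen, the table stays on the generic path even for later str lookups.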
Also, for the experiment numbers, I think the size of the difference would be even bigger if there were no int-to-string conversion.
The C code that underlies the Python dict is optimised for string keys. You can read about this here (and in the book the blog refers to).
If the Python runtime knows your dict contains only string keys, it can do things such as not catering for errors that cannot happen in a string-to-string comparison and ignoring the rich comparison operators. This makes the common case of the string-key-only dict a little faster. (Update: timing shows it to be more than a little.)

However, it is unlikely that this would make a significant change to the run time of most Python programs. Only worry about this optimisation if you have measured and found dict lookups to be a bottleneck in your code. As the famous quote says, “Premature optimization is the root of all evil.”

The only way to see how much faster things really are is to time them:
So using string keys is about 30% faster even compared to int keys, and I have to admit I was surprised at the size of the difference.
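A measurement of this kind can be sketched as follows. This is not the original benchmark: the names N, lookup_all and best_time are made up for the example, and the keys are built up front so that no int-to-string conversion ends up inside the timed region. Results depend heavily on the interpreter version and the machine, so no particular ratio is claimed.

```python
import timeit

N = 1000

# Build the keys before timing so only the lookups are measured.
int_keys = list(range(N))
str_keys = [str(i) for i in int_keys]

int_dict = dict.fromkeys(int_keys, 0)
str_dict = dict.fromkeys(str_keys, 0)


def lookup_all(d, keys):
    """Look up every key once and discard the result."""
    for k in keys:
        d[k]


def best_time(d, keys, repeat=5, number=200):
    """Best of `repeat` runs, each timing `number` full passes, in seconds."""
    return min(
        timeit.repeat(lambda: lookup_all(d, keys), repeat=repeat, number=number)
    )


if __name__ == "__main__":
    t_int = best_time(int_dict, int_keys)
    t_str = best_time(str_dict, str_keys)
    print(f"int keys: {t_int:.4f} s")
    print(f"str keys: {t_str:.4f} s")
```

Taking the minimum over several repeats is the usual way to reduce noise from other processes; run it on the interpreter you actually care about before drawing conclusions.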