In Python (2.7.2), why does
import dis
dis.dis("i in (2, 3)")
work as expected whereas
import dis
dis.dis("i in [2, 3]")
raises:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dis.py", line 45, in dis
    disassemble_string(x)
  File "/usr/lib/python2.7/dis.py", line 112, in disassemble_string
    labels = findlabels(code)
  File "/usr/lib/python2.7/dis.py", line 166, in findlabels
    oparg = ord(code[i]) + ord(code[i+1])*256
IndexError: string index out of range
Note that this doesn’t affect Python 3.
Short Answer
In Python 2.x, the str type holds raw bytes, so dis assumes that if you pass it a string it is getting compiled bytecode. It tries to disassemble the string you pass it as bytecode and, purely due to the implementation details of Python bytecode, succeeds for i in (2, 3). Obviously, though, it returns gibberish.
In Python 3.x, the str type is for strings and the bytes type is for raw bytes, so dis can distinguish between compiled bytecode and strings, and assumes it is getting source code if it gets a string.
Long Answer
Here’s the thought process I followed to work this one out.
I tried it on my Python (3.2):
Obviously, this works.
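The Python 3 run can be reproduced with a minimal sketch like the one below. It assumes a reasonably recent Python 3 (the file argument to dis.dis is used only to capture the listing in a string), and the exact opcode names and offsets will differ between interpreter versions.

```python
import dis
import io

# In Python 3, a str argument is treated as source code:
# dis compiles it first and then disassembles the result.
buffer = io.StringIO()
dis.dis("i in (2, 3)", file=buffer)

listing = buffer.getvalue()
print(listing)
```

The printed listing should follow the "load i, load the constant, test membership, return" shape, whatever the concrete opcode names are on the interpreter in use.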
I tried it on Python 2.7:
Aha! Notice also that the generated bytecode in Python 3.2 is what you would expect (“load i, load (2, 3), test for membership, return the result”) whereas what you have got in Python 2.7 is gibberish. Clearly, dis is decompiling the string as bytecode in 2.7 but compiling it as Python in 3.2.
I had a look in the source code for dis.dis. Here are the key points:
Python 2.7:
Python 3.2:
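The type-based dispatch that makes the difference can be paraphrased as a small sketch. This is not the standard library source: classify_dis_argument is a hypothetical helper, written for Python 3, that only mirrors the checks described in this answer.

```python
def classify_dis_argument(x):
    """Hypothetical helper: describe how a Python 3 style dis would treat x."""
    if hasattr(x, "co_code"):
        # Already a code object, so it can be disassembled directly.
        return "code object"
    if isinstance(x, (bytes, bytearray)):
        # Raw bytes are taken to be compiled bytecode.
        return "raw bytecode"
    if isinstance(x, str):
        # Text is taken to be source code and compiled first.
        return "source code"
    raise TypeError("don't know how to disassemble %s objects" % type(x).__name__)


print(classify_dis_argument("i in [2, 3]"))
print(classify_dis_argument(b"d\x00S\x00"))
print(classify_dis_argument(compile("i in [2, 3]", "<example>", "eval")))
```

In Python 2.7, str is itself the raw-bytes type, so there is no separate text branch to take and a string goes straight to bytecode disassembly.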
Just for fun, let’s check this by passing the same bytes to dis in Python 3:
Aha! Gibberish! (Though note that it’s slightly different gibberish, since the bytecode has changed with the Python version!)
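One way to sidestep the ambiguity is to compile the source explicitly and hand dis a code object instead of a string, since a code object is never mistaken for raw bytecode. A minimal sketch, assuming the expression is compiled in "eval" mode under a made-up filename ("<example>"); disassemble_source is a hypothetical helper name:

```python
import dis


def disassemble_source(source):
    """Compile an expression and disassemble the resulting code object."""
    # "<example>" is a placeholder filename, used only in tracebacks.
    code = compile(source, "<example>", "eval")
    dis.dis(code)
    return code


code = disassemble_source("i in [2, 3]")
```

Because the argument is a code object rather than a str, dis does not have to guess whether it was given source text or bytecode.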