Possible Duplicate:
Python “is” operator behaves unexpectedly with integers
I stumbled upon the following Python oddity:
>>> two = 2
>>> ii = 2
>>> id(two) == id(ii)
True
>>> [id(i) for i in [42,42,42,42]]
[10084276, 10084276, 10084276, 10084276]
>>> help(id)
Help on built-in function id in module __builtin__:

id(...)
    id(object) -> integer

    Return the identity of an object.  This is guaranteed to be unique among
    simultaneously existing objects.  (Hint: it's the object's memory address.)

- Is every number a unique object?
- Are different variables holding the same elemental value (for example, two and ii) the same object?
- How is the id of a number generated by Python?
- In the above example, are two and ii pointers to a memory cell holding the value 2? That would be extremely weird.
Help me untangle this identity crisis.
Some more oddities:
>>> a,b=id(0),id(1)
>>> for i in range(2,1000):
...     a,b=b,id(i)
...     if abs(a-b) != 12:
...         print('%i:%i -> %i' % (i,a,b))
The above code checks whether the ids of consecutive integers are evenly spaced (12 apart, as they mostly are on my machine) and prints out the anomalies:
77:10083868 -> 10085840
159:10084868 -> 10086840
241:10085868 -> 10087840
257:10087660 -> 11689620
258:11689620 -> 11689512
259:11689512 -> 11689692
260:11689692 -> 11689548
261:11689548 -> 11689644
262:11689644 -> 11689572
263:11689572 -> 11689536
264:11689536 -> 11689560
265:11689560 -> 11689596
266:11689596 -> 11689656
267:11689656 -> 11689608
268:11689608 -> 11689500
331:11688756 -> 13807288
413:13806316 -> 13814224
495:13813252 -> 13815224
577:13814252 -> 13816224
659:13815252 -> 13817224
741:13816252 -> 13818224
823:13817252 -> 13819224
905:13818252 -> 13820224
987:13819252 -> 13821224
Note that a pattern emerges from 413 onwards: an anomaly shows up every 82 integers. Maybe it’s due to some voodoo accounting at the beginning of each new memory page.
Your fourth question, “in the above example, are two and ii pointers to a memory cell holding the value 2? that would be extremely weird”, is really the key to understanding the whole thing.
If you’re familiar with languages like C, Python “variables” don’t really work the same way. A C variable declaration like:
    int j = 1;
    int k = 2;
    k += j;
says, “compiler, reserve for me two areas of memory, on the stack, each with enough space to hold an integer, and remember one as ‘j’ and the other as ‘k’. Then fill j with the value ‘1’ and k with the value ‘2’.” At runtime, the code says “take the integer contents of k, add the integer contents of j, and store the result back to k.”
The seemingly equivalent code in Python:
    j = 1
    k = 2
    k += j
says something different: “Python, look up the object known as ‘1’, and create a label called ‘j’ that points to it. Look up the object known as ‘2’, and create a label called ‘k’ that points to it. Now look up the object ‘k’ points to (‘2’), look up the object ‘j’ points to (‘1’), and point ‘k’ to the object resulting from performing the ‘add’ operation on the two.”
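You can watch the relabelling happen with id(). The following is only a small sketch; the name alias is made up for the illustration and is not part of the original example. It keeps a second label on the object that k starts out pointing to, then checks what each label refers to after the addition.

```python
j = 1
k = 2
alias = k            # a second label on the same object that k points to
id_before = id(k)

k += j               # ints are immutable, so k is re-pointed at the result object
id_after = id(k)

print(alias, k)                  # alias still names 2; k now names 3
print(id_before == id(alias))    # the original object was never modified
print(id_before != id_after)     # k now refers to a different object
```

If k were a container that got filled with new data, alias would either change along with it or be a separate copy with its own id. Neither happens: only the label k moved.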
Disassembling this code (with the dis module) shows this nicely:
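A sketch of one way to produce such a listing on Python 3: compile the three-line snippet (j = 1, k = 2, k += j) from a string and hand it to dis. The filename "<example>" and the variable names source, code and stores are just placeholders for this illustration. The exact opcodes differ between CPython versions, so the sketch also pulls out only the name-binding steps.

```python
import dis

# The snippet described above, compiled as module-level code.
source = "j = 1\nk = 2\nk += j\n"
code = compile(source, "<example>", "exec")

# Print the full listing; the exact opcodes vary between CPython versions.
dis.dis(code)

# Collect only the steps that attach a name to an object, in order.
stores = [ins.argval for ins in dis.get_instructions(code)
          if ins.opname == "STORE_NAME"]
print(stores)
```

In the listing, each assignment should end in a store to a name, and the k += j line should load both names, combine the two objects, and then store to k again. The only thing ever written to is the name, not the integer object.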
So yes, Python “variables” are labels that point to objects, rather than containers that can be filled with data.
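The other three questions are all variations on “when does Python create a new object from a piece of code, and when does it reuse one it already has?”. Reusing an existing object is called “interning”; in CPython it happens to small integers and to strings that look (to Python) like they might be symbol names. It is an implementation detail, so compare values with == rather than relying on two equal values having the same id.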
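A rough way to see the reuse for yourself, assuming CPython 3 (other implementations may behave differently, and the values 42 and 100000 are picked arbitrarily for the illustration). The values are built at run time with int() and join() so that the compiler cannot simply share one literal constant between them.

```python
import sys

# Build values at run time so the compiler cannot share literal constants.
small_a, small_b = int("42"), int("42")
big_a, big_b = int("100000"), int("100000")

print(small_a is small_b)  # CPython reuses one cached object for small ints
print(big_a is big_b)      # larger ints are usually separate objects
print(big_a == big_b)      # equality compares values, not identity

# Strings built at run time are separate objects unless explicitly interned.
s1 = "".join(["identity", " ", "crisis"])
s2 = "".join(["identity", " ", "crisis"])
print(s1 is s2)
print(sys.intern(s1) is sys.intern(s2))  # interning maps equal strings to one object
```

sys.intern is the explicit way to ask for one shared object per distinct string. The small-integer reuse is automatic and would explain why the ids in the question stop being evenly spaced after 256.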