If I try to change some values in an duplicated array, the original array gets mysteriously also affected.
import numpy as np
x = np.zeros((3, 10))
y = x
print(x)
print(y, "\n")
y[1:3, 4:8] = 1
print(x)
print(y)
The output on my system is as follows:
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 0. 0.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 0. 0.]]
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 0. 0.]
[ 0. 0. 0. 0. 1. 1. 1. 1. 0. 0.]]
I’m currently using NumPy 1.6.2 as a 64bit version compiled against the Intel MKL (from Christoph Gohlke) together with Python 3.2.3.
I also tried the 32bit “official” version but got exactly the same results…
You did not create a copy of the array. You simply created a new reference to the same array. If you want to copy the array use
numpy.copy:Note that with regular python lists you can obtain a shallow copy with the syntax
the_list[:](where[:]means create a slice that start at the beginning and ends at the end of the original list), whilenumpyslices are actually views in most cases:Versus:
In python every identifier is a reference to an object. When you do an assignment:
The assignment binds the name
yto the object referenced byx. In other wordsybecomes a copy of the reference ofx. Now the objected referenced byxhas one more reference. When an object doesn’t have anymore references it is deleted from memory.Python hasn’t got the concept of “variables” as seen in C.
An other way of thinking is that any identifier is a pointer to an object. Thus
y = xcopies the pointer and not the object itself(and this is what actually happens. The C/API always uses pointers toPyObjectstructures)This is explained in every python tutorial, hence I’d suggest you to put aside
numpyfor a bit and read a python tutorial first.I’ve decided to illustrate what happens with the bind using some Unicode art.
The idea is not mine, but of user Claudio_F from the python-it.org forum.
The idea is that in python you have identifiers, which I’ll represent as buoys. Each buoy can be connected to an object, which is represented as a rock(or whatever) under the sea.
An object can be bound to more buoys. When an object is not bound to any buoy it sinks in the sea and when it reaches the bottom of the sea a huge sea monster destroys it (a.k.a the garbage collector).
Now when you do the assignment:
This is the situation:
When you do the assignment:
Both buoys are bound to the same object:
When you do the third assignment:
The
achain is untied from the5object and bound to a new,7, object butb‘s buoy remains bound to5:I hope you enjoyed the pictures and that they made clear how it works.
(The jellyfish and the dead fish are there just because I thought they were cool)
By the way, this representation is also awesome to understand reference between objects.
For example the code:
Would be represented like this:
Note how the list has “pointers” to the objects, and not the actual objects.
Also, going back to slicing, if you do:
You obtain this scenario:
Note that both lists point to the same objects. That’s why it’s called shallow copy.
If, instead of integers, you put some mutable object, like an other list, you can clearly understand why modifying it will change the contents of both lists.