I am new to Python, and don’t understand what .dtype does.
For example:
>>> aa
array([1, 2, 3, 4, 5, 6, 7, 8])
>>> aa.dtype = "float64"
>>> aa
array([ 4.24399158e-314, 8.48798317e-314, 1.27319747e-313,
1.69759663e-313])
I thought dtype was a property of aa, which should be int, and that if I assigned aa.dtype = "float64",
then aa would become array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]).
Why does it change the values and the size?
What does it mean?
I was actually learning from a piece of code, which I’ll paste here:
def to_1d(array):
    """prepares an array into a 1d real vector"""
    a = array.copy()  # copy the array, to avoid changing global
    orig_dtype = a.dtype
    a.dtype = "float64"  # this doubles the size of array
    orig_shape = a.shape
    return a.ravel(), (orig_dtype, orig_shape)  # flatten and return
I think it shouldn’t change the values of the input array, only its size. I’m confused about how the function works.
First off, the code you’re learning from is flawed. It almost certainly doesn’t do what the original author thought it did based on the comments in the code.
What the author probably meant was this:
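A plausible reconstruction, assuming the intent was simply to convert the values to floats and flatten the result (the use of astype is inferred from the rest of this answer, not taken from the original code):

```python
import numpy as np

def to_1d(array):
    """Prepare an array as a 1-D real (float64) vector."""
    # astype returns a converted copy, so the caller's array is left untouched.
    return array.astype(np.float64).ravel()

a = np.array([[1, 2], [3, 4]])
print(to_1d(a).tolist())   # [1.0, 2.0, 3.0, 4.0]
```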
However, if array is always going to be an array of complex numbers, then the original code makes some sense.
The only cases where viewing the array (a.dtype = 'float64' is equivalent to doing a = a.view('float64')) would double its size are when it’s a complex array (numpy.complex128) or a 128-bit floating point array. For any other dtype, it doesn’t make much sense.
For the specific case of a complex array, the original code would convert something like np.array([0.5+1j, 9.0+1.33j]) into np.array([0.5, 1.0, 9.0, 1.33]).
A cleaner way to write that would be:
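One possible version (the function name is made up for illustration, and the input is assumed to be a complex128 array):

```python
import numpy as np

def complex_to_interleaved_real(array):
    """Turn a complex128 array into an interleaved 1-D float64 vector."""
    # copy() protects the caller's array; view() then reinterprets each
    # complex128 item as two float64 items (real, imag) without converting.
    return array.copy().view(np.float64).ravel()

z = np.array([0.5 + 1j, 9.0 + 1.33j])
print(complex_to_interleaved_real(z).tolist())   # [0.5, 1.0, 9.0, 1.33]
```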
(I’m ignoring the part about returning the original dtype and shape, for the moment.)
Background on numpy arrays
To explain what’s going on here, you need to understand a bit about what numpy arrays are.
A numpy array consists of a “raw” memory buffer that is interpreted as an array through “views”. You can think of all numpy arrays as views.
Views, in the numpy sense, are just a different way of slicing and dicing the same memory buffer without making a copy.
A view has a shape, a data type (dtype), an offset, and strides. Where possible, indexing/reshaping operations on a numpy array will just return a view of the original memory buffer.
This means that things like y = x.T or y = x[::2] don’t use any extra memory, and don’t make copies of x.
So, if we have an array similar to this:
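For instance (the name x and the values 1 through 10 are assumptions chosen for illustration):

```python
import numpy as np

# A small 1-D integer array to experiment with.
x = np.arange(1, 11)
print(x.shape)   # (10,)
```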
We could reshape it by doing either:
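A sketch of the method-call form, assuming a ten-element array named x and a target shape of (2, 5), both picked for illustration:

```python
import numpy as np

x = np.arange(1, 11)
# reshape returns a new array object that views the same memory buffer.
x = x.reshape((2, 5))
print(x.shape)   # (2, 5)
```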
or
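A sketch of the in-place form, assuming a ten-element array named x and a target shape of (2, 5), both picked for illustration. The NumPy documentation discourages assigning to shape, so treat this as a demonstration rather than a recommendation:

```python
import numpy as np

x = np.arange(1, 11)
# Assigning to .shape changes the shape in place; no new array object is made.
x.shape = (2, 5)
print(x.shape)   # (2, 5)
```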
For readability, the first option is better. They’re (almost) exactly equivalent, though. Neither one makes a copy that uses more memory (the first results in a new Python object, but that’s beside the point at the moment).
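The no-copy claim can be checked with np.shares_memory. A small sketch, with the array and variable names chosen only for illustration:

```python
import numpy as np

x = np.arange(1, 11)
reshaped = x.reshape((2, 5))   # new array object, same buffer
every_other = x[::2]           # strided view of the same buffer
transposed = reshaped.T        # transposed view of the same buffer

for view in (reshaped, every_other, transposed):
    print(np.shares_memory(x, view))   # True each time

# Writing through a view changes the original, because the buffer is shared.
reshaped[0, 0] = 99
print(x[0])   # 99
```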
Dtypes and views
The same thing applies to the dtype. We can view an array as a different dtype by either setting x.dtype or by calling x.view(...).
So we can do things like this:
Which yields:
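A sketch of such an example, with the printed values shown as comments. The array contents and the explicit int64 dtype are assumptions, and the byte order shown holds on little-endian machines (which covers most desktop CPUs):

```python
import numpy as np

x = np.array([1, 2, 3], dtype=np.int64)

# Viewed as unsigned 8-bit integers: 3 items * 8 bytes each = 24 items.
y = x.view(np.uint8)
print(y.tolist())
# [1, 0, 0, 0, 0, 0, 0, 0,
#  2, 0, 0, 0, 0, 0, 0, 0,
#  3, 0, 0, 0, 0, 0, 0, 0]

# The same reinterpretation, done in place by setting the dtype.
x.dtype = np.uint8
print(x.shape)      # (24,)

# Setting the dtype back recovers the original values.
x.dtype = np.int64
print(x.tolist())   # [1, 2, 3]
```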
Keep in mind, though, that this is giving you low-level control over the way that the memory buffer is interpreted.
For example:
This yields:
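A sketch of what can go wrong, assuming a small int64 array (the values are illustrative):

```python
import numpy as np

x = np.arange(4, dtype=np.int64)
y = x.view(np.float64)   # same bits, now read as IEEE 754 doubles

print(x.tolist())   # [0, 1, 2, 3]
print(y[1])         # 5e-324, the smallest positive double, not 1.0
print(y[0] == 0.0, y[3] < 1e-300)   # True True
```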
So, we’re interpreting the underlying bits of the original memory buffer as floats, in this case.
If we wanted to make a new copy with the ints recast as floats, we’d use x.astype(float) (or, equivalently, x.astype(np.float64)).
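For contrast with a view, a short sketch of the value-preserving conversion (the array contents are illustrative):

```python
import numpy as np

x = np.arange(4, dtype=np.int64)
f = x.astype(np.float64)   # new buffer, values converted one by one

print(f.tolist())               # [0.0, 1.0, 2.0, 3.0]
print(np.shares_memory(x, f))   # False: astype made a copy
```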
Complex Numbers
Complex numbers are stored (in C, Python, and numpy alike) as two floats. The first is the real part and the second is the imaginary part.
So, if we do:
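For example (the values are assumptions chosen for illustration):

```python
import numpy as np

# Python complex literals produce a complex128 array: two float64s per item.
x = np.array([0.5 + 1j, 1.0 + 2j, 3.0 + 0j])
print(x.dtype, x.itemsize)   # complex128 16
```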
We can see the real (x.real) and imaginary (x.imag) parts. If we convert this to a float, we’ll get a warning about discarding the imaginary part, and we’ll get an array with just the real part.
astype makes a copy and converts the values to the new type.
However, if we view this array as a float, we’ll get a sequence of item1.real, item1.imag, item2.real, item2.imag, ....
This yields:
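A sketch covering these steps, with the results as comments (the array values are assumptions chosen for illustration):

```python
import warnings
import numpy as np

x = np.array([0.5 + 1j, 1.0 + 2j, 3.0 + 0j])

print(x.real.tolist())   # [0.5, 1.0, 3.0]
print(x.imag.tolist())   # [1.0, 2.0, 0.0]

# astype converts the values into a new array and drops the imaginary parts.
# NumPy warns about that, so the warning is silenced for this demo.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    converted = x.astype(np.float64)
print(converted.tolist())   # [0.5, 1.0, 3.0]

# view reinterprets the same buffer: real and imaginary parts interleaved.
interleaved = x.view(np.float64)
print(interleaved.tolist())   # [0.5, 1.0, 1.0, 2.0, 3.0, 0.0]
```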
Each complex number is essentially two floats, so if we change how numpy interprets the underlying memory buffer, we get an array of twice the length.
Hopefully that helps clear things up a bit…