I’d like to convert a list of record arrays — dtype is (uint32, float32) — into a numpy array of dtype np.object:
X = np.array(instances, dtype = np.object)
where instances is a list of arrays with data type np.dtype([('f0', '<u4'), ('f1', '<f4')]).
However, the above statement results in an array whose elements are also of type np.object:
X[0]
array([(67111L, 1.0), (104242L, 1.0)], dtype=object)
Does anybody know why?
The following statement should be equivalent to the above but gives the desired result:
X = np.empty((len(instances),), dtype = np.object)
X[:] = instances
X[0]
array([(67111L, 1.0), (104242L, 1.0)], dtype=[('f0', '<u4'), ('f1', '<f4')])
thanks & best regards,
peter
Stéfan van der Walt (a numpy developer) explains:
When you say something like
Y = np.array(instances, dtype = np.object)
np.array is forced to guess the dimension of the array you desire. instances is a list of two objects, each of length 2. So, quite reasonably, np.array guesses that Y should have shape (2, 2). In most cases, I think that is what would be desired. However, in your case, since this is not what you desire, you must construct the array explicitly:
X = np.empty((len(instances),), dtype = np.object)
Now there is no question about X's shape: (2,). So when you feed in the data
X[:] = instances
numpy is smart enough to regard instances as a sequence of two objects.
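For anyone who wants to try both behaviours side by side, here is a minimal sketch. It makes a few assumptions: the name rec_dtype and the second sample array are made up for illustration, the integer literals drop the Python 2 L suffix, and dtype=object stands in for np.object, since that alias was removed in NumPy 1.24.

```python
import numpy as np

# Record dtype from the question: (uint32, float32).
rec_dtype = np.dtype([('f0', '<u4'), ('f1', '<f4')])

# Two sample record arrays of length 2.
# The values in the second one are invented for illustration.
instances = [
    np.array([(67111, 1.0), (104242, 1.0)], dtype=rec_dtype),
    np.array([(67112, 2.0), (104243, 2.0)], dtype=rec_dtype),
]

# np.array has to guess the shape: a list of two length-2 arrays
# is read as a 2 x 2 grid of objects.
Y = np.array(instances, dtype=object)
print(Y.shape)      # (2, 2)

# Allocating the container first fixes the shape, so the assignment
# treats each record array as one element.
X = np.empty((len(instances),), dtype=object)
X[:] = instances
print(X.shape)      # (2,)
print(X[0].dtype)   # the record dtype is preserved on each element
```

If you want to avoid relying on how slice assignment interprets the list, assigning element by element in a loop (X[i] = instances[i]) should give the same result.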