I’m working on a project in Python requiring a lot of numerical array calculations. Unfortunately (or fortunately, depending on your POV), I’m very new to Python, but have been doing MATLAB and Octave programming (APL before that) for years. I’m very used to having every variable automatically typed as a float matrix, and am still getting used to checking input types.
In many of my functions, I require the input S to be a numpy.ndarray of size (n,p), so I have to both test that type(S) is numpy.ndarray and get the values (n,p) = numpy.shape(S). One potential problem is that the input could be a list/tuple/int/etc.; another is that the input could be an array of shape (), i.e. S.ndim = 0. It occurred to me that I could simultaneously test the variable type, fix the S.ndim = 0 problem, and then get my dimensions like this:
# first simultaneously test for ndarray and get proper dimensions
try:
    if S.ndim == 0:
        S = S.copy()
        S.shape = (1, 1)
    # define dimensions p, and p2
    (p, p2) = numpy.shape(S)
except AttributeError:  # got here because input is not something array-like
    raise AttributeError("blah blah blah")
Though it works, I’m wondering if this is a valid thing to do? The docstring for ndim says
If it is not already an ndarray, a conversion is
attempted.
and we surely know that numpy can easily convert an int/tuple/list to an array, so I’m confused about why an AttributeError is being raised for these types of input, when numpy should be doing this
numpy.array(S).ndim;
which should work.
Given the comments on @larsmans’ answer, you could try:
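A minimal sketch of such a check, following the three steps described just below. The helper name get_dims and the error message are hypothetical, and it assumes the function really wants a 2-D array:

```python
import numpy as np

def get_dims(S):  # hypothetical helper name, for illustration only
    # 1. explicit type check; subclasses of ndarray pass as well
    if not isinstance(S, np.ndarray):
        raise TypeError("S must be a numpy.ndarray")
    # 2. promote a 0-d array to shape (1, 1); the caller's S keeps its shape
    if S.ndim == 0:
        S = np.reshape(S, (1, 1))
    # 3. read the dimensions from the attribute (fails if S is not 2-D)
    (p, p2) = S.shape
    return S, p, p2

S, p, p2 = get_dims(np.array(3.0))
print(S.shape, p, p2)
```

The AttributeError in the question comes from S.ndim being an attribute lookup on the input itself, so an int or a list simply has no such attribute. The docstring quoted there belongs to the function numpy.ndim(S), which does attempt the conversion first.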
First, you check explicitly whether S is a (subclass of) ndarray. Then, you use np.reshape to get a reshaped array if needed; this returns a new array object (a view on the same data where possible), so the shape of the caller’s S is left untouched. Finally, you get the dimensions.

Note that in most cases, the
np functions will first try to access the corresponding method of an ndarray, then attempt to convert the input to an ndarray (sometimes keeping it a subclass, as in np.asanyarray, sometimes not, as in np.asarray(...)). In other words, when you already know you have an ndarray, it’s more efficient to use the method or attribute rather than the function: that’s why we’re using S.shape and not np.shape(S).

Another point: the
np.asarray, np.asanyarray, np.atleast_1d… are all particular cases of the more generic function np.array. For example, asarray asks array not to copy unless it has to (copy=False before NumPy 2.0, copy=None since), asanyarray does the same and sets subok=True, atleast_1d sets ndmin=1, atleast_2d sets ndmin=2… In other words, you can always get the same result from np.array with the appropriate arguments. But as mentioned in some comments, it’s a matter of style. Shortcuts can often improve readability, which is always an objective to keep in mind.

In any case, when you use
np.array(..., copy=True), you’re explicitly asking for a copy of your initial data, a bit like doing list([...]). Even if nothing else changed, your data will be copied. That has the advantages of its drawbacks (as we say in French): you could, for example, change the order from row-first C to column-first F. But anyway, you get the copy you wanted.

With
np.array(input, copy=False), no copy is made unless one is needed. The result will either point to the same block of memory as input if the latter was already an ndarray and no dtype or order conversion is requested (that is, no waste of memory), or be created “from scratch” if input wasn’t. (From NumPy 2.0 on, spell this copy=None or use np.asarray: copy=False now means “never copy” and raises a ValueError when a copy can’t be avoided.) The interesting case is of course when input was an ndarray.

Using this array in a function may or may not change the original input, depending on the function. You have to check the documentation of the function you want to use to see whether it returns a copy or not. The NumPy developers try hard to limit unnecessary copies (following the Python example), but sometimes it can’t be avoided. The documentation should tell explicitly what happens; if it doesn’t or it’s unclear, please mention it.
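To make the copy-versus-shared-memory distinction concrete, here is a small sketch. It uses np.asarray for the “no copy unless needed” case because that spelling behaves the same across NumPy versions; the variable names are only for illustration:

```python
import numpy as np

a = np.arange(6.0).reshape(2, 3)

# Explicit copy: a separate block of memory, here also switched to column-first order.
c = np.array(a, copy=True, order="F")
# Conversion of an existing ndarray without forcing a copy: same memory as `a`.
v = np.asarray(a)
# Conversion of a non-array input: a fresh array has to be built.
n = np.asarray([[1.0, 2.0], [3.0, 4.0]])

print(np.shares_memory(a, c))   # False
print(np.shares_memory(a, v))   # True
print(c.flags["F_CONTIGUOUS"])  # True

v[0, 0] = 99.0  # writes through to `a`
c[0, 1] = -1.0  # leaves `a` alone
print(a[0, 0], a[0, 1])
```

After the two assignments, a[0, 0] has changed and a[0, 1] has not, which is exactly the “may or may not change the original input” situation described above.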
np.array(...) may raise some exceptions if something goes awry. For example, trying to use dtype=float with an input like ["STRING", 1] will raise a ValueError. However, I must admit I can’t remember which exceptions are raised in every case; please edit this post accordingly.
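A short sketch of catching that failure at the point of conversion. The helper name to_float_array is hypothetical, and catching TypeError alongside ValueError is an assumption about other inputs that cannot be converted, not an exhaustive list:

```python
import numpy as np

def to_float_array(data):  # hypothetical helper name, for illustration only
    """Convert data to a float ndarray, re-raising conversion errors with context."""
    try:
        return np.array(data, dtype=float)
    except (ValueError, TypeError) as err:  # TypeError is an assumed extra case
        raise type(err)(f"cannot convert input to a float array: {err}") from err

try:
    to_float_array(["STRING", 1])
except ValueError as err:
    print("ValueError:", err)
```

Doing the conversion in one place keeps the error message close to the offending input, rather than letting it surface later inside the numerical code.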