I am having some seemingly trivial trouble with numpy when the array contains string data. I have the following code:
import numpy
my_array = numpy.empty([1, 2], dtype = str)
my_array[0, 0] = "Cat"
my_array[0, 1] = "Apple"
Now, when I print it with print my_array[0, :], the response I get is ['C', 'A'], which is clearly not the expected output of Cat and Apple. Why is that, and how can I get the right output?
Thanks!
Numpy requires string arrays to have a fixed maximum length. When you create an empty array with dtype=str, it sets this maximum length to 1 by default. You can see this if you do my_array.dtype; it will show “|S1”, meaning “one-character string”. Subsequent assignments into the array are truncated to fit this structure.
You can pass an explicit datatype with your maximum length by doing, e.g.:
my_array = numpy.empty([1, 2], dtype="S10")
The “S10” will create an array of length-10 strings. You have to decide what length is big enough to hold all the data you want to store; anything longer is still truncated.
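For what it's worth, here is a small sketch of the difference, assuming Python 3 and a reasonably recent NumPy. The variable names (truncated, sized, fitted, words) are only for illustration. It uses "U10" instead of "S10" because on Python 3 the "S" types hold bytes rather than text, and the last part shows one possible way to pick the length from the data instead of guessing:

```python
import numpy

# dtype=str with no length means one-character strings,
# so each assignment is silently cut down to its first character.
truncated = numpy.empty([1, 2], dtype=str)
truncated[0, 0] = "Cat"
truncated[0, 1] = "Apple"
print(truncated.dtype, truncated[0, :])

# An explicit length leaves room for up to 10 characters per element.
# "U10" is the unicode counterpart of "S10" (assumption: Python 3).
sized = numpy.empty([1, 2], dtype="U10")
sized[0, 0] = "Cat"
sized[0, 1] = "Apple"
print(sized.dtype, sized[0, :])

# If the data is known up front, the length can be taken from the longest item.
words = ["Cat", "Apple"]
width = max(len(w) for w in words)
fitted = numpy.empty([1, 2], dtype="U%d" % width)
fitted[0, :] = words
print(fitted.dtype, fitted[0, :])
```

The first array keeps only the first character of each value, while the other two keep the full strings. Sizing from the data only helps if nothing longer gets assigned later, since longer values would be truncated again.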