I’m just starting with the scipy stack. I’m using the iris dataset, in a CSV version. I can load it just fine using:
iris=numpy.recfromcsv("iris.csv")
and plot it:
pylab.scatter(iris.field(0), iris.field(1))
pylab.show()
Now I’d like to also plot the classes, which are stored in iris.field(4):
chararray(['setosa', ...], dtype='|S10')
What is an elegant way to map these strings to colors for plotting? scatter(iris.field(0), iris.field(1), c=iris.field(4)) does not work (according to the docs, it expects float values or a colormap). I haven’t found an elegant way of automatically generating a color map.
cols = {"versicolor": "blue", "virginica": "green", "setosa": "red"}
scatter(iris.field(0), iris.field(1), c=map(lambda x:cols[x], iris.field(4)))
does approximately what I want, but I don’t like the manual color specification too much.
Edit: slightly more elegant version of the last line:
scatter(iris.field(0), iris.field(1), c=map(cols.get, iris.field(4)))
For whatever it’s worth, you’d typically do something more like this in that case:
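A minimal sketch of that approach, assuming a small synthetic stand-in for the iris table. The arrays `x`, `y`, `classes` and the helper name `plot_by_class` are made up for illustration; with the real data you would pass the corresponding fields of `iris` instead.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt


def plot_by_class(x, y, classes):
    """Call plot() once per class so each class takes the next color in the axes cycle."""
    fig, ax = plt.subplots()
    for name in np.unique(classes):
        mask = classes == name
        # linestyle="none" plus a marker gives scatter-like points
        ax.plot(x[mask], y[mask], linestyle="none", marker="o", label=str(name))
    ax.legend(numpoints=1)
    return fig, ax


# Hypothetical stand-in data, not the real iris measurements
x = np.array([5.1, 4.9, 7.0, 6.4, 6.3, 5.8])
y = np.array([3.5, 3.0, 3.2, 3.2, 3.3, 2.7])
classes = np.array(["setosa", "setosa", "versicolor",
                    "versicolor", "virginica", "virginica"])

fig, ax = plot_by_class(x, y, classes)
```

To view the result, call plt.show() with an interactive backend or save it with fig.savefig(...).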
There’s nothing wrong with what @Yann suggested, but scatter is better suited for continuous data. It’s easier to rely on the axes color cycle and just call plot multiple times (you also get separate artists instead of a collection, which is a good thing for discrete data such as this).
By default, the color cycle for an axes is: blue, green, red, cyan, magenta, yellow, black (matplotlib 2.0 and later default to a ten-color cycle instead).
After 7 calls to plot, it will cycle back over those colors, so if you have more items, you’ll need to set it manually (or just specify the color in each call to plot using an interpolated colorbar similar to what @Yann suggested above).
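As a rough sketch of specifying the color in each call, the snippet below samples evenly spaced colors from a colormap, one per class. The class labels, the random points, and the choice of the "viridis" colormap are assumptions for illustration only.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Hypothetical labels: more classes than a short default cycle would cover
names = ["class_%d" % i for i in range(10)]
rng = np.random.default_rng(0)

# Sample one RGBA color per class at evenly spaced points of the colormap
cmap = plt.get_cmap("viridis")
colors = cmap(np.linspace(0.0, 1.0, len(names)))

fig, ax = plt.subplots()
for name, color in zip(names, colors):
    x, y = rng.normal(size=(2, 20))  # made-up points for this class
    ax.plot(x, y, linestyle="none", marker="o", color=color, label=name)
ax.legend(numpoints=1)
```

Because each color is passed explicitly, the result does not depend on the length of the default color cycle.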