Is there an easy way to do a cumulative plot of many variables with matplotlib/numpy?
I’m thinking of a graph like this one: http://atlassian.wpengine.netdna-cdn.com/jira/cumulative-flow-diagram.png
For example, I have data a=[0,3,6], b=[0,3,4], and this is supposed to become a count plot
[(0, a=1, b=1), (3, a=2, b=2), (4, a=2, b=3), (6, a=3, b=3)]. So there is no binning; instead, every x value gets a point with the count of each variable's values at or below that x. The a and b values should be stacked on top of each other.
I can imagine how to implement a complicated interleaving preprocessing step with bisect, but I can’t see an easy solution.
Any suggestions?
EDIT: Another explanation of the accumulated counting:
I have multiple data rows with x values, e.g. a=[0,3,6], b=[0,3,4], c=[1, 7].
I need a graph for each data row. The possible x coordinates for the plot are the union of all data row values, here [0,1,3,4,6,7].
For each x value in this union, the y value for a particular row is the number of values in that row that are at or below x. So for the x coordinates x=[0,1,3,4,6,7] I’d get ya=[1,1,2,2,3,3], yb=[1,1,2,3,3,3], yc=[0,1,1,1,1,2].
And of course I will use the stacked plot 🙂
The type of plot in your link can be made with stackplot. See e.g. this example from the gallery: http://matplotlib.org/examples/pylab_examples/stackplot_demo.html
What you mean by the example data is not entirely clear to me. Can you give a more elaborate example of the data you have and what you want to obtain?
EDIT: A very simple approach:
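A minimal sketch of one such approach, assuming plain Python lists and the example rows a, b, c from the question. The function names `cumulative_counts` and `plot_cumulative` are made up for illustration. Each row is sorted once, and `bisect_right` gives the number of values at or below each x in the union:

```python
from bisect import bisect_right


def cumulative_counts(rows):
    """Return the shared x grid and, per row, the count of values <= each x.

    `rows` maps a label to a list of x values (hypothetical input layout).
    """
    # The sorted union of all values is the common x axis.
    xs = sorted(set().union(*rows.values()))
    counts = {}
    for name, values in rows.items():
        ordered = sorted(values)
        # bisect_right(ordered, x) == number of entries in `ordered` that are <= x
        counts[name] = [bisect_right(ordered, x) for x in xs]
    return xs, counts


def plot_cumulative(rows):
    """Draw the stacked cumulative counts (assumes matplotlib is installed)."""
    import matplotlib.pyplot as plt

    xs, counts = cumulative_counts(rows)
    fig, ax = plt.subplots()
    ax.stackplot(xs, *counts.values(), labels=list(counts.keys()))
    ax.legend(loc="upper left")
    plt.show()


rows = {"a": [0, 3, 6], "b": [0, 3, 4], "c": [1, 7]}
xs, counts = cumulative_counts(rows)
```

Calling `plot_cumulative(rows)` should then draw the three series stacked on the shared x grid.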
There are probably more efficient ways, though. Do you use numpy or pandas for the analysis, or plain lists?
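If numpy is an option, one possible vectorized variant is sketched below. It assumes the same example rows, and the helper name `cumulative_counts_np` is hypothetical. `np.unique` builds the sorted union of x values, and `np.searchsorted` with `side="right"` counts the values at or below each x:

```python
import numpy as np


def cumulative_counts_np(rows):
    """Return (xs, ys): xs is the sorted union of all values, ys has one row
    of cumulative counts per input row, in the insertion order of `rows`."""
    arrays = [np.sort(np.asarray(values)) for values in rows.values()]
    # np.unique returns the sorted unique values of the concatenated data.
    xs = np.unique(np.concatenate(arrays))
    # side="right" makes the insertion index equal to the count of values <= x.
    ys = np.vstack([np.searchsorted(arr, xs, side="right") for arr in arrays])
    return xs, ys


rows = {"a": [0, 3, 6], "b": [0, 3, 4], "c": [1, 7]}
xs, ys = cumulative_counts_np(rows)
```

The resulting 2D array can be passed straight to matplotlib, e.g. `ax.stackplot(xs, ys, labels=list(rows))`.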