I’ve got a large set of data points (100,000+) stored in a 2-dimensional numpy array (1st column: x coordinates, 2nd column: y coordinates). I’ve also got several 1-dimensional arrays storing additional information for each data point. I’d now like to create plots from subsets of these 1D arrays which include only the points which are in a given polygon.
I came up with the following solution, which is neither elegant nor fast:
import numpy as np
import matplotlib.pylab

# XY is the 2D array.
# A is one of the 1D arrays.
# poly is a matplotlib.patches.Polygon.
mask = np.array([bool(poly.get_path().contains_point(i)) for i in XY])
matplotlib.pylab.hist(A[mask], 100)
matplotlib.pylab.show()
Could you please help me to improve this code? I tried playing around with np.vectorize instead of the list comprehension but could not manage to get it to work.
Use matplotlib.nxutils.points_inside_poly, which implements a very efficient test.
See the matplotlib FAQ for examples and further explanation of this 40-year-old algorithm.
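For intuition, the test is presumably the classic crossing-number (even-odd) rule: a point is inside if a ray cast from it crosses the polygon boundary an odd number of times. I have not checked the nxutils source, so treat the following as a rough NumPy sketch of that general technique, not as matplotlib's implementation. The function name `points_in_polygon` and the sample square are made up for illustration.

```python
import numpy as np

def points_in_polygon(points, vertices):
    """Even-odd (crossing-number) test for many points at once.

    Hypothetical helper for illustration only; points exactly on the
    boundary may be classified either way.
    """
    points = np.asarray(points, dtype=float)
    vertices = np.asarray(vertices, dtype=float)
    x, y = points[:, 0], points[:, 1]
    inside = np.zeros(len(points), dtype=bool)

    # Walk over every edge (xj, yj) -> (xi, yi), wrapping around at the end.
    xj, yj = vertices[-1]
    for xi, yi in vertices:
        # True where the edge straddles the horizontal line through the point.
        straddles = (yi > y) != (yj > y)
        # x coordinate at which the edge crosses that horizontal line.
        # Horizontal edges divide by zero here, but they never straddle,
        # so the resulting inf/nan values are masked out below.
        with np.errstate(divide="ignore", invalid="ignore"):
            x_cross = (xj - xi) * (y - yi) / (yj - yi) + xi
        # Each crossing to the right of the point toggles inside/outside.
        inside ^= straddles & (x < x_cross)
        xj, yj = xi, yi
    return inside

# Made-up example: a square from (2, 2) to (8, 8) and four test points.
square = [[2.0, 2.0], [8.0, 2.0], [8.0, 8.0], [2.0, 8.0]]
test_points = [[5.0, 5.0], [1.0, 1.0], [9.0, 5.0], [5.0, 9.0]]
result = points_in_polygon(test_points, square)
```

The loop runs once per polygon edge rather than once per point, which is why this scales much better than a Python-level list comprehension over 100,000+ points.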
Update: Note that points_inside_poly is deprecated since version 1.2.0 of matplotlib. Use matplotlib.path.Path.contains_points instead.
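As a minimal sketch of the `contains_points` approach, something like the following should work. The random `XY` and `A` arrays and the square polygon are stand-ins I made up so the snippet runs on its own; substitute your real data.

```python
import numpy as np
from matplotlib.path import Path

# Stand-in data (assumption): 100,000 random points and one attribute array.
rng = np.random.default_rng(0)
XY = rng.uniform(0.0, 10.0, size=(100_000, 2))
A = rng.normal(size=len(XY))

# Stand-in polygon (assumption): a square from (2, 2) to (8, 8).
# The path is treated as closed, so the first vertex need not be repeated.
vertices = np.array([[2.0, 2.0], [8.0, 2.0], [8.0, 8.0], [2.0, 8.0]])
path = Path(vertices)

# One vectorized call tests every point and returns a boolean mask.
mask = path.contains_points(XY)

# Select the attribute values belonging to the points inside the polygon.
subset = A[mask]
```

If you already have a matplotlib.patches.Polygon as in the question, `poly.get_path().contains_points(XY)` should be the direct replacement for the list comprehension, and `A[mask]` can then be passed to `hist` exactly as before.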