I’ve got a large set of data points (100,000+) stored in a 2-dimensional numpy

Question

Asked: May 28, 2026

I’ve got a large set of data points (100,000+) stored in a 2-dimensional numpy array (1st column: x coordinates, 2nd column: y coordinates). I’ve also got several 1-dimensional arrays storing additional information for each data point. I’d now like to create plots from subsets of these 1D arrays which include only the points which are in a given polygon.

I came up with the following solution which is neither elegant nor fast:

import numpy as np
import matplotlib.pylab

# XY is the 2D array.
# A is one of the 1D arrays.
# poly is a matplotlib.patches.Polygon

# Slow: tests each point individually in a Python-level loop.
mask = np.array([bool(poly.get_path().contains_point(i)) for i in XY])

matplotlib.pylab.hist(A[mask], 100)
matplotlib.pylab.show()

Could you please help me to improve this code? I tried playing around with np.vectorize instead of the list comprehension but could not manage to get it to work.

1 Answer

Editorial Team · Answer 1 · 2026-05-28T01:49:10+00:00

Use matplotlib.nxutils.points_inside_poly, which implements a very efficient test.

Examples and further explanation of this 40-year-old algorithm can be found in the matplotlib FAQ.

Update: Note that points_inside_poly is deprecated since version 1.2.0 of matplotlib. Use matplotlib.path.Path.contains_points instead.
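
As a rough illustration of the replacement mentioned in the update, the sketch below builds the mask with a single call to matplotlib.path.Path.contains_points instead of a Python-level loop. The sample data and the triangle vertices are made up for the example; XY and A stand in for the arrays from the question.

```python
import numpy as np
from matplotlib.path import Path

# Hypothetical sample data standing in for the question's arrays:
# 100,000 random points in the unit square plus one value per point.
rng = np.random.default_rng(0)
XY = rng.random((100_000, 2))
A = rng.random(100_000)

# Hypothetical polygon (a triangle) described by its vertices.
# The path is treated as closed, so the first vertex need not be repeated.
vertices = np.array([[0.2, 0.2], [0.8, 0.2], [0.5, 0.8]])
path = Path(vertices)

# One call tests all points and returns a boolean array of length len(XY).
mask = path.contains_points(XY)

# Values belonging to the points inside the polygon.
subset = A[mask]
```

If the polygon already exists as a matplotlib.patches.Polygon, as in the question, poly.get_path().contains_points(XY) should give the same kind of mask, assuming the polygon's vertices are in the same coordinates as XY. The resulting A[mask] can then be passed to hist exactly as in the original code.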
