I have a large pool of objects, each with a starting number and an ending number. For example:
(999, 2333, data)
(0, 128, data)
(235, 865, data)
...
Assume that the intervals don’t overlap with each other. I am writing a function that takes a number and locates the object whose (low, high) range contains it. Say, given 333, I want the 3rd object on the list.
Is there any way I can do this as efficiently as possible, short of linear search? I was thinking about binary search, but I am having some difficulty coping with the range check.
First of all, it is not at all clear that binary search is warranted here. It may well be that linear search is faster when the number of intervals is small.
If you’re concerned about performance, the prudent thing to do is to profile the code, and perhaps benchmark both methods on your typical inputs.
Disclaimers aside, binary search could be implemented by sorting the intervals once, and then repeatedly using the `bisect` module to do the search.
This approach assumes that the intervals are non-overlapping, and that each interval includes both its start and end points.
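A minimal sketch of that idea follows. It is an illustration under the assumptions just stated, not a definitive implementation: the name `find_interval` and the string payloads are made up for the example. The trick for the range check is to bisect on the end points only, which yields the first interval whose end is at or past the value, and then to confirm that this interval's start is not past the value.

```python
import bisect

# Sample data from the question; the string payloads are placeholders.
intervals = [(999, 2333, "a"), (0, 128, "b"), (235, 865, "c")]

# Sort once by start point. Because the intervals do not overlap,
# the end points end up in ascending order as well.
intervals.sort()


def find_interval(intervals, value):
    """Return the (low, high, data) tuple containing value, or None."""
    highs = [interval[1] for interval in intervals]
    # Index of the first interval whose end point is >= value.
    pos = bisect.bisect_left(highs, value)
    # That interval is a match only if it also starts at or before value.
    if pos < len(intervals) and intervals[pos][0] <= value:
        return intervals[pos]
    return None
```

With the sample data, `find_interval(intervals, 333)` returns `(235, 865, "c")`, while a value that falls in a gap, such as 200, returns `None`.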
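Lastly, to improve performance, one might consider factoring `[interval[1] for interval in intervals]` out of the lookup function and computing it just once at the start, since rebuilding that list on every call is itself linear in the number of intervals.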
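One way to do that, sketched here with a hypothetical `IntervalIndex` class (the class and method names are assumptions for illustration), is to sort the intervals and build the list of end points once in the constructor, so that each lookup costs only the binary search:

```python
import bisect


class IntervalIndex:
    """Lookup over non-overlapping, inclusive (low, high, data) intervals."""

    def __init__(self, intervals):
        # Sort by start point and precompute the end points a single time.
        self._intervals = sorted(intervals, key=lambda iv: iv[0])
        self._highs = [iv[1] for iv in self._intervals]

    def find(self, value):
        """Return the interval containing value, or None if there is none."""
        pos = bisect.bisect_left(self._highs, value)
        if pos < len(self._intervals) and self._intervals[pos][0] <= value:
            return self._intervals[pos]
        return None


index = IntervalIndex([(999, 2333, "a"), (0, 128, "b"), (235, 865, "c")])
match = index.find(333)  # the interval (235, 865, "c")
```

This trades a little extra memory for lookups that no longer rebuild the list, which only matters if the same pool of intervals is queried many times.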