Consider a vector V riddled with noisy elements. What would be the fastest (or any) way to find a reasonable maximum element?
For example:
V = [1 2 3 4 100 1000]
rmax = 4;
I was thinking of sorting the elements and taking the second difference, i.e. diff(diff(unique(V))).
EDIT: Sorry about the delay.
I can’t post any representative data since it contains 6.15e5 elements. But here’s a plot of the sorted elements.
By just looking at the plot, a piecewise linear function may work.
Anyway, regarding my previous conjecture about using differentials, here’s a plot of diff(sort(V));
I hope it’s clearer now.
EDIT: Just to be clear, the desired “maximum” value would be the value right before the step in the plot of the sorted elements.


NEW ANSWER:
Based on your plot of the sorted amplitudes, your diff(sort(V)) algorithm would probably work well. You would simply have to pick a threshold for what constitutes “too large” a difference between the sorted values. The first point in your diff(sort(V)) vector that exceeds that threshold is then used to get the threshold to use for V.

Another alternative, if you’re interested in toying with it, is to bin your data using HISTC. You would end up with groups of highly-populated bins at both low and high amplitudes, with sparsely-populated bins in between. It would then be a matter of deciding which bins you count as part of the low-amplitude group (such as the first group of bins that contain at least X counts).
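
Both ideas can be sketched roughly as follows. This is only an illustration under assumptions: the toy vector, gapThreshold, nBins, and minCount are made-up values that you would have to tune against your own data, and the variable names are mine.

```matlab
% Toy data (assumed): a dense low-amplitude cluster plus two large outliers.
V = [1 2 3 4 100 1000 2 3 1 4];

% --- Approach 1: first large jump in the sorted values ---
sortedV = sort(V);
gaps = diff(sortedV);                    % differences between neighbours
gapThreshold = 10;                       % assumed: what counts as "too large"
iJump = find(gaps > gapThreshold, 1);    % first gap exceeding the threshold
if isempty(iJump)
    rmaxDiff = sortedV(end);             % no step found: keep the true maximum
else
    rmaxDiff = sortedV(iJump);           % value right before the step
end

% --- Approach 2: bin the data with HISTC ---
nBins = 50;                              % assumed number of bins
minCount = 2;                            % assumed "X counts" per populated bin
edges = linspace(min(V), max(V), nBins + 1);
[counts, binIndex] = histc(V, edges);    % binIndex: bin number of each element

populated = counts >= minCount;
firstBin = find(populated, 1);           % start of the low-amplitude group
if isempty(firstBin)
    rmaxHist = max(V);                   % no populated bin: nothing to separate
else
    % The group ends just before the first sparsely-populated bin after it.
    offset = find(~populated(firstBin:end), 1);
    if isempty(offset)
        lastBin = numel(counts);
    else
        lastBin = firstBin + offset - 2;
    end
    rmaxHist = max(V(binIndex >= firstBin & binIndex <= lastBin));
end
```

For this toy vector both approaches give 4. On real data the result depends heavily on the threshold and the bin width, so it is worth checking the chosen value against the plot of the sorted elements.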
OLD ANSWER (for posterity):
Finding a “reasonable maximum element” is wholly dependent upon your definition of reasonable. There are many ways you could define a point as an outlier, such as simply picking a set of thresholds and ignoring everything outside of what you define as “reasonable”. Assuming your data has a normal-ish distribution, you could probably use a simple data-driven thresholding approach for removing outliers from a vector V using the functions MEAN and STD.
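
A minimal sketch of that kind of thresholding, assuming a cutoff of 2 standard deviations around the mean (nStd and the toy data are assumptions chosen for illustration, not a recommendation):

```matlab
% Toy data (assumed): roughly normal values around 10 plus one large outlier.
V = [10 11 9 10 12 8 10 11 9 10 500];

nStd = 2;                                % assumed cutoff, in standard deviations
mu = mean(V);
sigma = std(V);

% Keep only the elements within nStd standard deviations of the mean.
inliers = V(abs(V - mu) <= nStd*sigma);
rmax = max(inliers);                     % "reasonable" maximum
```

Here the outlier 500 is discarded and rmax is 12. Keep in mind that the outliers themselves inflate both the mean and the standard deviation, so with several extreme values a single pass like this may leave some of them in.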