I’m trying to measure the performance of a computer vision program that detects objects in video. I have 3 different versions of the program, each with different parameters.
I’ve benchmarked each of these versions and got 3 pairs of (false positive percent, false negative percent).
Now I want to compare the versions with each other, and I wonder if it makes sense to combine false positives and false negatives into a single value and use that for the comparison. For example, take the ratio falsePositives/falseNegatives and see which is smaller.
A couple of other possible solutions:
-Your false-positive rate (fp) and false-negative rate (fn) may depend on a decision threshold. If you sweep that threshold and plot (1-fn) on the y-axis against fp on the x-axis, you get the Receiver Operating Characteristic (ROC) curve. The Area Under the ROC Curve (AUC) is one popular measure of quality.
-AUC can be weighted, or restricted to part of the curve, if certain regions are of particular interest (for example, only the low-fp range).
-Report the Equal-Error Rate (EER). At some threshold, fp = fn; report that common value.
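
As a rough sketch of the ROC, AUC and EER ideas above, here is one way to compute them in plain Python. It assumes your detector outputs a confidence score per candidate detection (higher means "more likely an object") and that you have ground-truth labels for each candidate. The function names and the toy `scores`/`labels` data are made up for illustration, not taken from your program.

```python
def roc_points(scores, labels):
    """Sweep the threshold from high to low; return (fp_rate, tp_rate) points.

    tp_rate is (1 - fn), so these are the points of the ROC curve.
    labels are 1 for a real object, 0 otherwise.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    pairs = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(pairs):
        s = pairs[i][0]
        # Candidates with the same score cross the threshold together.
        while i < len(pairs) and pairs[i][0] == s:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def equal_error_rate(points):
    """Error rate where fp == fn, linearly interpolated along the curve."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1 - y0)  # fp - fn at the start of the segment
        d1 = x1 - (1 - y1)  # fp - fn at the end of the segment
        if d0 <= 0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    return None


# Toy data (hypothetical): 4 real objects, 4 non-objects.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
print("AUC:", auc(points))
print("EER:", equal_error_rate(points))
```

On this toy data the AUC comes out to 0.8125 and the EER to 0.25. Running the same thing on each of your 3 versions gives you one number per version to compare. This only works if each version exposes a score you can threshold; with just a single (fp, fn) pair per version you have one point on each curve, not the curve itself.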