I have a program that handles several lists of elements (always with length > 4) that can have either an “up” or a “down” property each.
To put it into code:
mylist = [element1, element2, element3]
and each element is a list of “up” or “down” values (a simplification of the actual problem):
element1 = ["up", "down", "up", "up"]
element2 = ["down", "down", "down", "down", "up"]
element3 = ["up", "up", "down", "down", "up", "up", "up"]
What I’m trying to find out is whether there’s an algorithm or some method to infer a score that may be indicative of “direction” for the list itself by using the counts of “up” and “down” elements. The existing code (which I didn’t write) used a simple comparison of those two counts:
def direction(count_up, count_down):
    if count_up > count_down:
        return "up"
    elif count_down > count_up:
        return "down"
    # equal counts fall through: no direction is returned
Of course, this is badly prone to size effects (some lists have almost 100 elements, others just 5), and it also fails when both counts are equal. I’d prefer a numerical score. I looked at the Wilson score (the one used by Reddit), but as far as I can tell it assumes success/failure outcomes, while the two states I mentioned can’t be defined like that.
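For reference, here is a minimal sketch of how the counts could be obtained from the example lists and fed into that comparison. The helper names count_directions and majority_direction are made up for illustration and are not taken from the existing code.

```python
from collections import Counter


def count_directions(element):
    """Return (count_up, count_down) for a list of "up"/"down" strings."""
    counts = Counter(element)
    return counts["up"], counts["down"]


def majority_direction(element):
    """Simple comparison of the two counts; returns None on a tie."""
    count_up, count_down = count_directions(element)
    if count_up > count_down:
        return "up"
    if count_down > count_up:
        return "down"
    return None  # equal counts: the comparison has no answer


element1 = ["up", "down", "up", "up"]
element2 = ["down", "down", "down", "down", "up"]
element3 = ["up", "up", "down", "down", "up", "up", "up"]
tied = ["up", "down", "up", "down", "up", "down"]  # hypothetical tie case

for element in (element1, element2, element3, tied):
    print(count_directions(element), majority_direction(element))
```

The tied list shows the second problem directly: the comparison yields nothing, and the result says nothing about how strong the majority is in the other cases.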
Is there any existing statistic that I can use for this?
My immediate reaction would be something like
(number_up - number_down) / (number_up + number_down)
This basically gives the net direction as a fraction of the whole, from -1 (all down) to 1 (all up). The obvious shortcoming is that for a really short list, the value can be pretty high from a fairly small absolute difference (e.g., 3 up, 1 down gives 0.5).
Edit: One possible way to keep small lists from excessively impacting overall scores is to add a couple of constants (min_denom and factor) to the equation.
This lets you take both relative and absolute differences into account to some degree. For example, with 3 up/1 down, it’ll give 0.833. With 6 up/2 down (same ratio, but twice as many of each) it’ll give 1.4. At the same time, relative differences are still taken into account, so (for example) 10 up/1 down will give 2.9.
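Both scores can be sketched in a few lines of Python. The exact form of the expression with the constants, and the default values min_denom = 2 and factor = 10, are assumptions: they were chosen because they reproduce the three figures quoted above (0.833, 1.4, 2.9). The function names are illustrative as well.

```python
def relative_score(number_up, number_down):
    """Net direction as a fraction of the total, in the range -1..1.

    Assumes at least one element, so the denominator is never zero.
    """
    return (number_up - number_down) / (number_up + number_down)


def damped_score(number_up, number_down, min_denom=2, factor=10):
    """Difference divided by a denominator that never drops below min_denom.

    ASSUMPTION: this form and the default constants are a reconstruction
    that matches the worked figures in the text, not a confirmed formula.
    """
    total = number_up + number_down
    return (number_up - number_down) / (min_denom + total / factor)


for up, down in [(3, 1), (6, 2), (10, 1)]:
    print(up, down,
          round(relative_score(up, down), 3),
          round(damped_score(up, down), 3))
# 3 1 0.5 0.833
# 6 2 0.5 1.429
# 10 1 0.818 2.903
```

Under these assumptions the first two cases share a relative score of 0.5 but get different damped scores, and the damped score stays below factor in absolute value however long the list gets.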
In effect, this retains the same general idea, but allows you to pick the degree (adjustable by changing min_denom) to which you give extra weight to larger samples. Strictly speaking, factor isn’t entirely necessary; it just helps keep the results in a convenient range.
Of course, this may not be appropriate: for what you’re dealing with, a sample of four may carry the same weight as a sample of 100. Another possible shortcoming is that the result values become more open-ended, instead of a nice, neat -1..1.