I recently read that a method involving hashing could be a good way to find the mode of an array of floating-point numbers. I don’t know much about hashing or its applications, and the text didn’t explain further.
Does anyone know what this method would involve?
I’m not much of a statistics guy, but the mode in this sense seems to be defined as ‘the value that occurs the most frequently’.
From that description, the obvious way to implement this with a hash table would be to use each number as a key and store a frequency counter as its value.
Then just loop through each number. In Python pseudo-code this would be something like:
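A minimal sketch of that counting loop. The function name count_occurrences and the sample list are assumptions made here for illustration, not part of any library:

```python
def count_occurrences(numbers):
    """Map each distinct number to how many times it appears."""
    counts = {}  # hash table: number -> frequency
    for number in numbers:
        if number in counts:
            counts[number] += 1  # seen before: bump the counter
        else:
            counts[number] = 1   # first occurrence
    return counts

# Hypothetical sample data for illustration
counts = count_occurrences([1.5, 2.0, 1.5, 3.25, 1.5, 2.0])
```

Note that floats are compared exactly here, so two values that differ only by rounding error end up as separate keys.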
Then you’d need to extract the greatest value, and find its corresponding key:
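One way to sketch that step, assuming a counts dictionary that maps each number to its frequency (the literal values below are made up for illustration):

```python
# Hypothetical frequency table: number -> count
counts = {1.5: 3, 2.0: 2, 3.25: 1}

# Build (count, number) pairs, sort them by count, and take the number
# from the last pair. If counts tie, the larger number sorts last.
mode = sorted((count, number) for number, count in counts.items())[-1][1]
```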
This creates a sorted list of (value, key) pairs, sorted on value, and then extracts the key ([1]) from the last ([-1]) pair.
I didn’t bother using a defaultdict or equivalent; for illustration I think the explicit if statement is rather helpful.
NOTE: I make no guarantees that this is the canonical way of solving the problem. This is just what I’d do, off the top of my head, if someone asked me to solve the problem (and maybe hinted that using a hash is a good idea).
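For comparison, a more compact sketch of the same idea using collections.defaultdict and max, which avoids both the explicit membership test and the sort. The name find_mode is an assumption chosen for illustration:

```python
from collections import defaultdict

def find_mode(numbers):
    """Return a most frequent value in numbers (raises ValueError if empty)."""
    counts = defaultdict(int)  # missing keys start at 0
    for number in numbers:
        counts[number] += 1
    # Pick the key whose stored count is largest
    return max(counts, key=counts.get)

# Hypothetical sample data for illustration
mode = find_mode([1.5, 2.0, 1.5, 3.25, 1.5, 2.0])
```

This makes a single pass to count and a single pass to find the maximum, so it runs in roughly linear time, whereas sorting the pairs costs O(n log n) in the number of distinct values.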