I am scanning a large amount of data and looking for common trends in it. Every time I encounter a recurrence of a unit, I want to increment its count. What is the best data structure or way to hold this data? I need to be able to search it quickly, and also keep a count with each unit of data.
You didn’t specify a language, but a hash (associative array) is your best data structure.
It is sometimes called a map or hashmap, depending on the language (Java has HashMaps, Perl has hashes, and so on).
A hash/associative array/map data structure consists of a set of key-value pairs, with the values settable/gettable by the key. In your case, the key will be a string representing a word, a byte, or a double word (use three separate hashmaps, one per unit type), and the value will be the number of times that key has occurred.
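Since no language was specified, here is a minimal sketch in Python, where the built-in `dict` plays the role of the hash. The function name `count_units` and the sample input are made up for illustration; the same pattern should carry over to a Java `HashMap` or a Perl hash.

```python
def count_units(units):
    """Return a dict mapping each unit to the number of times it occurs."""
    counts = {}
    for unit in units:
        # dict.get returns 0 for a unit not seen before, so the first
        # occurrence sets its count to 1 and later ones increment it.
        counts[unit] = counts.get(unit, 0) + 1
    return counts


# Hypothetical sample data: words as the units being counted.
words = "the cat saw the dog and the bird".split()
word_counts = count_units(words)

# Lookup by key is a hash lookup, so it stays fast as the data grows.
print(word_counts["the"])
```

Both the increment and the lookup take constant time on average, which covers the "search it quickly" requirement. For bytes or double words you would keep a separate dict for each, keyed by that unit type.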