I need to implement an in-memory tuple-of-strings matching feature in C. There will be a large list of tuples associated with different actions, and a high volume of events to be matched against the list.
List of tuples:
('one', 'four') ('one') ('three') ('four', 'five') ('six')
The event ('one', 'two', 'three', 'four') should match list items ('one', 'four'), ('one') and ('three'), but not ('four', 'five') and not ('six').
My current approach uses a map with every tuple field value as a key, each mapping to a list of the tuples that use that value. There is a lot of redundant hashing and list insertion.
Is there a right or classic way to do this?
If you only have a small number of possible tuple values it would make sense to write some sort of hashing function which could turn them into integer indexes for quick searching.
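If there are no more than 32 distinct values you could do something with bitmasks: give each value one bit of a 32-bit integer, build one mask per tuple and one per event, and treat a tuple as matching when every bit of its mask is also set in the event's mask.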
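A minimal sketch of the bitmask idea, not the original code. The `vocab` table and the names `value_bit`, `make_mask` and `tuple_matches` are made up for illustration, and the linear `strcmp` lookup stands in for whatever hashing function you would really use. Event values that are not in the vocabulary are ignored, since no tuple can require them.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical vocabulary: every distinct tuple value gets one bit (at most 32). */
static const char *const vocab[] = { "one", "two", "three", "four", "five", "six" };
#define VOCAB_SIZE (sizeof vocab / sizeof vocab[0])

/* Return the bit assigned to s, or 0 if s is not in the vocabulary. */
static uint32_t value_bit(const char *s)
{
    for (size_t i = 0; i < VOCAB_SIZE; i++)
        if (strcmp(vocab[i], s) == 0)
            return (uint32_t)1 << i;
    return 0;
}

/* OR together the bits of all n fields of a tuple or an event. */
static uint32_t make_mask(const char *const *fields, size_t n)
{
    uint32_t mask = 0;
    for (size_t i = 0; i < n; i++)
        mask |= value_bit(fields[i]);
    return mask;
}

/* A tuple matches when every bit it requires is also set in the event. */
static int tuple_matches(uint32_t event_mask, uint32_t tuple_mask)
{
    return (event_mask & tuple_mask) == tuple_mask;
}
```

The tuple masks would be computed once when the list is built, so matching an event costs one `make_mask` call plus one AND and one compare per tuple.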
If there are too many values for a bitmask solution you could have an array of linked lists, indexed by each tuple's first key. Go through each item in the event. If the item matches a key_one, walk through the tuples with that first key and check the event for each tuple's second key.
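The array-of-linked-lists approach could look roughly like the sketch below. It assumes the strings have already been mapped to small integer ids, that tuples have one or two fields, and that an event never repeats a value. `NUM_VALUES`, `NO_KEY`, the `action` field and all function names are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_VALUES 64   /* assumed size of the value-id space */
#define NO_KEY (-1)     /* key_two of a single-field tuple */

struct tuple {
    int key_one;
    int key_two;        /* NO_KEY when the tuple has only one field */
    int action;         /* hypothetical id of the associated action */
    struct tuple *next;
};

/* buckets[v] heads the list of tuples whose first key is v. */
static struct tuple *buckets[NUM_VALUES];

/* Add a tuple to the bucket of its first key; returns 0 on success, -1 on error. */
static int add_tuple(int key_one, int key_two, int action)
{
    if (key_one < 0 || key_one >= NUM_VALUES)
        return -1;
    struct tuple *t = malloc(sizeof *t);
    if (t == NULL)
        return -1;
    t->key_one = key_one;
    t->key_two = key_two;
    t->action = action;
    t->next = buckets[key_one];
    buckets[key_one] = t;
    return 0;
}

/* Linear scan of the event for one value id. */
static bool event_has(const int *event, size_t n, int key)
{
    for (size_t i = 0; i < n; i++)
        if (event[i] == key)
            return true;
    return false;
}

/* Store the actions of matching tuples in out (at most max); return how many were stored. */
static size_t match_event(const int *event, size_t n, int *out, size_t max)
{
    size_t found = 0;
    for (size_t i = 0; i < n; i++) {
        if (event[i] < 0 || event[i] >= NUM_VALUES)
            continue;   /* no tuple can have this value as its first key */
        for (const struct tuple *t = buckets[event[i]]; t != NULL; t = t->next) {
            bool hit = t->key_two == NO_KEY || event_has(event, n, t->key_two);
            if (hit && found < max)
                out[found++] = t->action;
        }
    }
    return found;
}
```

Only the buckets for values that actually occur in the event are visited, so tuples such as ('six') are never examined for the example event.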
This code is in no way tested and probably has many small errors but you should get the idea. (one error that was corrected was the test condition for tuple match)
If event processing speed is of utmost importance, it would make sense to iterate through all of your constructed tuples, count how often each value occurs, and where needed swap key one and key two of a tuple so that its rarer value is listed first.
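One way to sketch that reordering pass, again assuming integer value ids in the range 0 to `NUM_VALUES - 1` and tuples of one or two fields. `struct pair`, `NO_KEY` and `order_rarest_first` are illustrative names, and the pass is meant to run before the tuples are put into their per-key lists.

```c
#include <stddef.h>

#define NUM_VALUES 64   /* assumed size of the value-id space */
#define NO_KEY (-1)     /* key_two of a single-field tuple */

struct pair {
    int key_one;
    int key_two;        /* NO_KEY when the tuple has only one field */
};

/* Count how often each value occurs across all tuples, then swap the keys
 * of any two-field tuple whose second key is rarer than its first. */
static void order_rarest_first(struct pair *tuples, size_t n)
{
    size_t count[NUM_VALUES] = { 0 };

    for (size_t i = 0; i < n; i++) {
        count[tuples[i].key_one]++;
        if (tuples[i].key_two != NO_KEY)
            count[tuples[i].key_two]++;
    }
    for (size_t i = 0; i < n; i++) {
        struct pair *t = &tuples[i];
        if (t->key_two != NO_KEY && count[t->key_two] < count[t->key_one]) {
            int tmp = t->key_one;
            t->key_one = t->key_two;
            t->key_two = tmp;
        }
    }
}
```

Putting the rarer value first keeps each per-key list short, so fewer tuples are walked for every value found in an event.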