Given a string parsed from an input file that holds a number prefixed by a mathematical operator (<, >, <=, >=, !, !=, and a few others), what is a really fast, efficient way to chop off that operator, check it against a list of valid operators, set an “operator” variable to a state (i.e., an Enum value) representing the identified operator, and return just the number (as a string)?
I’m open to various ideas and implementations. I’ve tried several (about 6-7) myself, and I’m not really satisfied with the speed of any of them. The fastest so far is a For Each loop that walks my list of “valid operators” and compares each operator’s string representation against the leading characters chopped off the numeric string. The number of characters to chop off is the length of the valid operator currently being tested.
Here’s a code example of the fastest implementation. Assume input like <378 with a valid ops list of <, >, !, or input like >=79 with a valid ops list of <=, >=:
Friend Function FindMatchingOp(ByVal Haystack As String,
                               ByVal ValidOps() As <OperatorType>) As String
    Dim tmpBit As String, tmpOpName As String, tmpOpLen As Int32
    For Each tmpOp As <OperatorType> In ValidOps
        tmpOpName = tmpOp.Name
        tmpOpLen = tmpOpName.Length
        ' Take as many leading characters as this operator is long.
        tmpBit = Strings.Left(Haystack, tmpOpLen)
        If String.Equals(tmpBit, tmpOpName) Then
            <Code to set the correct operator>
            ' Return the numeric part; Return also leaves the loop,
            ' so no Exit For is needed after it.
            Return Haystack.Remove(0, tmpOpLen)
        End If
    Next
    ' No valid operator matched.
    Return vbNullString
End Function
Not all of the numeric strings I expect to parse will use the same math operators (hence the need for the ValidOps variable). Some might only support < and >, others might use <=, >=, and !. This is why I cannot hardcode the assumption that the operator is only one character long, and have to test for both one- and two-character operators. I believe it’s these specific string checks that slow my other implementations down.
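Because the operators listed above are at most two characters long and the second character is always "=", the check can be done on individual characters rather than on substrings. The following is only a sketch under that assumption. The MathOp enum, its member names, and SplitOperator are hypothetical names introduced for illustration, and the set of valid operators is passed as a flags mask rather than as an array.

```vb
Imports System

Module CharDispatchSketch
    ' Hypothetical flags enum; the real operator type is not shown in the question.
    <Flags()> Public Enum MathOp
        None = 0
        LessThan = 1
        GreaterThan = 2
        LessOrEqual = 4
        GreaterOrEqual = 8
        Negate = 16      ' "!"
        NotEqual = 32    ' "!="
    End Enum

    ' Returns the numeric part and sets op, or returns Nothing (op = None)
    ' when the string does not start with an allowed operator.
    Public Function SplitOperator(ByVal source As String,
                                  ByVal allowed As MathOp,
                                  ByRef op As MathOp) As String
        op = MathOp.None
        If String.IsNullOrEmpty(source) Then Return Nothing

        ' Assumption: every two-character operator ends in "=".
        Dim hasEq As Boolean = source.Length > 1 AndAlso source(1) = "="c
        Dim found As MathOp
        Select Case source(0)
            Case "<"c
                found = If(hasEq, MathOp.LessOrEqual, MathOp.LessThan)
            Case ">"c
                found = If(hasEq, MathOp.GreaterOrEqual, MathOp.GreaterThan)
            Case "!"c
                found = If(hasEq, MathOp.NotEqual, MathOp.Negate)
            Case Else
                Return Nothing
        End Select

        ' Reject operators that this particular input does not support.
        If (allowed And found) = MathOp.None Then Return Nothing
        op = found
        Return source.Substring(If(hasEq, 2, 1))
    End Function
End Module
```

With this shape, the only string allocated per call is the returned number. Whether it is actually faster than the array loop would need to be measured on real input.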
I’ve also tried putting ValidOps into things like a Dictionary, Hashtable, ListDictionary, and even an ArrayList. The standard array beats all of them every time.
Thoughts?
PS, VB code only, please, in any advice or solutions.
EDIT: Not going to work for me.
I am going to try and implement a Trie to handle this and see what its performance is. I got the idea from this StackOverflow question.
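For reference, a minimal trie over operator strings could look roughly like the sketch below. TrieNode, BuildTrie, and LongestOperator are hypothetical names, and operators are represented as plain strings rather than the real operator type. With operators of only one or two characters, the dictionary lookups may well cost more than the simple array loop, so this is something to benchmark rather than assume.

```vb
Imports System
Imports System.Collections.Generic

Module OperatorTrieSketch
    ' One node per character; OpName is set only where an operator ends.
    Public Class TrieNode
        Public ReadOnly Children As New Dictionary(Of Char, TrieNode)()
        Public OpName As String = Nothing
    End Class

    ' Builds a trie from the valid operator strings, e.g. {"<", "<=", "!"}.
    Public Function BuildTrie(ByVal ops As IEnumerable(Of String)) As TrieNode
        Dim root As New TrieNode()
        For Each opName As String In ops
            Dim node As TrieNode = root
            For Each c As Char In opName
                Dim nextNode As TrieNode = Nothing
                If Not node.Children.TryGetValue(c, nextNode) Then
                    nextNode = New TrieNode()
                    node.Children.Add(c, nextNode)
                End If
                node = nextNode
            Next
            node.OpName = opName
        Next
        Return root
    End Function

    ' Follows the leading characters of source through the trie and returns
    ' the longest operator found, or Nothing if none matches.
    Public Function LongestOperator(ByVal root As TrieNode,
                                    ByVal source As String) As String
        Dim node As TrieNode = root
        Dim best As String = Nothing
        For i As Integer = 0 To source.Length - 1
            Dim nextNode As TrieNode = Nothing
            If Not node.Children.TryGetValue(source(i), nextNode) Then Exit For
            node = nextNode
            If node.OpName IsNot Nothing Then best = node.OpName
        Next
        Return best
    End Function
End Module
```

The trie would be built once per valid-operator list and reused for every input line. The numeric part is then source.Substring(best.Length) whenever best is not Nothing.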
You could somewhat improve your function by changing:
to…
But that is probably going to be marginal. All you’ll gain is that the intermediate strings are no longer created.
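One way to avoid the intermediate strings, offered here only as an illustration and not necessarily the exact change this answer had in mind, is to test the prefix with String.StartsWith and an ordinal comparison. OpInfo below is a hypothetical stand-in for the question's operator type, and the matched operator is handed back through a ByRef parameter.

```vb
Imports System

Module StartsWithSketch
    ' Hypothetical stand-in for the question's <OperatorType>.
    Public Class OpInfo
        Public ReadOnly Name As String
        Public Sub New(ByVal opName As String)
            Name = opName
        End Sub
    End Class

    ' Returns the numeric part, or Nothing when no operator matches.
    ' matched receives the operator that was found.
    Public Function FindMatchingOp(ByVal haystack As String,
                                   ByVal validOps() As OpInfo,
                                   ByRef matched As OpInfo) As String
        For Each op As OpInfo In validOps
            ' Ordinal comparison skips culture-sensitive rules, and no
            ' temporary prefix string is created for the test.
            If haystack.StartsWith(op.Name, StringComparison.Ordinal) Then
                matched = op
                Return haystack.Substring(op.Name.Length)
            End If
        Next
        matched = Nothing
        Return Nothing
    End Function
End Module
```

If a valid list ever contains both < and <=, the longer operator has to come first in the array, otherwise <=79 would match < and return =79.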