I want to determine a message’s type as fast as possible. Each message begins with a constant string followed by a single space. The constant strings come from a known list, such as “CUT”, “GET”, “LOGIN” …
So I would rather not call memcmp(data, “GET”, 3) repeatedly, once per candidate, which is bad for performance. Is there a better solution? Maybe I could compile the list of constant strings into a DFA for fast matching, but I do not know how to do that. Are there other, better approaches?
Would it be possible to use a lexer for this?
Take a look at Ragel, and at Mongrel for a real-world use of it. The mail parsing example that ships with Ragel is also a fun, small one to experiment with.
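To give an idea of what a state machine for this looks like, here is a rough hand-written sketch in C. It is not Ragel output (Ragel generates its own, much more compact code from a grammar), and the names `enum verb` and `match_verb` are made up for illustration. It assumes the three verbs from the question and treats the trailing space as part of the match.

```c
#include <stddef.h>

/* Hypothetical result type; only CUT, GET and LOGIN come from the question. */
enum verb { VERB_NONE = 0, VERB_CUT, VERB_GET, VERB_LOGIN };

/* Hand-written DFA: each input byte is inspected exactly once.
 * States 1-3 spell "CUT", 4-6 spell "GET", 7-11 spell "LOGIN";
 * the final state of each chain accepts on the separating space. */
enum verb match_verb(const char *p, size_t len)
{
    int state = 0;
    for (size_t i = 0; i < len; i++) {
        char c = p[i];
        switch (state) {
        case 0:
            if (c == 'C') state = 1;
            else if (c == 'G') state = 4;
            else if (c == 'L') state = 7;
            else return VERB_NONE;
            break;
        case 1:  if (c != 'U') return VERB_NONE; state = 2;  break;
        case 2:  if (c != 'T') return VERB_NONE; state = 3;  break;
        case 3:  return c == ' ' ? VERB_CUT : VERB_NONE;
        case 4:  if (c != 'E') return VERB_NONE; state = 5;  break;
        case 5:  if (c != 'T') return VERB_NONE; state = 6;  break;
        case 6:  return c == ' ' ? VERB_GET : VERB_NONE;
        case 7:  if (c != 'O') return VERB_NONE; state = 8;  break;
        case 8:  if (c != 'G') return VERB_NONE; state = 9;  break;
        case 9:  if (c != 'I') return VERB_NONE; state = 10; break;
        case 10: if (c != 'N') return VERB_NONE; state = 11; break;
        case 11: return c == ' ' ? VERB_LOGIN : VERB_NONE;
        default: return VERB_NONE;
        }
    }
    /* Input ended before the separating space was seen. */
    return VERB_NONE;
}
```

Calling `match_verb("LOGIN bob", 9)` walks states 7 through 11 and accepts on the space. Maintaining this by hand gets tedious as the verb list grows, which is where a generator like Ragel earns its keep.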
That said, depending on your protocol, a check on the first byte alone might get you down to a single subsequent memcmp() to verify that the verb is indeed the expected one: ‘C’, ‘L’ and ‘G’ are all different values.
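A minimal sketch of that first-byte idea in C, assuming the verb list is just CUT, GET and LOGIN and that no two verbs share a first letter. The names `enum msg_type`, `has_verb` and `classify` are hypothetical, not from any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical message types; only CUT, GET and LOGIN come from the question. */
enum msg_type { MSG_UNKNOWN = 0, MSG_CUT, MSG_GET, MSG_LOGIN };

/* Returns 1 if data starts with verb immediately followed by one space. */
static int has_verb(const char *data, size_t len, const char *verb, size_t vlen)
{
    return len > vlen && memcmp(data, verb, vlen) == 0 && data[vlen] == ' ';
}

/* Dispatch on the first byte, then confirm with a single memcmp(). */
enum msg_type classify(const char *data, size_t len)
{
    if (len == 0)
        return MSG_UNKNOWN;
    switch (data[0]) {
    case 'C': return has_verb(data, len, "CUT", 3)   ? MSG_CUT   : MSG_UNKNOWN;
    case 'G': return has_verb(data, len, "GET", 3)   ? MSG_GET   : MSG_UNKNOWN;
    case 'L': return has_verb(data, len, "LOGIN", 5) ? MSG_LOGIN : MSG_UNKNOWN;
    default:  return MSG_UNKNOWN;
    }
}
```

If two verbs ever start with the same letter, that case would need a second check (or a switch on the next byte), at which point the generated state machine starts to look more attractive.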