Good afternoon,
I am writing a simple lexer, which is basically a modified version of this one. After getting each token I need to make slight modifications to it and re-analyse it to re-check its type. Also, after the lexical analysis I need to re-use the whole token list to do a kind of "parsing" on it. My question is whether using IEnumerable<Token> and yield return statements in the lexer could make the whole program slower. Would it be preferable to use a List<Token>, build the list iteratively, and use a normal return statement? What about iterating through the IEnumerable/List? Which one is faster?
Thank you very much.
You are asking the wrong question; you should be far more worried about the cost of Regex. Enumerating the tokens will be a very small fraction of that. There’s no point in optimizing code that could become twice as fast but would only improve overall program perf by 1%.
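To make the comparison concrete, here is a minimal sketch of the two lexer shapes being discussed. The `Token` class, the token pattern, and the names `LexLazy` and `LexEager` are made up for illustration; the lexer linked in the question is not reproduced here. One thing worth knowing: a method written with `yield return` re-runs its body every time the sequence is enumerated, so if the tokens are needed for several passes, calling `ToList()` once avoids repeating the regex work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical token type; the question's own Token class is not shown.
public sealed class Token
{
    public Token(string type, string text) { Type = type; Text = text; }
    public string Type { get; }
    public string Text { get; }
}

public static class LexerSketch
{
    // Assumed grammar for illustration: numbers, identifiers, single symbols.
    private static readonly Regex TokenRegex = new Regex(
        @"(?<Number>\d+)|(?<Identifier>[A-Za-z_]\w*)|(?<Symbol>[^\s\w])",
        RegexOptions.Compiled);

    // Maps a regex match to a token by checking which named group matched.
    private static Token ToToken(Match m)
    {
        string type = m.Groups["Number"].Success ? "Number"
                    : m.Groups["Identifier"].Success ? "Identifier"
                    : "Symbol";
        return new Token(type, m.Value);
    }

    // Lazy variant: each token is produced only when the caller asks for it.
    public static IEnumerable<Token> LexLazy(string input)
    {
        foreach (Match m in TokenRegex.Matches(input))
            yield return ToToken(m);
    }

    // Eager variant: the whole list is built before the method returns.
    public static List<Token> LexEager(string input)
    {
        var tokens = new List<Token>();
        foreach (Match m in TokenRegex.Matches(input))
            tokens.Add(ToToken(m));
        return tokens;
    }
}

public static class Program
{
    public static void Main()
    {
        // Materialize once so later passes (re-checking, parsing)
        // walk the list instead of re-running the regex.
        List<Token> tokens = LexerSketch.LexLazy("x = 42 + y").ToList();
        foreach (Token t in tokens)
            Console.WriteLine($"{t.Type}: {t.Text}");
    }
}
```

In both variants the regex matching does the real work; the choice between `yield return` and a `List<Token>` mostly decides when that work happens and whether it is repeated.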
Write the code and profile it; then you’ll know what to do for version 2. Given that this kind of tool runs in ‘human time’ (there is no perceptible difference when a program that needs 20 milliseconds takes twice as long), the most likely result is “nothing needs to be done”.
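If no profiler is at hand, a rough first measurement can be taken with `System.Diagnostics.Stopwatch`. The sketch below is only an assumption about how such a check might look: the token pattern, the test input, the iteration count, and the names `LexLazy`, `LexEager` and `Time` are all invented for illustration, and a real profiler or a benchmarking library will give more trustworthy numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

public static class LexerTiming
{
    // Assumed token pattern, for illustration only: words or single symbols.
    private static readonly Regex TokenRegex =
        new Regex(@"\w+|[^\s\w]", RegexOptions.Compiled);

    // Lazy variant using yield return.
    public static IEnumerable<string> LexLazy(string input)
    {
        foreach (Match m in TokenRegex.Matches(input))
            yield return m.Value;
    }

    // Eager variant that builds and returns a List.
    public static List<string> LexEager(string input)
    {
        var tokens = new List<string>();
        foreach (Match m in TokenRegex.Matches(input))
            tokens.Add(m.Value);
        return tokens;
    }

    // Runs the action once as a warm-up (so JIT compilation is not counted),
    // then measures the total time for the given number of iterations.
    public static TimeSpan Time(Action action, int iterations)
    {
        action();
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Made-up input: the same short statement repeated many times.
        string input = string.Join(" ", Enumerable.Repeat("foo = bar + 42 ;", 10_000));
        long sink = 0; // consumed below so the loops do observable work

        TimeSpan lazy = Time(() => { foreach (string t in LexLazy(input)) sink += t.Length; }, 20);
        TimeSpan eager = Time(() => { foreach (string t in LexEager(input)) sink += t.Length; }, 20);

        // Walking an already-built list isolates the pure enumeration cost.
        List<string> built = LexEager(input);
        TimeSpan walk = Time(() => { foreach (string t in built) sink += t.Length; }, 20);

        Console.WriteLine($"lex + enumerate (yield): {lazy.TotalMilliseconds} ms");
        Console.WriteLine($"lex + enumerate (List):  {eager.TotalMilliseconds} ms");
        Console.WriteLine($"enumerate only (List):   {walk.TotalMilliseconds} ms");
        Console.WriteLine($"checksum: {sink}");
    }
}
```

Comparing the third timing with the first two shows how much of the total is enumeration and how much is regex matching, which is the number that decides whether the `IEnumerable`/`List` choice matters at all.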