I have a huge file that I must parse line by line. Speed is of the essence.
Example of a line:
Token-1 Here-is-the-Next-Token Last-Token-on-Line ^ ^ Current Position Position after GetToken
GetToken is called, returning ‘Here-is-the-Next-Token’ and sets the CurrentPosition to the position of the last character of the token so that it is ready for the next call to GetToken. Tokens are separated by one or more spaces.
Assume the file is already in a StringList in memory. It fits in memory easily, say 200 MB.
I am worried only about the execution time for the parsing. What code will produce the absolute fastest execution in Delphi (Pascal)?
Here’s a sample lexer that should be pretty efficient, but it assumes that all source data is in a single string. Reworking it to handle buffers is moderately tricky due to very long tokens.