While reading this paper, I came across the term “symbol table”. At first I thought it was just another word for a dictionary, but I grow less certain as I try to implement the diff algorithm described in the paper.
I have scoured the internet for an intelligible explanation, but I have come up short.
Could someone explain what is meant by a symbol table in the paper, and perhaps offer a basic implementation of it (the data structure, not the algorithm) in Python? The relevant description in the paper is under heading 3, “The Algorithm”.
John Resig (@john-resig) offers an implementation of the algorithm in JavaScript, but my proficiency in JavaScript is too limited for me to use his implementation to wrap my head around the data structure.
A “symbol table” is just what the name implies, a table of symbols. It’s often implemented as an associative table, like a Python dictionary. Symbol tables are common in e.g. compilers, where you have to map things like variable and function names to their internal structures.
In the paper you link to, the symbol table (dictionary) is keyed by the text of a line, and the value stored under each key is a pair of counters.
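As a rough sketch of just the data structure, something like the following should do. It assumes the two counters record how many times a line occurs in the old file and in the new file; the names `Entry`, `old_count`, `new_count` and `build_symbol_table` are made up for this example and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    # Assumed meaning of the two counters: occurrences of the line
    # in the old file and in the new file.
    old_count: int = 0
    new_count: int = 0


def build_symbol_table(old_lines, new_lines):
    """Map each distinct line of text to its pair of counters."""
    table = {}  # a plain dict serves as the symbol table
    for line in new_lines:
        # setdefault creates the entry the first time a line is seen
        table.setdefault(line, Entry()).new_count += 1
    for line in old_lines:
        table.setdefault(line, Entry()).old_count += 1
    return table
```

Looking up a line is then an ordinary dictionary access, e.g. `table[line].old_count`. If the algorithm needs more per-line data later, it can be added as extra fields on `Entry` without changing how the table is keyed.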