I started with this: http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#Ruby
It works great for really small strings. But my strings can be upwards of 10,000 characters long, and since that implementation is recursive, it causes a "stack level too deep" error in my Ruby on Rails app.
So, is there another, maybe less stack-intensive, method of finding the similarity between two large strings?
Alternatively, I’d need a way to make the stack much larger. (I don’t think this is the right way to solve the problem, though.)
Consider a non-recursive version to avoid the excessive call stack overhead. Seth Schroeder has an iterative implementation in Ruby which uses multi-dimensional arrays instead; it appears to be related to the dynamic programming approach for Levenshtein distance (as outlined in the pseudocode in the Wikipedia article). Seth’s Ruby code is reproduced below: