I’d like to convert the output of diff (on a Markdown file) to
Markdown with <strike> and <em> tags, so that I can see what has
been removed from or added to a new version of a document. (This kind of
treatment is very common for legal documents.)
Example of hoped-for output:
<em>Why do we</em> <strike>We</strike> study programming languages? <strike>not</strike><em>Not</em> in order to …
One of the many difficulties is that diff’s output is line-oriented,
whereas I want to see differences in individual words. Does anyone have
suggestions as to what algorithm to use, or what software to build on?
Use wdiff. It already does the word-by-word comparison you’re looking for; converting its output to Markdown should take just a few simple regular expressions.
For example:
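A minimal sketch of such a conversion, assuming wdiff's default markers (`[-…-]` around deleted words and `{+…+}` around inserted words). The function name `wdiff_to_markdown` and the file names `old.md` and `new.md` are placeholders for illustration only:

```shell
# Rewrite wdiff's default markers as HTML tags that Markdown passes through.
# Assumes the default wdiff output format: [-deleted-] and {+inserted+}.
wdiff_to_markdown() {
  sed -e 's|\[-|<strike>|g' \
      -e 's|-\]|</strike>|g' \
      -e 's|{+|<em>|g' \
      -e 's|+}|</em>|g'
}

# Hypothetical usage (old.md and new.md are placeholder file names):
#   wdiff old.md new.md | wdiff_to_markdown
```

Given the wdiff line `the [-quick-] {+fast+} brown fox`, this should produce `the <strike>quick</strike> <em>fast</em> brown fox`. One caveat: a document that already contains the literal sequences `[-`, `-]`, `{+`, or `+}` would be rewritten too, so this is probably only safe for ordinary prose.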
Edit: Actually, wdiff has some options that make it even easier:
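A sketch of how those options might be used: `-w` and `-x` set the strings that open and close a deleted region, and `-y` and `-z` do the same for an inserted region (long forms `--start-delete`, `--end-delete`, `--start-insert`, `--end-insert`). The wrapper name `wdiff_markdown` and the file names are placeholders, not part of wdiff:

```shell
# Have wdiff emit the tags directly, so no post-processing is needed.
# $1 is the old version of the file, $2 is the new version.
wdiff_markdown() {
  wdiff -w '<strike>' -x '</strike>' -y '<em>' -z '</em>' "$1" "$2"
}

# Hypothetical usage (old.md and new.md are placeholder file names):
#   wdiff_markdown old.md new.md > changes.md
```

Note that wdiff, like diff, exits with a non-zero status when the files differ, so a script running under `set -e` would need to allow for that.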