I’ve seen there are a lot of posts about XML comparison, but none of

Question

Asked: May 31, 2026

I’ve seen there are a lot of posts about XML comparison, but none of the ones I’ve looked at solve my problem.

We have some XML-formatted text documents (product descriptions, with headings and paragraphs) that are being updated (i.e. versioned), and I’ve been tasked with making change digests. That is, we want to take two consecutive versions of a file and generate a third: the heading structure (outline) is to be preserved, but only paragraphs with changes are to be kept, with additions as well as deletions marked up.

So I’ve been trying to find a way to walk both DOM trees and detect additions and deletions, but I’m running into problems detecting them reliably. That’s obviously because I should be doing a diff, but I can’t use a plain diff: I want to do individual diffs inside each element, and I need a fully formatted XML digest rather than a traditional diff result.

Any hints before I try to tackle the “Longest common subsequence problem”, which is going to be a huge task?

1 Answer

Editorial Team · Answer 1 · 2026-05-31T01:47:23+00:00

It turns out there was no existing solution for my need at the time. In the meantime, I developed my own XML diff routine that is specific to my problem, so I ended up with a working solution.

Then, in late 2011, this was published on Slashdot: “Researchers Expanding Diff, Grep Unix Tools”

Dartmouth computer scientists presented variants of the grep and diff Unix command line utilities that can handle more complex types of data. The new programs, called Context-Free Grep and Hierarchical Diff, will provide the ability to parse blocks of data rather than single lines. The research has been funded in part by Google and the U.S. Energy Department.
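
For anyone landing here with the same task, here is a minimal sketch of the kind of digest described in the question. It is not the routine mentioned above (that one isn't shown) and it has nothing to do with the Dartmouth tools. It leans on Python's standard `difflib.SequenceMatcher` for the longest-common-subsequence-style matching, so you don't have to write that part by hand. The schema is an assumption: a flat root whose children are `h1`/`h2`/`h3` headings and `p` paragraphs with text only. The `digest`, `ins` and `del` element names, and the helper names `flatten`, `add_marked`, `add_word_diff` and `make_digest`, are made up for illustration.

```python
import difflib
import xml.etree.ElementTree as ET

HEADINGS = {"h1", "h2", "h3"}  # assumed heading tags; adjust to the real schema


def flatten(root):
    """Return the root's children as hashable (tag, normalized text) pairs."""
    return [(el.tag, " ".join((el.text or "").split())) for el in root]


def add_marked(parent, tag, text, mark):
    """Append <tag><mark>text</mark></tag> for a wholly added or removed element."""
    ET.SubElement(ET.SubElement(parent, tag), mark).text = text


def add_word_diff(parent, tag, old, new):
    """Append <tag> containing a word-level diff of old vs. new text."""
    el = ET.SubElement(parent, tag)
    a, b = old.split(), new.split()
    last = None  # most recent <del>/<ins> child; plain text goes into its tail
    for op, i1, i2, j1, j2 in difflib.SequenceMatcher(None, a, b).get_opcodes():
        if op == "equal":
            text = " " + " ".join(a[i1:i2]) + " "
            if last is None:
                el.text = text.lstrip()
            else:
                last.tail = text
            continue
        if op in ("delete", "replace"):
            last = ET.SubElement(el, "del")
            last.text = " ".join(a[i1:i2])
        if op in ("insert", "replace"):
            last = ET.SubElement(el, "ins")
            last.text = " ".join(b[j1:j2])


def make_digest(old_root, new_root):
    """Build a digest: all unchanged headings, plus only the changed elements."""
    a, b = flatten(old_root), flatten(new_root)
    out = ET.Element("digest")
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op == "equal":
            for tag, text in b[j1:j2]:
                if tag in HEADINGS:  # keep the outline, drop unchanged paragraphs
                    ET.SubElement(out, tag).text = text
            continue
        old, new = a[i1:i2], b[j1:j2]
        pairs = list(zip(old, new))  # positional pairing inside a changed run
        for (otag, otext), (ntag, ntext) in pairs:
            if otag == ntag:
                add_word_diff(out, ntag, otext, ntext)
            else:
                add_marked(out, otag, otext, "del")
                add_marked(out, ntag, ntext, "ins")
        for tag, text in old[len(pairs):]:  # leftovers only in the old version
            add_marked(out, tag, text, "del")
        for tag, text in new[len(pairs):]:  # leftovers only in the new version
            add_marked(out, tag, text, "ins")
    return out


old_doc = ET.fromstring(
    "<doc><h1>Widget</h1><p>It is red.</p><p>It is small.</p>"
    "<h1>Care</h1><p>Keep dry.</p></doc>"
)
new_doc = ET.fromstring(
    "<doc><h1>Widget</h1><p>It is blue.</p><p>It is small.</p>"
    "<h1>Care</h1><p>Keep dry.</p><p>Avoid heat.</p></doc>"
)
print(ET.tostring(make_digest(old_doc, new_doc), encoding="unicode"))
```

With the two sample documents, the digest keeps both headings, shows the first paragraph with `red.` in `del` and `blue.` in `ins`, drops the two unchanged paragraphs, and adds `Avoid heat.` as a fully inserted paragraph. The positional pairing inside a changed run is the crude part: if paragraphs are inserted and edited in the same run, a real implementation would probably want to pair them by similarity (for example with `SequenceMatcher.ratio()`), and nested sections or inline markup inside paragraphs would need a recursive walk instead of `flatten`.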
