How can I write a LINQ query for this?
I have two XML documents, doc1.xml and doc2.xml. For each “file” element in doc1, I want to find the cases where doc2 has a “file” element with exactly the same “path” attribute, but some “link” child element of that “file” in doc1 has an “absolutepath” attribute that does NOT match any “absolutepath” attribute in the corresponding “file” element in doc2.
Simple example:
doc1:
<doc>
<file path="c:\temp\A.xml">
<link absolutepath="c:\temp\B.xml"/>
<link absolutepath="c:\temp\C.xml"/>
</file>
<file path="c:\temp\C.xml"> <!--This should match, because its child link absolutepath is not the same as the child link absolutepath of the corresponding file with the same path in doc2-->
<link absolutepath="c:\temp\D.xml"/>
<link absolutepath="c:\temp\F.xml"/>
</file>
</doc>
doc2:
<doc>
<file path="c:\temp\A.xml">
<link absolutepath="c:\temp\B.xml"/>
<link absolutepath="c:\temp\C.xml"/>
</file>
<file path="c:\temp\C.xml">
<link absolutepath="c:\temp\D.xml"/>
<link absolutepath="c:\temp\E.xml"/>
</file>
</doc>
Any ideas?
EDIT: I edited the example XML to show what I mean by multiple links for each file element. What I want is each file in doc1 that has a link element with an absolutepath that is not found in any link element of the corresponding file in doc2. Both documents actually contain the same number of links, but the absolutepath sometimes differs, and I want to find and extract the files where the link elements differ in this way.
Here’s my attempt to modify the query suggested by Jon, to extract multiple links, but I think I’m doing it wrong, because I don’t get the correct result from the Except query afterwards:
var files = from file in doc1.Descendants("file")
            select new
            {
                file = file.Attribute("path").Value,
                link = file.Elements("link").Attributes("absolutepath")
            };
var oldfiles = from file in doc2.Descendants("file")
               from link in file.Elements("link")
               select new
               {
                   file = file.Attribute("path").Value,
                   link = file.Elements("link").Attributes("absolutepath")
               };
//Get the ones that are different between them
var missing = files.Except(oldfiles);
Well, I would start with the XML part. I originally made this more complicated than it needs to be, but I think you can just use:
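A minimal sketch of what such a query could look like, projecting each document into flat (path, link) pairs. The names `LinkQueries` and `Pairs` are hypothetical, and the sketch assumes C# 7 or later for value tuples; anonymous types, as used in the question, would compare by value in the same way inside a single method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LinkQueries
{
    // Hypothetical helper: one (Path, Link) pair per <link> element,
    // so a file with several links yields several pairs.
    public static IEnumerable<(string Path, string Link)> Pairs(XDocument doc) =>
        from file in doc.Descendants("file")
        from link in file.Elements("link")
        select (Path: (string)file.Attribute("path"),
                Link: (string)link.Attribute("absolutepath"));
}

class Program
{
    static void Main()
    {
        // Sample data modelled on doc1 from the question.
        var doc1 = XDocument.Parse(@"<doc>
  <file path=""c:\temp\A.xml"">
    <link absolutepath=""c:\temp\B.xml""/>
    <link absolutepath=""c:\temp\C.xml""/>
  </file>
</doc>");

        foreach (var pair in LinkQueries.Pairs(doc1))
            Console.WriteLine(pair.Path + " -> " + pair.Link);
    }
}
```

Projecting the attributes to plain strings matters here: set operations such as `Except` rely on value equality, and a member of type `IEnumerable<XAttribute>` would only compare by reference.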
Then if you have files1 and files2 (the above query applied to each document) you can just do:
EDIT: To get back to the link elements for those files, you can use:
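As a rough sketch of both steps, the following takes the set difference with `Except` and then queries doc1 a second time to recover the matching link elements. `LinkDiff`, `Pairs` and `ChangedLinks` are hypothetical names, and value tuples (C# 7 or later) are an assumption made so the pairs can be returned from a method. Note that `Except` alone also reports files that exist only in doc1.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class LinkDiff
{
    // One (path, absolutepath) pair per <link> element.
    public static IEnumerable<(string Path, string Link)> Pairs(XDocument doc) =>
        from file in doc.Descendants("file")
        from link in file.Elements("link")
        select (Path: (string)file.Attribute("path"),
                Link: (string)link.Attribute("absolutepath"));

    // <link> elements of doc1 whose (path, absolutepath) pair is absent from doc2.
    public static IEnumerable<XElement> ChangedLinks(XDocument doc1, XDocument doc2)
    {
        // Tuples compare by value, so Except gives the pairs only doc1 has.
        var missing = new HashSet<(string, string)>(Pairs(doc1).Except(Pairs(doc2)));

        // Query doc1 again to get from the pairs back to the elements.
        return from file in doc1.Descendants("file")
               from link in file.Elements("link")
               let pair = ((string)file.Attribute("path"),
                           (string)link.Attribute("absolutepath"))
               where missing.Contains(pair)
               select link;
    }
}

class Program
{
    static void Main()
    {
        var doc1 = XDocument.Parse(@"<doc>
  <file path=""c:\temp\A.xml""><link absolutepath=""c:\temp\B.xml""/><link absolutepath=""c:\temp\C.xml""/></file>
  <file path=""c:\temp\C.xml""><link absolutepath=""c:\temp\D.xml""/><link absolutepath=""c:\temp\F.xml""/></file>
</doc>");
        var doc2 = XDocument.Parse(@"<doc>
  <file path=""c:\temp\A.xml""><link absolutepath=""c:\temp\B.xml""/><link absolutepath=""c:\temp\C.xml""/></file>
  <file path=""c:\temp\C.xml""><link absolutepath=""c:\temp\D.xml""/><link absolutepath=""c:\temp\E.xml""/></file>
</doc>");

        foreach (var link in LinkDiff.ChangedLinks(doc1, doc2))
        {
            var path = (string)link.Parent.Attribute("path");
            var target = (string)link.Attribute("absolutepath");
            Console.WriteLine(path + ": " + target);  // prints c:\temp\C.xml: c:\temp\F.xml
        }
    }
}
```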
It’s a bit of a shame to query the document again, but there we go…
(I’ve selected the link element rather than the file element so you can get to exactly the right bit – you can always choose the parent element to get to the file.)
EDIT: Okay, if there are multiple link elements and you just want to find files with missing elements, that’s actually pretty easy from what we’ve got:
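One way this could look, as a sketch rather than a definitive implementation: take the (path, link) pairs that only doc1 has, keep the path, and remove duplicates. `FileDiff`, `Pairs` and `FilesWithChangedLinks` are hypothetical names; the extra filter on paths present in doc2 is an assumption added to honour the question's "same path in both documents" condition, and value tuples assume C# 7 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

static class FileDiff
{
    // One (path, absolutepath) pair per <link> element.
    static IEnumerable<(string Path, string Link)> Pairs(XDocument doc) =>
        from file in doc.Descendants("file")
        from link in file.Elements("link")
        select (Path: (string)file.Attribute("path"),
                Link: (string)link.Attribute("absolutepath"));

    // Paths of files present in both documents where doc1 has at least
    // one link whose absolutepath does not occur for that file in doc2.
    public static IEnumerable<string> FilesWithChangedLinks(XDocument doc1, XDocument doc2)
    {
        var doc2Paths = new HashSet<string>(
            doc2.Descendants("file").Select(f => (string)f.Attribute("path")));

        return Pairs(doc1).Except(Pairs(doc2))
                          .Select(pair => pair.Path)
                          .Where(path => doc2Paths.Contains(path))
                          .Distinct();
    }
}

class Program
{
    static void Main()
    {
        var doc1 = XDocument.Parse(@"<doc>
  <file path=""c:\temp\A.xml""><link absolutepath=""c:\temp\B.xml""/><link absolutepath=""c:\temp\C.xml""/></file>
  <file path=""c:\temp\C.xml""><link absolutepath=""c:\temp\D.xml""/><link absolutepath=""c:\temp\F.xml""/></file>
</doc>");
        var doc2 = XDocument.Parse(@"<doc>
  <file path=""c:\temp\A.xml""><link absolutepath=""c:\temp\B.xml""/><link absolutepath=""c:\temp\C.xml""/></file>
  <file path=""c:\temp\C.xml""><link absolutepath=""c:\temp\D.xml""/><link absolutepath=""c:\temp\E.xml""/></file>
</doc>");

        foreach (var path in FileDiff.FilesWithChangedLinks(doc1, doc2))
            Console.WriteLine(path);  // prints c:\temp\C.xml
    }
}
```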