I have a 9 MB XML file that is apparently broken.
I want to check whether any level 2 sibling elements have an “Id” attribute with the same value.
Currently it runs too slowly. What optimizations could I make to this code?
Edited to include some tips
namespace ConsoleApplication1
{
    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;
    using System.Xml.Linq;

    internal class Program
    {
        private const string _pathToXml = @"C:\4\4";
        private static readonly List<object> _duplicateLeafs = new List<object>();

        private static void Main()
        {
            var xml = ReadXml();
            var elements = xml.Descendants();
            foreach (var element in elements)
                FindDupes(element);
            Console.ReadLine();
            Debugger.Break();
        }

        private static XDocument ReadXml()
        {
            return XDocument.Parse(File.ReadAllText(_pathToXml));
        }

        private static void FindDupes(XElement element)
        {
            var elements = element.Descendants();
            var elementsWithIds = elements.Where(x => x.Attribute("Id") != null);
            var ids = elementsWithIds.Select(x => x.Attribute("Id")).ToList();
            for (var i = 0; i < ids.Count; i++)
                // j starts at i + 1, so i and j never refer to the same attribute.
                for (var j = i + 1; j < ids.Count; j++)
                    // Compare the attribute values; == on XAttribute objects only compares references.
                    if (ids[i].Value == ids[j].Value)
                        _duplicateLeafs.Add(elementsWithIds.First(x => x.Attribute("Id") == ids[i]));
            foreach (var subElement in elements)
                FindDupes(subElement);
        }
    }
}
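You say you want to check level 2 descendants, yet FindDupes is recursive: Main already calls it for every element in the document, and each call scans all of that element’s descendants and then calls itself on each of them, so the same subtrees are rescanned over and over.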
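One way to avoid the repeated scanning is to visit each parent once and group only its direct children by their Id value. The following is a minimal sketch, assuming the requirement means "siblings under the same parent that share an Id"; the class name `SiblingIdChecker` and the method name `FindDuplicateSiblings` are hypothetical and not part of the original code.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper; the names are illustrative only.
public static class SiblingIdChecker
{
    // Returns every element that shares its "Id" value with at least one sibling.
    public static List<XElement> FindDuplicateSiblings(XDocument document)
    {
        var duplicates = new List<XElement>();

        // Descendants() yields every element once, so each parent is visited once
        // and no recursion is needed.
        foreach (var parent in document.Descendants())
        {
            // Elements() returns direct children only, i.e. one set of siblings.
            var duplicateGroups = parent.Elements()
                .Where(child => child.Attribute("Id") != null)
                .GroupBy(child => child.Attribute("Id").Value)
                .Where(group => group.Count() > 1);

            foreach (var group in duplicateGroups)
                duplicates.AddRange(group);
        }

        return duplicates;
    }
}
```

Calling `SiblingIdChecker.FindDuplicateSiblings(XDocument.Load(path))` would return the offending elements. Because every element is examined only as a child of its own parent, the work grows roughly linearly with the size of the document rather than with the number of nested descendant scans.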