I have a folder with 400k+ XML documents, with many more to come. Each file is named ‘ID’.xml, and each belongs to a specific user. In a SQL Server database, the ‘ID’ from the XML file name is matched with a userID, which is how I connect each XML document to its user. A user can have an unlimited number of XML documents attached (but let’s say a maximum of >10k documents).
All XML-documents have a few common elements, but the structure can vary a little.
Now, each user will need to search the XML documents belonging to her, and what I’ve tried so far (looping through each file and reading it with a StreamReader) is too slow. I don’t care whether it reads and matches the whole file, attributes and all, or just the text in each element. What should be returned in the first place is a list of the IDs from the filenames.
What is the fastest and smartest method here, if any?
I think LINQ-to-XML is probably the direction you want to go.
Assuming you know the names of the tags that you want, you would be able to do a search for those particular elements and return the values.
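A minimal sketch of such a query, assuming the document has already been loaded into an XDocument (for example with XDocument.Load). The class name TagQuery and the parameter tagName are placeholders for illustration, and XML namespaces are not handled here:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagQuery
{
    // Returns the text value of every element named tagName, at any depth.
    public static IEnumerable<string> Values(XDocument doc, string tagName)
    {
        // Descendants(name) walks the whole tree and yields only the
        // elements whose name matches.
        var results = from e in doc.Descendants(tagName)
                      select e.Value;
        return results;
    }
}
```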
results would then contain an IEnumerable of the values of any XML tag that has a name matching “tagName”.
The query could also be written like this:
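One possible form, as a sketch, enumerates every element and filters on the name in a where clause. The names TagQueryWhere, doc, and tagName are placeholders, not taken from the original post:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagQueryWhere
{
    // Same result as filtering via Descendants(tagName), but the name
    // check is an explicit where clause.
    public static IEnumerable<string> Values(XDocument doc, string tagName)
    {
        var results = from e in doc.Descendants()
                      where e.Name == tagName   // string converts implicitly to XName
                      select e.Value;
        return results;
    }
}
```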
or this:
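A sketch of the method-syntax equivalent, using the Where and Select extension methods instead of query syntax. As before, TagQueryMethods, doc, and tagName are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagQueryMethods
{
    // Method-syntax version: filter by element name, then project to the text value.
    public static IEnumerable<string> Values(XDocument doc, string tagName)
    {
        var results = doc.Descendants()
                         .Where(e => e.Name == tagName)
                         .Select(e => e.Value);
        return results;
    }
}
```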
The output would be the same; it is just a different way to filter based on the element name.
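To tie this back to the question, here is a rough sketch of returning the IDs of one user's files that contain a search term. It assumes the user's IDs have already been fetched from the database and are passed in as strings, and that a case-insensitive substring match on element text is acceptable. UserXmlSearch, FindMatchingIds, folder, userIds, and term are all hypothetical names. It still loads every one of the user's files, so it only narrows the work to that user's documents rather than avoiding the reads:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class UserXmlSearch
{
    // Returns the IDs (file names without ".xml") whose document contains
    // the term in the text of at least one leaf element.
    public static List<string> FindMatchingIds(
        string folder, IEnumerable<string> userIds, string term)
    {
        var matches = new List<string>();
        foreach (var id in userIds)
        {
            var path = Path.Combine(folder, id + ".xml");
            if (!File.Exists(path)) continue; // skip IDs with no file on disk

            var doc = XDocument.Load(path);
            // Only leaf elements are checked, so parent elements do not
            // match on the concatenated text of their children.
            bool hit = doc.Descendants()
                .Where(e => !e.HasElements)
                .Any(e => e.Value.IndexOf(term, StringComparison.OrdinalIgnoreCase) >= 0);

            if (hit) matches.Add(id);
        }
        return matches;
    }
}
```

A call might look like UserXmlSearch.FindMatchingIds(folder, idsForUser, "invoice"), where idsForUser comes from the userID lookup in SQL Server.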