I'm familiar with writing and reading my own XML files (e.g. for settings), but now I need to read data from a huge XML file and I can't find my starting point.
<span class="mw-headline" id="Kader_der_Saison_2010.2F11.5B51.5D">
Kader der Saison 2010/11
<sup id="cite_ref-50" class="reference">
<a href="#cite_note-50">[51]</a>
</sup>
</span>
</h3>
<table class="wikitable" width="550px">
<tr bgcolor="#DDDDDD">
<th>Name</th>
<th>Trikot</th>
<th>Nationalität</th>
</tr>
<tr bgcolor="#EEEEEE">
<th colspan="3" align="left">Torwart</th>
</tr>
<tr bgcolor="#FFFFFF">
<td>
<a href="/wiki/Manuel_Almunia" title="Manuel Almunia">Manuel Almunia</a>
</td>
<td align="center">1</td>
<td align="center">
<span style="display:none" class="sortkey">Spanien !</span>
<a href="/wiki/Datei:Flag_of_Spain.svg" class="image" title="Spanier">
<img alt="Spanier" src="http://upload.wikimedia.org/wikipedia/commons/thumb/9/9a/Flag_of_Spain.svg/20px-Flag_of_Spain.svg.png" width="20" height="13" class="thumbborder" />
</a>
</td>
</tr>
<tr bgcolor="#FFFFFF">
<td>
<a href="/wiki/%C5%81ukasz_Fabia%C5%84ski" title="Łukasz Fabiański">Łukasz Fabiański</a>
</td>
<td align="center">21</td>
<td align="center">
<span style="display:none" class="sortkey">Polen !</span>
<a href="/wiki/Datei:Flag_of_Poland.svg" class="image" title="Pole">
<img alt="Pole" src="http://upload.wikimedia.org/wikipedia/commons/thumb/1/12/Flag_of_Poland.svg/20px-Flag_of_Poland.svg.png" width="20" height="13" class="thumbborder" />
</a>
</td>
</tr>
As you can (maybe) see, I'm trying to read the names of all team members, starting right after "Kader_der_Saison", straight from Wikipedia.
I need the title or text of elements like this one:
<a href="/wiki/Manuel_Almunia" title="Manuel Almunia">Manuel Almunia</a>
to get the names Manuel Almunia, Łukasz Fabiański, etc.
I've tried a couple of ways: XmlDocument.GetElementById (and lookup by name), XmlReader.NodeType, XmlReader.MoveToNextAttribute, XmlDocument.SelectNodes(xpath), and even a LINQ query on the document, but I never get to the position of the names.
Any ideas how to find the "Kader_der_Saison" position and read the text of the <a> links that follow it?
Thanks
This looks like HTML, not XML. Assuming that is correct, see this question.
If it really is XML (and someone chose really bad tag names), load it into an XmlDocument or XPathDocument and use XPath navigation to select the nodes by name.
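To make the XPath idea a bit more concrete, here is a rough sketch using XPathDocument. It rests on several assumptions that are not confirmed by the question: the input is well-formed XML, it has no default namespace (a real XHTML page usually declares one, in which case the expression would need prefixes and an XmlNamespaceManager), and the player link is always the `<a>` inside the first `<td>` of each row. The class name `SquadReader` and the method `ReadPlayerNames` are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

static class SquadReader
{
    // Hypothetical helper: collects the title attribute of the link in the
    // first cell of every row in the first table after the headline span.
    public static List<string> ReadPlayerNames(TextReader xml)
    {
        // Assumption: the text is well-formed XML without a default namespace.
        var doc = new XPathDocument(xml);
        XPathNavigator nav = doc.CreateNavigator();

        // 1. find the span whose id starts with "Kader_der_Saison"
        // 2. jump to the first table that follows it in document order
        // 3. take the <a> inside the first <td> of each row
        //    (header rows only contain <th>, so they are skipped)
        const string xpath =
            "//span[starts-with(@id, 'Kader_der_Saison')]" +
            "/following::table[1]//tr/td[1]/a";

        var names = new List<string>();
        foreach (XPathNavigator link in nav.Select(xpath))
        {
            // link.Value would give the link text instead of the title
            names.Add(link.GetAttribute("title", ""));
        }
        return names;
    }
}
```

It could be called as `SquadReader.ReadPlayerNames(new StringReader(pageText))`, where `pageText` stands for whatever string holds the downloaded markup. Matching on `starts-with(@id, ...)` rather than the full id avoids depending on the encoded suffix of the headline id.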
I don’t use XPathDocuments much, but with XmlDocument your code might look something like: