I’m trying to deserialize an XML sitemap and then load it into a DataGridView, but I’m running into some issues. For example, when I plug the following URL into my code:
http://www.allfancydress.com/googlesitemap.aspx
it runs fine and I get the desired result. However, trying a different URL:
http://store.cascadepools.co.uk/sitemap.aspx
fails with the following error:
XmlException was unhandled: An error occurred while parsing EntityName. Line 408, position 142.
The code I am calling is as follows:
XmlReader reader;
XmlReaderSettings settings = new XmlReaderSettings();
settings.XmlResolver = null;
settings.DtdProcessing = DtdProcessing.Ignore;
settings.CheckCharacters = false;
reader = XmlReader.Create(tbGoogleSiteMap.Text,settings);
DataSet ds = new DataSet();
ds.ReadXml(reader);
Does anyone have any ideas?
Thanks
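I think the answer you are looking for is very straightforward, and you have just overlooked it by getting bogged down in code (we all do it!). Recheck the two URLs you posted. What do you notice?
The first returns an XML sitemap; the second returns an ordinary ASP.NET web page, which I assume is definitely not what you meant to scrape?! An HTML page is not guaranteed to be well-formed XML, and an "error occurred while parsing EntityName" usually means the parser hit an & that does not start a valid entity reference, which is common in HTML links.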
I had a look at the robots.txt of store.cascadepools.co.uk, and the correct URL for the XML version of your sitemap appears to be http://store.cascadepools.co.uk/feeds/CascadePoolsStore_gs.xml (at least that’s what you’re telling robots it is!). I bet if you run that URL through your code it will respond as you intend 😉
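If you want the app to cope with someone pasting the wrong kind of URL, here is a minimal sketch of one way to do it: wrap the load in a try/catch so a page that isn't well-formed XML gives you a message instead of an unhandled exception. SitemapLoader and TryLoad are names I made up for illustration; the reader settings are copied from your code, and I've used a TextReader overload so it can be tried against a string.

```csharp
using System;
using System.Data;
using System.IO;
using System.Xml;

// Hypothetical helper (not part of the original code): loads XML into a DataSet,
// returning null and an error message if the input is not well-formed XML.
static class SitemapLoader
{
    public static DataSet TryLoad(TextReader source, out string error)
    {
        // Same settings as in the question.
        var settings = new XmlReaderSettings
        {
            XmlResolver = null,
            DtdProcessing = DtdProcessing.Ignore,
            CheckCharacters = false
        };

        try
        {
            using (XmlReader reader = XmlReader.Create(source, settings))
            {
                var ds = new DataSet();
                ds.ReadXml(reader);
                error = null;
                return ds;
            }
        }
        catch (XmlException ex)
        {
            // Typical when the input is an HTML page rather than an XML sitemap.
            error = "Not well-formed XML (line " + ex.LineNumber +
                    ", position " + ex.LinePosition + "): " + ex.Message;
            return null;
        }
    }
}
```

In your form you could pass XmlReader.Create(tbGoogleSiteMap.Text, settings) the same way inside the try block and show the error text to the user when the load fails.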
Hope that helps?