I have code that uses a StreamReader to read HTML from a file by calling ReadToEnd(). The HTML is stored as a string.
Then I call this line of code:
string bookmarksBar = HTMLDoc.Substring(HTMLDoc.IndexOf(">Bookmarks bar</H3>"), HTMLDoc.IndexOf("</DL><p>"));
I want a particular section of the HTML, so I’m using the string Substring method. The first argument is the startIndex, and the second argument is the length.
I am using the IndexOf method for both arguments so that this line of code returns the section of text between ">Bookmarks bar</H3>" and "</DL><p>".
And so the end of the returned string should be where "</DL><p>" is found, right?
The problem is that the string does not end where "</DL><p>" is found, but 323 characters later, at this line (I have inserted four asterisks to show where the returned string ends):
ICON="data:image/png;base64,iVBORw0KGgoAAA****ANSUhEUgAAABAAAAAQCAYAAAAf8/9hAAABbklEQVQ4je3RPWuTYQCF4fs875uYKEilOA
I can’t make sense of why it’s ending here, since the string does not match "</DL><p>" at this point.
So here is a bigger section of the HTML:
jNpXrXKt4WFgn/KY1J1yBg874KWb0Vmr+BSttzgKt3LuBAAAAAElFTkSuQmCC\"></A>\r\n </DL><p>\r\n <DT><H3 ADD_DATE=\"1282073650\" LAST_MODIFIED=\"1301438557\">Link 1</H3>\r\n <DL><p>\r\n <DT><H3 ADD_DATE=\"1282073650\" LAST_MODIFIED=\"1286905747\">Link2</H3>\r\n <DL><p>\r\n <DT><A HREF=\"http://creators.xna.com/en-GB/create_detail#tour_four\" ADD_DATE=\"1282073650\" ICON=\"data:image/png;base64,iVBORw0KGgoAAA"
You can see the "</DL><p>" in the above HTML, so why doesn’t it stop at that point, instead of stopping at “KGgoAAA”?
Any ideas?
Thanks
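You answered your own question: the second argument of Substring is a length, not an endIndex.
You are passing the index of "</DL><p>" as the length, so the result runs that many characters from the start position and overshoots the marker by exactly the start index (the 323 characters you noticed).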
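To make the difference between a length and an end index concrete, here is a minimal sketch. The class name SubstringDemo and the helper Slice are hypothetical names chosen for illustration; only String.Substring itself comes from the framework.

```csharp
using System;

public static class SubstringDemo
{
    // Hypothetical helper: takes a start index and an (exclusive) end index
    // and converts them into the (startIndex, length) pair Substring expects.
    public static string Slice(string s, int start, int end)
    {
        return s.Substring(start, end - start);
    }

    // What the code in the question effectively does: it passes an index
    // where Substring expects a length.
    public static string IndexUsedAsLength(string s, int start, int end)
    {
        return s.Substring(start, end);
    }
}
```

With the string "0123456789", a start of 2 and an end of 5, Slice stops just before index 5, while IndexUsedAsLength returns five characters and so runs on to index 6.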
Also, the way you’re calling this, the result will begin with the text ">Bookmarks bar</H3>" itself, because the substring starts at the marker rather than after it. Try this:
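A minimal sketch of one way to do it, not necessarily the exact code originally posted here. The class BookmarkText and the method Between are hypothetical names, and returning null when a marker is missing is an assumption about how you want to handle that case.

```csharp
using System;

public static class BookmarkText
{
    // Hypothetical helper: returns the text between startMarker and the
    // first endMarker that follows it, or null if either marker is missing.
    public static string Between(string source, string startMarker, string endMarker)
    {
        int markerIndex = source.IndexOf(startMarker, StringComparison.Ordinal);
        if (markerIndex < 0) return null;

        // Start after the marker so the marker is not part of the result.
        int start = markerIndex + startMarker.Length;

        // Search from 'start' onward so an earlier end marker is ignored.
        int end = source.IndexOf(endMarker, start, StringComparison.Ordinal);
        if (end < 0) return null;

        // Substring takes (startIndex, length), so convert the end index to a length.
        return source.Substring(start, end - start);
    }
}
```

With the variable from the question, the call would be: string bookmarksBar = BookmarkText.Between(HTMLDoc, ">Bookmarks bar</H3>", "</DL><p>");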