I’m working on a page that needs to fetch information from other pages and then display parts of that data on the current page.
I have the HTML source that I need to parse in a string. I’m looking for a library that can help me do this easily. (I just need to extract specific tags and the text they contain.)
The HTML is well formed (all closing tags are present).
I’ve looked at some options, but they have all been extremely difficult to work with for various reasons.
I’ve tried the following solutions:
- jkl-parsexml library (The library js file itself throws up HTTPError 101)
- jQuery.parseXML Utility (Didn’t find much documentation/many examples to figure out what to do)
- XPath (The Execute statement is not working, but the JS Error Console shows no errors)
So I’m looking for a more user-friendly library, or anything (tutorials, books, references, documentation) that would let me use the tools above more easily and efficiently.
An ideal solution would be something like BeautifulSoup, which is available in Python.
Using jQuery, it would be as simple as
$(HTMLstring);
to create a jQuery object with the HTML data from the string inside it (this DOM would be disconnected from your document). From there it’s very easy to do whatever you want with it, and traversing the loaded data is, of course, a cinch with jQuery.
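Here is a rough sketch of what that could look like in practice. It assumes jQuery 1.8 or later (for jQuery.parseHTML); the helper name extractText and the sample markup are made up for illustration. The parsed nodes are wrapped in a detached div because .find() only searches descendants, so elements at the top level of the string would otherwise be missed.

```javascript
// Hypothetical helper: collect the text of every element matching `selector`
// inside an HTML string. `jq` is the jQuery function (normally the global `$`);
// it is passed in only so the helper does not rely on a global.
function extractText(jq, htmlString, selector) {
  var results = [];
  // Parse the string into detached DOM nodes and wrap them in a <div>
  // so that .find() also sees the top-level elements of the string.
  jq('<div>')
    .append(jq.parseHTML(htmlString))
    .find(selector)
    .each(function () {
      // `this` is the matched DOM element
      results.push(jq(this).text());
    });
  return results;
}
```

In a page with jQuery loaded, calling extractText($, '<h2>Title</h2><p>Body</p>', 'h2') should give back an array holding the single string "Title". Nothing here is attached to the live document unless you append it yourself.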