I’m faced with a slightly inconvenient ‘lag’ when I attempt to populate a div created in JavaScript:
var el = document.createElement("div");
el.innerHTML = '<insert string-HTML code here>'
However, this is natural given the extent of the HTML code: sometimes it’s more than 300,000 characters long, and it comes from GM_xmlhttpRequest, which sometimes takes 1000ms (give or take) to complete, plus the additional 500ms caused by the DOM-ification.
I have attempted to get rid of the massive amount of text using substr (granted, not the best idea that could’ve occurred to me), and it surprisingly worked for the most part, but at certain times the element would fail to accept the HTML code (probably because of unmatched <*.?> tags).
I only need to access an extremely small amount of the text that’s stored inside; regexp is, per bobince, out of the question, so I figured this would be the best approach.
EDIT: I should mention that my description of parsing the DOM was understated: by ‘text’ I meant the textContent of quite a few elements, which I modify. Therefore, regexp isn’t an option.
While other answers focus on guessing whether your desire (parsing the DOM without string manipulation) makes sense, I will dedicate this answer to a comparison of reasonable DOM parsing methods.
For a fair comparison, I assume that we need the <body> element (as root container) for the parsed DOM. I have created a benchmark at http://jsperf.com/domparser-vs-innerhtml-vs-createhtmldocument.
The first method (assigning to innerHTML) is your current one. It is well-supported across all browsers.
Even though the second method (document.implementation.createHTMLDocument) has the overhead of creating a full document, it has a big benefit over the first one: resources (images) are not loaded. The overhead of creating the document is marginal compared to the potential network traffic of the first method.
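As a rough illustration of the second method, a minimal sketch might look like the following. The function name parseWithCreateHTMLDocument and the optional doc parameter are made up for this example (the parameter only exists so the function can be tried outside a browser); neither is taken from the benchmark.

```javascript
// Hypothetical helper: parse an HTML string inside a separate document.
// `doc` is an assumption of this sketch; it defaults to the page's document.
function parseWithCreateHTMLDocument(html, doc) {
  doc = doc || document;
  // Create a new, empty HTML document that is not attached to the page.
  var parsedDoc = doc.implementation.createHTMLDocument("");
  // Let the browser parse the string into that document's <body>.
  parsedDoc.body.innerHTML = html;
  // Return the <body> element as the root container.
  return parsedDoc.body;
}
```

The returned element can then be queried with the usual DOM methods, for example querySelectorAll.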
The last method (DOMParser with the text/html type) is, as of writing, only supported in Firefox 12+ (no problem, since you’re writing a Greasemonkey script), and is the specific tool for this job (with the same advantages as the previous method). As its name implies, it is a DOM parser.
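A minimal sketch of the DOMParser approach, assuming the browser accepts the text/html type. The function name parseWithDOMParser and the optional Parser parameter are hypothetical and only serve this example.

```javascript
// Hypothetical helper: parse an HTML string with DOMParser.
// `Parser` is an assumption of this sketch; it defaults to the built-in DOMParser.
function parseWithDOMParser(html, Parser) {
  Parser = Parser || DOMParser;
  // parseFromString with "text/html" returns a full HTML document.
  var parsedDoc = new Parser().parseFromString(html, "text/html");
  // Return the <body> element as the root container.
  return parsedDoc.body;
}
```

In a browser without text/html support in DOMParser, this sketch will not produce a usable HTML document, so one of the other two methods would be needed as a fallback.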
The benchmark shows that the original method is the fastest at 4.64 Ops/s, followed by the DOMParser method at 4.22 Ops/s. The slowest is the createHTMLDocument method at 3.72 Ops/s. The differences are minimal though, so I definitely recommend the DOMParser for the reasons stated earlier.
I know that you’re using GM_xmlhttpRequest to fetch data. However, if you’re able to use XMLHttpRequest instead, I suggest giving the following method a try: instead of getting plain text as the response, you can get a document as the response by setting responseType to "document".
If the Greasemonkey script is active for a long time on a single page, you can still use this feature for other domains which do not support CORS: insert an iframe in the document whose domain is equal to the other domain (e.g. http://example.com/favicon.ico), and use it as a proxy (activate the GM script for this page as well). The overhead of inserting an iframe is significant, so this option is not viable for one-time requests.
For same-origin requests, this option may be the best one (although it is not benchmarked, one can argue that returning a document directly instead of going through intermediate string manipulation offers performance benefits). Unlike the DOMParser + text/html method, responseType="document" is supported by more browsers: Chrome 18+, Firefox 11+ and IE 10+.
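The XMLHttpRequest variant mentioned above could be sketched roughly as follows. The function name fetchDocument, its callback parameters and the optional XHR parameter are assumptions made for this example, and the request is assumed to be same-origin (or otherwise permitted).

```javascript
// Hypothetical helper: request a URL and receive a parsed document.
// `XHR` is an assumption of this sketch; it defaults to the built-in XMLHttpRequest.
function fetchDocument(url, onDocument, onError, XHR) {
  XHR = XHR || XMLHttpRequest;
  var xhr = new XHR();
  xhr.open("GET", url);
  // Ask the browser for a parsed document instead of plain text.
  xhr.responseType = "document";
  xhr.onload = function () {
    // xhr.response holds the parsed document (it may be null on failure).
    onDocument(xhr.response);
  };
  xhr.onerror = onError;
  xhr.send();
  return xhr;
}
```

A caller would pass a function that reads what it needs from the document, for example the textContent of a few selected elements.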