I’m processing some 3rd party HTML which is semi-structured marked-up text (bold, italics, etc).
Here’s a simplified sample of the structure:
<div>
<strong class="term">one</strong>
-
<b class="defs">
foo
<i class="pos">verb</i>
bar
<i class="pos">noun</i>
baz
<i class="pos">adjective</i>
blah
</b>
<br>
<strong class="term">two</strong>
... etc ...
</div>
In fact I’ve already processed it a bit to get it into this shape. I can handle the HTML elements OK, but I haven’t been able to figure out how to deal with the interleaved text and <i> elements.
I’m happy with a solution that either splits the “defs” around the <i>s or iterates through the parts. I would prefer not to mix jQuery and “raw” DOM API calls due to browser-specific quirks, but I understand if I can’t avoid it. From my shallow knowledge, it seems that jQuery doesn’t support marked-up text as well as it supports “structural” HTML …
Am I missing something obvious? This seems very hard to search for…
It turns out that in the real world data, the text runs and <i> nodes are always interleaved, but the first thing within the defs may be either one, and each text run can consist of one or more actual text nodes. This means that <i>s and text runs are not in matched pairs.
Good solutions might be to either add markup to each text run, or to iterate through, doing one thing for each <i> and another thing for each text run. I’m thinking jQuery.contents() with some node type checking must be the key…
You can do the following to retrieve all the texts into an array:
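A minimal sketch of that idea, assuming jQuery is loaded and the markup looks like the sample in the question. The helper name collectTexts and the stand-in sampleNodes array are hypothetical; they only exist so the snippet also runs without a DOM. In the browser you would feed it the real child nodes via .contents().get().

```javascript
// Hypothetical helper: returns the trimmed text of every child node,
// skipping nodes that contain only whitespace.
// `nodes` is any array of DOM-like nodes exposing `textContent`.
function collectTexts(nodes) {
  return nodes
    .map(function (node) { return (node.textContent || '').trim(); })
    .filter(function (text) { return text.length > 0; });
}

// In the browser (assumes jQuery and the markup from the question):
//   var parts = collectTexts($('b.defs').first().contents().get());

// Stand-in nodes mimicking the children of <b class="defs">:
var sampleNodes = [
  { nodeType: 3, nodeName: '#text', textContent: '\nfoo\n' },
  { nodeType: 1, nodeName: 'I', textContent: 'verb' },
  { nodeType: 3, nodeName: '#text', textContent: '\nbar\n' },
  { nodeType: 1, nodeName: 'I', textContent: 'noun' },
  { nodeType: 3, nodeName: '#text', textContent: '\nbaz\n' },
  { nodeType: 1, nodeName: 'I', textContent: 'adjective' },
  { nodeType: 3, nodeName: '#text', textContent: '\nblah\n' }
];

var parts = collectTexts(sampleNodes);
console.log(parts);
```

The point of going through .contents() rather than .children() is that .contents() also returns text nodes, which is what makes the interleaved text reachable at all.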
Demo: http://jsfiddle.net/joycse06/Z5AgL/
The above code gives you a list of all the defs along with the text nodes and <i> elements.

Update
Yeah, you can do a node type or name check using this.nodeName or this.nodeType inside the map function. The nodeType for a text node is 3. For example, add such a check inside .map() and inspect the result.
So for this specific markup structure you can do the following to check whether it’s an <i> or a text node:
Demo: http://jsfiddle.net/joycse06/Z5AgL/6/
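Since the question mentions that a text run can consist of several adjacent text nodes and that either kind of node may come first, here is a rough sketch of the node type check that also merges adjacent text nodes into one run. The names groupParts, kind, 'pos' and 'text' are made up for illustration, and the stand-in nodes only mimic what $('b.defs').contents().get() would return in a browser.

```javascript
var ELEMENT_NODE = 1; // standard DOM value of Node.ELEMENT_NODE
var TEXT_NODE = 3;    // standard DOM value of Node.TEXT_NODE

// Hypothetical helper: walks the child nodes in order and returns a list of
// { kind: 'pos' | 'text', text: string } parts. Adjacent text nodes are merged
// into a single text run; elements other than <i> are ignored.
function groupParts(nodes) {
  var parts = [];
  nodes.forEach(function (node) {
    var last = parts[parts.length - 1];
    if (node.nodeType === TEXT_NODE) {
      if (last && last.kind === 'text') {
        last.text += node.textContent; // continue the current text run
      } else {
        parts.push({ kind: 'text', text: node.textContent });
      }
    } else if (node.nodeType === ELEMENT_NODE &&
               node.nodeName.toLowerCase() === 'i') {
      parts.push({ kind: 'pos', text: node.textContent });
    }
  });
  // Trim each part and drop runs that were only whitespace.
  return parts
    .map(function (p) { return { kind: p.kind, text: p.text.trim() }; })
    .filter(function (p) { return p.text.length > 0; });
}

// In the browser (assumes jQuery and the markup from the question):
//   var grouped = groupParts($('b.defs').first().contents().get());

// Stand-in nodes: starts with an <i>, and one run is split over two text nodes.
var mixedNodes = [
  { nodeType: 1, nodeName: 'I', textContent: 'verb' },
  { nodeType: 3, nodeName: '#text', textContent: '\nbar' },
  { nodeType: 3, nodeName: '#text', textContent: ' baz\n' },
  { nodeType: 1, nodeName: 'I', textContent: 'noun' },
  { nodeType: 3, nodeName: '#text', textContent: '\nblah\n' }
];

var grouped = groupParts(mixedNodes);
console.log(grouped);
```

With the parts grouped like this, you can loop over the result and do one thing for each 'pos' entry and another for each 'text' entry, without assuming they come in matched pairs.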