I need to wrap each word with a tag (e.g. span) in an HTML document, like:
<html>
<head>
<title>It doesnt matter</title>
</head>
<body>
<div> Text in a div </div>
<div>
Text in a div
<p>
Text inside a p
</p>
</div>
</body>
</html>
The result should be something like this:
<html>
<head>
<title>It doesnt matter</title>
</head>
<body>
<div> <span>Text </span> <span> in </span> <span> a </span> <span> div </span> </div>
<div>
<span>Text </span> <span> in </span> <span> a </span> <span> div </span>
<p>
<span>Text </span> <span> in </span> <span> a </span> <span> p </span>
</p>
</div>
</body>
</html>
It’s important to keep the structure of the body…
Any help?
All three solutions below use the XSLT design pattern of overriding the identity rule: the identity rule preserves the structure and content of the XML document in general, and the overriding templates modify only specific nodes.
I. XSLT 1.0 solution:
This short and simple transformation (no <xsl:choose> used anywhere), when applied to the provided XML document, produces the wanted, correct result.
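The stylesheet for this solution is not shown above. The following is only a minimal sketch of how an XSLT 1.0 transformation along these lines could look, not necessarily the original code. It assumes the input is well-formed XML with elements in no namespace (as in the sample document); the template name wrap-words and the parameter name text are illustrative choices.

```xslt
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <!-- Drop whitespace-only text nodes so they never reach the word template -->
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override (assumed scope: text anywhere under body).
       Emits the first word in a span, then recurses on the remainder. -->
  <xsl:template match="body//text()" name="wrap-words">
    <xsl:param name="text" select="normalize-space(.)"/>
    <xsl:if test="$text">
      <span>
        <!-- The appended space guarantees substring-before finds a delimiter -->
        <xsl:value-of select="substring-before(concat($text, ' '), ' ')"/>
      </span>
      <xsl:call-template name="wrap-words">
        <xsl:with-param name="text" select="substring-after($text, ' ')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```

Because the override only matches text nodes below body, the title in head passes through the identity rule untouched. If the document were XHTML in the XHTML namespace, the match pattern would need a namespace prefix.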
II. XSLT 2.0 solution:
When this transformation is applied to the same XML document (above), the same correct result is produced.
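The XSLT 2.0 stylesheet is likewise not shown above. As a hedged sketch (again, not necessarily the original code), the recursion needed in XSLT 1.0 can be replaced by the standard tokenize() function. The same assumptions apply: well-formed, no-namespace input, with words separated by whitespace.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override: wrap each whitespace-separated word under body in a span.
       normalize-space() collapses runs of whitespace to single spaces,
       so splitting on a single space yields no empty tokens. -->
  <xsl:template match="body//text()">
    <xsl:for-each select="tokenize(normalize-space(.), ' ')">
      <span><xsl:value-of select="."/></span>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```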
III. Solution using FXSL:
Using the str-split-to-words template/function of FXSL, one can easily implement much more complicated tokenization, in any version of XSLT. Consider a more complicated XML document and tokenization rules.
Here there is more than one delimiter that indicates the start or end of a word. In this particular example the delimiters can be:
" ", ";", ".", ":", "-", "[", "]".
The following transformation uses FXSL for this more complicated tokenization and produces the wanted, correct result.
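The FXSL transformation and its input are not shown above, and the sketch below does not use FXSL. It only illustrates the same multi-delimiter idea with the standard XSLT 2.0 tokenize() function, so unlike the FXSL approach it does not work in XSLT 1.0. The delimiter set is the one listed above (with any whitespace treated as the space delimiter), and the body//text() scope is an assumption carried over from the original sample document.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity rule: copy every node and attribute unchanged by default -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override: split on one or more of whitespace ; . : - [ ]
       The predicate drops the empty tokens that appear when the text
       starts or ends with a delimiter. -->
  <xsl:template match="body//text()">
    <xsl:for-each select="tokenize(., '[\s;.:\-\[\]]+')[. ne '']">
      <span><xsl:value-of select="."/></span>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Note that this variant discards the delimiters themselves; whether punctuation should be kept in the output depends on the tokenization rules actually required.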