Say I had some XML that I wanted to convert to HTML. The XML is divided into ordered sections:
<?xml version="1.0" encoding="utf-8"?>
<root>
<section attr="someCriteria">
<h1>Title 1</h1>
<p>paragraph 1-1</p>
<p>paragraph 1-2</p>
</section>
<section attr="someOtherCriteria">
<h3>Subtitle 2</h3>
<ul>
<li>list item 2-1</li>
<li>list item 2-2</li>
<li>list item 2-3</li>
<li>list item 2-4</li>
</ul>
</section>
<section attr="anotherSetOfCriteria">
<warning>
Warning: This product could kill you
</warning>
</section>
<section attr="evenMoreCriteria">
<disclaimer>
You were warned
</disclaimer>
</section>
<section attr="criteriaSupreme">
<p>Copyright 1999-2011</p>
</section>
</root>
I have several of these XML documents.
I need to group and transform these sections based on criteria. There will be two different kinds of buckets.
- So the first section will go in a bucket (e.g. <div class="FormatOne"></div>)
- If the second section meets the criteria to qualify for the “FormatOne” bucket it will also go in this bucket
- If the third section requires a different bucket (e.g. <div class="FormatTwo"></div>) then a new bucket is created and section contents are placed in this bucket
- If the bucket for the fourth section requires “FormatOne” (which is different than the previous format) then a new bucket is created again and section contents are placed in this bucket
- etc.
Each section would go into the same bucket as the previous section if they are the same format. If not, a new bucket is created.
So for each document, depending on the logic for separating buckets, the document may end up like this:
<body>
<div class="FormatOne">
<h1>Title 1</h1>
<p>paragraph 1-1</p>
<p>paragraph 1-2</p>
<h3>Subtitle 2</h3>
<ul>
<li>list item 2-1</li>
<li>list item 2-2</li>
<li>list item 2-3</li>
<li>list item 2-4</li>
</ul>
</div>
<div class="FormatTwo">
<span class="warningText">
Warning: This product could kill you
</span>
</div>
<div class="FormatOne">
<span class="disclaimerText"> You were warned</span>
<p class="copyright">Copyright 1999-2011</p>
</div>
</body>
or this:
<body>
<div class="FormatOne">
<h1>Title 1</h1>
<p>paragraph 1-1</p>
<p>paragraph 1-2</p>
<h3>Subtitle 2</h3>
</div>
<div class="FormatTwo">
<ul>
<li>list item 2-1</li>
<li>list item 2-2</li>
<li>list item 2-3</li>
<li>list item 2-4</li>
</ul>
</div>
<div class="FormatOne">
<span class="warningText">
Warning: This product could kill you
</span>
<span class="disclaimerText"> You were warned</span>
<p class="copyright">Copyright 1999-2011</p>
</div>
</body>
or even this:
<body>
<div class="FormatOne">
<h1>Title 1</h1>
<p>paragraph 1-1</p>
<p>paragraph 1-2</p>
<h3>Subtitle 2</h3>
<ul>
<li>list item 2-1</li>
<li>list item 2-2</li>
<li>list item 2-3</li>
<li>list item 2-4</li>
</ul>
<span class="warningText">
Warning: This product could kill you
</span>
<span class="disclaimerText"> You were warned</span>
<p class="copyright">Copyright 1999-2011</p>
</div>
</body>
depending on how the sections are defined.
Is there a way to use an XSLT to perform this type of grouping magic?
Any help would be great.
Thanks!
I came up with a solution that involves hitting each section sequentially. The processing of each section is broken into two parts: a “shell” and a “contents” portion. The “shell” is responsible for rendering the <div class="FormatOne">...</div> bits, and the “contents” is responsible for rendering the actual contents of the current section and all following sections until a non-matching section is found. When a non-matching section is found, control reverts to the “shell” template for that section.
This gives an interesting bit of flexibility: the “shell” templates may be very aggressive in what they match, and the “contents” sections may be more discerning. Specifically, with your first example output, you need the warning element to appear as <span class="warningText">...</span>, and this is accomplished with a more closely matching template.
All “content” templates, after rendering the contents of their current section, call a named template that looks for the “next” appropriate content section. This helps consolidate the rules for determining what qualifies as a “matching” section.
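To make the shell/contents split concrete, here is a stripped-down XSLT 1.0 sketch of the idea. It is not the full solution: it assumes there are only two formats, and it assumes (purely for illustration) that a section containing a warning element is “FormatTwo” and everything else is “FormatOne”. Instead of a named template, it uses a predicate on the next sibling to decide whether the current bucket continues. The mode names shell and contents are my own labels.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html" indent="yes"/>

  <!-- Enter at the first section only; later sections are reached by recursion. -->
  <xsl:template match="/root">
    <body>
      <xsl:apply-templates select="section[1]" mode="shell"/>
    </body>
  </xsl:template>

  <!-- Shell for the default format (assumed: no <warning> child). -->
  <xsl:template match="section" mode="shell">
    <div class="FormatOne">
      <xsl:apply-templates select="." mode="contents"/>
    </div>
    <!-- With two formats, the next bucket starts at the first later
         section of the other format. -->
    <xsl:apply-templates mode="shell"
        select="following-sibling::section[warning][1]"/>
  </xsl:template>

  <!-- Shell for the other format (assumed: has a <warning> child).
       The predicate gives this template the higher default priority. -->
  <xsl:template match="section[warning]" mode="shell">
    <div class="FormatTwo">
      <xsl:apply-templates select="." mode="contents"/>
    </div>
    <xsl:apply-templates mode="shell"
        select="following-sibling::section[not(warning)][1]"/>
  </xsl:template>

  <!-- Contents: render this section's children, then continue into the
       immediately following section only if it has the same format. -->
  <xsl:template match="section" mode="contents">
    <xsl:apply-templates select="*"/>
    <xsl:apply-templates mode="contents"
        select="following-sibling::section[1][not(warning)]"/>
  </xsl:template>

  <xsl:template match="section[warning]" mode="contents">
    <xsl:apply-templates select="*"/>
    <xsl:apply-templates mode="contents"
        select="following-sibling::section[1][warning]"/>
  </xsl:template>

  <!-- More closely matching templates for elements that need rewriting. -->
  <xsl:template match="warning">
    <span class="warningText">
      <xsl:value-of select="normalize-space(.)"/>
    </span>
  </xsl:template>

  <xsl:template match="disclaimer">
    <span class="disclaimerText">
      <xsl:value-of select="normalize-space(.)"/>
    </span>
  </xsl:template>

  <!-- Everything else (h1, p, ul, li, ...) is copied through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Run against the sample XML from the question, this should yield three buckets: FormatOne for the first two sections, FormatTwo for the warning, and FormatOne again for the disclaimer and copyright. It does not add the copyright class to the last paragraph, and with more than two formats the “first later section of the other format” shortcut in the shell templates would need to be replaced by a general “first later section that does not match” rule.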
You can see a working example here.
Here is my code, built to replicate what you asked for in your first example: