First, I am not a programmer.
I have a huge XML file with terms described thus:
<term>
<termId>MANUAL000399</termId>
<termUpdate>Add</termUpdate>
<termName>care</termName>
<termType>Pt</termType>
<termStatus>Active</termStatus>
<termApproval>Approved</termApproval>
<termCreatedDate>20120618T14:38:20</termCreatedDate>
<termCreatedBy>admin</termCreatedBy>
<termModifiedDate>20120618T14:40:41</termModifiedDate>
<termModifiedBy>admin</termModifiedBy>
</term>
In the file, each term has a <termType> of either Pt or ND. I would like the solution to apply to both.
What I would like to do is go through the file, look at the length of the word in termName and, if it has fewer than 5 characters, append another property, a <termNote>, after the <termModifiedBy> property:
<term>
<termId>MANUAL000399</termId>
<termUpdate>Add</termUpdate>
<termName>care</termName>
<termType>Pt</termType>
<termStatus>Active</termStatus>
<termApproval>Approved</termApproval>
<termCreatedDate>20120618T14:38:20</termCreatedDate>
<termCreatedBy>admin</termCreatedBy>
<termModifiedDate>20120618T14:40:41</termModifiedDate>
<termModifiedBy>admin</termModifiedBy>
<termNote label="Short">Short</termNote>
</term>
Can anyone advise on the best approach for this? I found regexes on here, but the problem is applying them. Someone suggested /\b[a-zA-Z]{5,}\b/, but I don’t know how to write a script that takes this and then inserts the termNote if it matches.
This transformation can be done by a simple XSLT stylesheet. (XSLT is a language that non-programmers often take to more enthusiastically than programmers. A stylesheet is basically a set of transformation rules: when you see something that matches X, replace it by Y. Of course, once you have mastered XSLT, you can call yourself a programmer.)
First some boilerplate:
Then a default template rule that copies things unchanged if there’s no more specific rule:
Then a template rule that matches short terms:
and then finish off with:
You should be able to run this with any XSLT processor; there are plenty available. If nothing else comes to mind, download KernowForSaxon (from SourceForge), which is a very simple GUI around my Saxon processor.
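The four pieces described above (boilerplate, a default copy rule, a rule for short terms, and the closing tag) can be assembled into a single stylesheet. What follows is a minimal sketch rather than the exact code of this answer. It assumes XSLT 1.0, that termModifiedBy is always the last child of term (so appending at the end of term places the note right after it), and that "fewer than 5 characters" means the raw string length of termName. The xsl:output and xsl:strip-space settings are additions for tidy output and can be dropped.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Boilerplate: declare the XSLT namespace and version -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: re-indent the output and ignore whitespace-only text nodes -->
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Default rule: copy every node and attribute unchanged -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Short terms: termName has fewer than 5 characters.
       Applies to every term regardless of termType (Pt or ND). -->
  <xsl:template match="term[string-length(termName) &lt; 5]">
    <xsl:copy>
      <!-- Copy the existing children first -->
      <xsl:apply-templates select="@*|node()"/>
      <!-- Then append the note as the last child of term -->
      <termNote label="Short">Short</termNote>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With the command-line processor xsltproc, for example, this could be run as `xsltproc short-terms.xsl terms.xml > terms-out.xml`, where the three file names are placeholders. If termName may contain leading or trailing spaces, replacing `string-length(termName)` with `string-length(normalize-space(termName))` would measure the trimmed word instead.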