I have an XML file that encodes a directed acyclic graph (DAG) that represents a partial order. Such graphs are useful for things like specifying dependencies and finding critical paths. For the curious, my current application is to specify component dependencies for a build system, so vertices are components and edges specify compile-time dependencies. Here is a simple example:
<?xml version="1.0"?>
<dag>
    <vertex name="A">
        <directed-edge-to vertex="C"/>
    </vertex>
    <vertex name="B">
        <directed-edge-to vertex="C"/>
        <directed-edge-to vertex="D"/>
    </vertex>
    <vertex name="C">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="D">
        <directed-edge-to vertex="E"/>
    </vertex>
    <vertex name="E">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="F">
        <directed-edge-to vertex="G"/>
    </vertex>
    <vertex name="G"/>
</dag>
This DAG may be drawn like this:
[Figure: drawing of the example DAG (source: iparelan.com)]
I’d like to apply an XSLT stylesheet that produces another XML document containing only the vertices that correspond to minimal elements of the partial order, that is, those vertices that have no incoming edges. The set of minimal vertices for the example graph is {A, B, F}. For my build dependency application, finding this set is valuable because I know that if I build the members of this set, then everything in my project will be built.
Here is my current stylesheet solution (I’m running this with Xalan on Java using Apache Ant’s xslt task). A key observation is that a minimal vertex will not be referenced in any directed-edge-to element:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xslt"
    exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
    <xsl:template match="dag">
        <minimal-vertices>
            <xsl:for-each select="//vertex">
                <xsl:if test="not(//vertex/directed-edge-to[@vertex=current()/@name])">
                    <minimal-vertex name="{@name}"/>
                </xsl:if>
            </xsl:for-each>
        </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>
Applying this stylesheet produces the following output (which I believe is correct):
<?xml version="1.0" encoding="UTF-8"?>
<minimal-vertices>
    <minimal-vertex name="A"/>
    <minimal-vertex name="B"/>
    <minimal-vertex name="F"/>
</minimal-vertices>
The thing is, I’m not completely satisfied with this solution. I’m wondering if there is a way to combine the select of the for-each and the test of the if with XPath syntax.
I want to write something like:
<xsl:for-each select="//vertex[not(//vertex/directed-edge-to[@vertex=current()/@name])]">
But that does not do what I want because the current() function does not reference the nodes selected by the outer //vertex expression.
Thus far, my solution uses XPath 1.0 and XSLT 1.0 syntax, though I’m open to XPath 2.0 and XSLT 2.0 syntax as well.
Here’s the Ant build script if you like:
<?xml version="1.0"?>
<project name="minimal-dag" default="default">
    <target name="default">
        <xslt in="dag.xml" out="minimal-vertices.xml" style="find-minimal-vertices.xsl"/>
    </target>
    <target name="dot">
        <xslt in="dag.xml" out="dag.dot" style="xml-to-dot.xsl"/>
    </target>
</project>
The dot target generates Graphviz Dot language code for rendering the graph. Here is xml-to-dot.xsl:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xslt"
    exclude-result-prefixes="xalan">
    <xsl:output method="text"/>
    <xsl:template match="dag">
        digraph {
        rankdir="BT";
        node [style="filled", fillcolor="cyan", fontname="Helvetica"];
        <xsl:apply-templates select="//directed-edge-to"/>
        }
    </xsl:template>
    <xsl:template match="directed-edge-to">
        <xsl:value-of select="concat(ancestor::vertex/@name, '->', @vertex, ';')"/>
    </xsl:template>
</xsl:stylesheet>
You can take advantage of XPath’s implicit existential quantification on the = operator, comparing each vertex’s @name against all of the @vertex attributes at once.
When you use any of the six comparison operators (=, !=, <, <=, >, and >=) to compare a node-set, the expression will return true if any node in the node-set satisfies the condition. When comparing one node-set with another, the expression returns true if any node in the first node-set satisfies the condition when compared with any node in the second node-set. XPath 2.0 introduces six new operators that don’t perform this existential quantification (eq, ne, lt, le, gt, and ge). But in your case, you’ll want to use “=” to get that existential quantification.
Note, of course, that you’ll still want to use the not() function as you were doing. Most of the time, it’s good to avoid the != operator. If you used it here instead of not(), then it would return true if there are any @vertex attributes that are not equal to the @name value, which is not your intention. (And if either node-set is empty, then it would return false, as comparisons with empty node-sets always return false.)
If you want to use eq instead, then you’d have to do something like you did: separate out the conditional from the iteration so you could bind current(). But in XPath 2.0, you can do this within a single expression, using a for expression that binds each vertex to a variable such as $v.
This is useful for when your condition isn’t a simple equality comparison (and thus can’t be existentially quantified using “=”). For example: starts-with(@vertex, $v/@name).
XPath 2.0 also has an explicit way of performing existential quantification: instead of the for expression, we could have used a some expression. In addition to the “some” syntax, XPath 2.0 also supplies a corresponding “every” syntax for performing universal quantification.
Rather than using for-each, you could also use template rules, which are more modular (and powerful). Again, in this case, we’re relying on the existential quantification of =.
XSLT 1.0 prohibits use of the current() function in patterns, i.e., in the match attribute, but XSLT 2.0 allows it. In that case, current() refers to the node currently being matched. So in XSLT 2.0, we could also call current() directly in the match pattern, without having to use a for expression.
Note that such a pattern is essentially the same as the expression you tried to use in for-each, but whereas it doesn’t do what you want in for-each, it does do what you want in the pattern (because what current() binds to is different).
Finally, I’ll add one more variation that in some ways simplifies the logic (removing not()) and goes back to using XSLT 1.0.
If you don’t like the whitespace being output, add an empty rule for text nodes, so they’ll get stripped out (overriding the default rule for text nodes, which is to copy them):
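Here is a minimal sketch of such an empty text rule, placed in a complete XSLT 1.0 stylesheet so it can be run against the example dag.xml. The two vertex rules are one assumed way of writing the not()-free variation (a default rule plus an empty, more specific rule); other formulations are possible.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <!-- Wrap whatever the vertex rules produce. -->
    <xsl:template match="/">
        <minimal-vertices>
            <xsl:apply-templates/>
        </minimal-vertices>
    </xsl:template>

    <!-- Assumed formulation: by default, report a vertex as minimal. -->
    <xsl:template match="vertex">
        <minimal-vertex name="{@name}"/>
    </xsl:template>

    <!-- A vertex named by any directed-edge-to produces nothing.
         The predicate gives this rule a higher default priority (0.5)
         than the plain vertex rule (0), so it wins when both match. -->
    <xsl:template match="vertex[@name = //directed-edge-to/@vertex]"/>

    <!-- Empty rule for text nodes: overrides the built-in rule that
         would copy the whitespace between vertex elements. -->
    <xsl:template match="text()"/>
</xsl:stylesheet>
```

For the example graph, this should yield minimal-vertex elements for A, B, and F, with no stray whitespace copied from the input.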
Or you could just be more selective in what nodes you apply templates to:
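As a sketch of that second option (the rule layout here is an assumption, not the only way to do it), the root rule below selects only the minimal vertex elements, so neither text nodes nor referenced vertices are ever handed to a template:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="/">
        <minimal-vertices>
            <!-- Select only vertices whose name matches no @vertex
                 attribute; whitespace text nodes are never selected. -->
            <xsl:apply-templates
                select="dag/vertex[not(@name = //directed-edge-to/@vertex)]"/>
        </minimal-vertices>
    </xsl:template>

    <!-- Only minimal vertices reach this rule. -->
    <xsl:template match="vertex">
        <minimal-vertex name="{@name}"/>
    </xsl:template>
</xsl:stylesheet>
```

The trade-off is that the selection logic now lives in the caller rather than in the match patterns, which is less modular if other rules also need to process vertex elements.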
Which approach you take is partially dependent on taste, partially dependent on the wider context of your stylesheet and expected data (how much the input structure might vary, etc.).
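If taste and context don’t settle it, the smallest change to the original stylesheet is probably the first option: fold the test into the select by relying on the existential =. A sketch, reusing the question’s own stylesheet and changing only the for-each:

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xalan="http://xml.apache.org/xslt"
    exclude-result-prefixes="xalan">
    <xsl:output method="xml" indent="yes" xalan:indent-amount="4"/>
    <xsl:template match="dag">
        <minimal-vertices>
            <!-- @name = node-set is true if ANY @vertex equals @name,
                 so not(...) keeps vertices that no edge points to.
                 No current() is needed inside the predicate. -->
            <xsl:for-each select="//vertex[not(@name = //vertex/directed-edge-to/@vertex)]">
                <minimal-vertex name="{@name}"/>
            </xsl:for-each>
        </minimal-vertices>
    </xsl:template>
</xsl:stylesheet>
```

With an XSLT 2.0 processor, the same select should be expressible with explicit quantification, for example //vertex[not(some $e in //directed-edge-to satisfies $e/@vertex eq @name)], though that form is only a sketch and is not needed for XSLT 1.0 processors.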
I know I went way beyond what you were asking for, but I hope you at least found this interesting. 🙂