Take a simple OrderDetail table which has a Quantity and UnitPrice for each record. To get the total value of each Order with SQL it’s simple as
SELECT OrderID, SUM(UnitPrice*Quantity)
FROM OrderDetail
GROUP BY OrderID
After converting the table to a XML file, using XQUERY I’m able to get the same information like this
for $orderId in distinct-values(doc('orderDetails.xml')//orderDetails/OrderID)
let $totalValue :=
sum(
for $detail in doc("orderdetails.xml")//OrderDetails[OrderID =$orderId]
return $detail/Quantity * $detail/UnitPrice
)
return <order id="{$orderId}" totalValue="{$totalValue}" />
Ignoring the sheer stupidity of getting the information like this, is there a better way to write the XQUERY expression? As it is it’s very, very slow (I’m using XMLSpy).
This is a major flaw in XQuery 1.0, and for this reason a group-by clause has been added in XQuery 1.1, they have added a group by clause, so your query would look like this:
Unfortunately however, XQuery 1.1 is still just a working draft, and few implementations are available.
Our implementation (XQSharp) tries to spot the pattern you used and do the group by more efficiently (this shows up as a group-by in the query plan). Unfortunately though, our implementation does not spot a group by in your particular case.
The first problem is casing differences (“orderdetails.xml” vs “orderDetails.xml”,
//orderDetailsvs//OrderDetails) – I’ll assume these are just typos.The biggest problem is that what you have written is not a trivial group by!
Unless you are using a schema which indicates otherwise, static analysis cannot determine that each node has exactly one OrderID, and the atomized value of OrderID may have more than one item (if it has a list as its schema type). This means that static analysis of
distinct-values(doc('orderDetails.xml')//orderDetails/OrderID)cannot determine that each node has exactly one key.In order to fix this your query can be written as follows:
This query then has the same semantics as a group-by and should be optimized as such. As it happens this still doesn’t optimize to a group by in XQSharp, so I have filed this as a bug against our software. Whether XmlSpy performs this optimization or not, I couldn’t say.