Using MarkLogic to pull in data from a web service with xdmp:http-get() or xdmp:http-post(), I’d like to be able to check the headers that come back before I attempt to process the data. In DQ I can do this:
let $result := xdmp:http-get($query,$options) (: $query and $options are fine, I promise. :)
return $result
And the result I get back looks like this:
<v:results v:warning="more than one node">
<response>
<code>200</code>
<message>OK</message>
<headers>
<server>(actual server data was here)</server>
<date>Thu, 07 Jun 2012 16:53:24 GMT</date>
<content-type>application/xml;charset=UTF-8</content-type>
<content-length>2296</content-length>
<connection>close</connection>
</headers>
</response>
followed by the actual response. The problem is that I can’t seem to XPath into this response node. If I change my return statement to return $result/response/code, I get the empty sequence. If I could check that code to make sure I got a 200 back before attempting to process the actual data that came back, it would be much better than using try-catch blocks to see if the data exists and is sane.
So, if anyone knows how to access those response codes I would love to see your solution.
For the record, I have tried xdmp:get-response-code(), but it doesn’t take any parameters, so I don’t know what response code it’s looking at.
You’re getting burned by two gotchas at once:
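First, the namespace. The XML output of the http-get function is in a namespace, as shown by the xmlns declaration on the top-level <response> element (for xdmp:http-get the namespace URI is xdmp:http).
To successfully access elements in that namespace, you need to declare a prefix in your query bound to the correct namespace, and then use that prefix in your XPath expressions. With a prefix such as h, the status code is addressed as h:code rather than code.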
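A minimal sketch of the prefix declaration described above. The prefix h is an arbitrary choice, and the URL is a hypothetical stand-in for the asker's $query; only the namespace URI has to match what the server returns.

```xquery
xquery version "1.0-ml";

(: "h" is an arbitrary prefix; "xdmp:http" is the namespace of the
   response metadata returned by xdmp:http-get :)
declare namespace h = "xdmp:http";

(: hypothetical endpoint, standing in for the asker's $query :)
let $result := xdmp:http-get("http://example.com/service")

(: the prefixed name step matches the namespaced <code> element :)
return $result/h:code
```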
Now let’s talk about document nodes. 🙂
You’re trying to access $result as if it is a document node containing an element, but in actuality, it is a sequence of two root nodes (so they’re not siblings either). The first one (the one you’re interested in here) is a parentless <response> element, not a document containing a <response> element.
This is a common gotcha: knowing when a document node is present or not. Document nodes are always invisible when serialized (hence the gotcha), and they’re always present on documents stored in the database. However, when you just use a bare element constructor in XQuery (as the http-get implementation does), you construct not a document node but an element node without a document node parent.
For example, a query that binds a variable to a bare <foo> element and then asks for that variable’s <foo> child will return the empty sequence, because it’s trying to get the <foo> child of <foo>.
On the other hand, if the variable is bound to a document node wrapping <foo>, the same path does return <foo>, because it’s getting the <foo> child of the document node (which has to be explicitly constructed, in XQuery).
So you have to know how a given function’s API is designed (whether it returns a document containing an element or just an element).
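The two cases can be sketched side by side as follows. The variable names are illustrative only; the query uses nothing MarkLogic-specific.

```xquery
xquery version "1.0";

(: a bare element constructor: an element node with no document parent :)
let $element := <foo/>

(: an explicitly constructed document node that contains <foo> :)
let $doc := document { <foo/> }

return (
  fn:count($element/foo),  (: 0, because <foo> has no <foo> child :)
  fn:count($doc/foo)       (: 1, because <foo> is a child of the document node :)
)
```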
To solve your problem, don’t try to access $result/h:response/h:code (which is trying to get the <response> child of <response>). Instead, access $result/h:code (or more precisely $result[1]/h:code, since <response> is the first of a sequence of two nodes returned by the http-get function).
For more information on document nodes, check out this blog article series: http://community.marklogic.com/blog/document-formats-part1
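Putting the namespace and the parentless-element points together, the status check the question asks for might look like the sketch below. The URL, the error QName, and the variable names are assumptions made for illustration; the options argument from the question is left out for brevity.

```xquery
xquery version "1.0-ml";

declare namespace h = "xdmp:http";

(: hypothetical endpoint, standing in for the asker's $query :)
let $result   := xdmp:http-get("http://example.com/service")
let $response := $result[1]   (: parentless <response> element with the metadata :)
let $body     := $result[2]   (: the actual payload :)
return
  if (xs:integer($response/h:code) eq 200)
  then $body
  else
    (: hypothetical error name; raise whatever error suits your application :)
    fn:error(
      xs:QName("HTTP-ERROR"),
      fn:concat("Unexpected response: ",
                $response/h:code, " ", $response/h:message)
    )
```

Checking the code first means the payload is only touched after the request is known to have succeeded, which avoids wrapping the processing step in try-catch.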