I apologize for a second question on the same topic, but I’m confused. Is there a Clojure module that follows lxml, even loosely, or how-to documentation on how to walk through an XML file using Clojure?
In Python, I can open an XML file using the lxml module; parse my way through the data; look for tags like <DeviceID>, <TamperName>, <SecheduledDateTime>, and then perform an action based on the value of one of those tags.
In Clojure, I have been given excellent answers on how to parse using data.xml and then further reduce the data.xml-parsed information by pulling out the :content tag’s vals and putting the information in a tree-seq.
However, even that resultant data has other map tags embedded, which obviously do not respond to keys and vals functions.
I could take this data and use regular expression searches, but I feel I’m missing something much simpler.
The data right out of data.xml/parse (calling ret-xml-data) looks like this, using various (first parsed-xml) and other commands at the REPL:
[:tag :TamperExport]
[:attrs {}]
:content
#clojure.data.xml.Element{:tag :Header, :attrs {}, :content
(#clojure.data.xml.Element{:tag :ExportType, :attrs {},
:content ("Tamper Export")}
#clojure.data.xml.Element{:tag :CurrentDateTime,
:attrs {},
:content ("2012-06-26T15:40:22.063")} :attrs {},
:content ("{06643D9B-DCD3-459B-86A6-D21B20A03576}")}
Here is the Clojure code I have so far:
(defn ret-xml-data
  "Returns a map of the supplied xml file, as parsed by data.xml/parse."
  [xml-fnam]
  (let [input-xml (try
                    (java.io.FileInputStream. xml-fnam)
                    (catch Exception e))]
    (if-not (nil? input-xml)
      (xmld/parse input-xml)
      nil)))

(defn gen-xml-content-tree
  "Returns a tree-seq with :content extracted."
  [parsed-xml]
  (map :content (first (tree-seq :content :content (:content parsed-xml)))))
I think I may have found a repeatable pattern to the data that will allow me to parse this without creating a hodgepodge:
xml-lib.core=> (first (second cl1))
#clojure.data.xml.Element{:tag :DeviceId, :attrs {}, :content ("80580608")}
xml-lib.core=> (keys (first (second cl1)))
(:tag :attrs :content)
xml-lib.core=> (vals (first (second cl1)))
(:DeviceId {} ("80580608"))
Thank you as always.
Edit:
Added some more testing.
The resulting data, if I ran through the tree-seq structure using a function like doseq, could probably now be parsed with actions taken.
First, it’s hard to tell exactly what you’re trying to do. When working on a programming problem, it helps both you and others helping you to have a “small case” you can present and solve before working towards a larger one.
From what it sounds like, you’re trying to pull the content out of certain elements and perform actions based on that content.
I put together a small XML file with some simple content to try things out on:
I designed it to be what I think is representative of some of the core challenges with the problem at hand – in particular, being able to do stuff at arbitrary levels of nesting in the XML.
Looking at the wonderful Clojure Cheatsheet, I found xml-seq, and tried running it on the clojure.data.xml/parsed XML. The sequence went through each of the elements and then their children, making it easy to iterate over the XML.
To pick out and work with particular items in a sequence, I like using for loops with :when. :when makes it easy to just enter the body of the loop when certain conditions are true. I also make use of the “set as a function” semantics, which checks to see if something is in the set.
This gets back a sequence of ([:item1 "data"] [:item2 "else"]) that can then easily be acted on in other ways.
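As a rough sketch of the approach just described (xml-seq, a for loop with :when, and a set used as a predicate), something like the following should work. The sample document, the tag names, and the helper names parse-xml-string and extract-tags are all made up for illustration. To keep the sketch free of dependencies it uses clojure.xml, which ships with Clojure; swapping in clojure.data.xml/parse should give elements of the same :tag/:attrs/:content shape.

```clojure
(require '[clojure.xml :as xml])

;; Hypothetical sample document, invented for illustration only.
(def sample-xml
  "<root>
     <item1>data</item1>
     <group>
       <item2>else</item2>
       <item3>ignored</item3>
     </group>
   </root>")

(defn parse-xml-string
  "Parses an XML string into nested maps with :tag, :attrs and :content keys."
  [s]
  (xml/parse (java.io.ByteArrayInputStream. (.getBytes s "UTF-8"))))

(defn extract-tags
  "Returns [tag first-content] pairs for every element whose tag is in the
  set `wanted`, at any nesting depth."
  [wanted root]
  (for [el (xml-seq root)           ; depth-first walk over every node
        :when (wanted (:tag el))]   ; the set acts as a membership test
    [(:tag el) (first (:content el))]))

(def pairs
  (extract-tags #{:item1 :item2} (parse-xml-string sample-xml)))
;; pairs => ([:item1 "data"] [:item2 "else"])

;; Acting on each match; println stands in for a real action.
(doseq [[tag value] pairs]
  (println "handling" tag "with value" value))
```

Text nodes also appear in the sequence produced by xml-seq, but (:tag "some string") is nil, so the :when clause skips them without any extra check.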
One of the key things to try and keep in mind about Clojure is that you tend to not need any special API to do stuff – the core language makes it easy to do most, if not all, that you need to do. Records (which are what you see being returned) are also maps for example, so assoc, dissoc, and so on work on them, and it’s how they are intended to be worked with.
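To illustrate the point about records behaving like maps, here is a minimal sketch. It uses a hypothetical record called Node rather than the actual data.xml Element type, on the assumption that the parsed elements respond to the same core functions.

```clojure
;; Hypothetical record standing in for a parsed XML element.
(defrecord Node [tag attrs content])

(def device (->Node :DeviceId {} ["80580608"]))

(map? device)                ;; => true
(:tag device)                ;; => :DeviceId
(first (:content device))    ;; => "80580608"
(keys device)                ;; => (:tag :attrs :content)

;; assoc returns a new record of the same type with the field replaced.
(def checked (assoc device :attrs {:checked true}))
(:attrs checked)             ;; => {:checked true}
(instance? Node checked)     ;; => true

;; dissoc of a declared field falls back to an ordinary map.
(def stripped (dissoc device :content))
stripped                     ;; => {:tag :DeviceId, :attrs {}}
```

Since keyword lookup works directly on the element, there is usually no need to go through keys and vals to reach the tag or the content.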
If this doesn’t help you get to what you need, then could you provide a small sample input and a sample of the result that you want?