Processing XML With Clojure
Although XML is nice in theory, I have always hated dealing with it. It takes so much boilerplate code just to parse or create a simple XML file. Recently I needed to do some XML processing, and I still can't believe how easy it was to create and parse XML in Clojure.
Clojure contrib includes a library for creating XML called prxml (clojure.contrib.prxml). Vectors become XML tags: the first element is the tag name, an optional map supplies the attributes, and the rest is the content. For example,
(prxml [:p {:class "greet"} [:i "Ladies & gentlemen"]])
; => <p class="greet"><i>Ladies &amp; gentlemen</i></p>
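To make the vector-to-tag mapping concrete, here is a rough sketch that converts the same vector shape into plain maps with :tag, :attrs, and :content keys. The vec->element helper is hypothetical and written only for illustration; it is not part of prxml and does not reflect how prxml is implemented.

```clojure
;; Hypothetical helper: interprets a prxml-style vector as
;; [tag optional-attribute-map & content] and returns a nested map.
(defn vec->element
  [form]
  (if (vector? form)
    (let [[tag & more] form
          ;; An attribute map is only recognized in the second position.
          attrs   (when (map? (first more)) (first more))
          content (if attrs (rest more) more)]
      {:tag     tag
       :attrs   attrs
       :content (mapv vec->element content)})
    ;; Anything that is not a vector (for example a string) is left as-is.
    form))

(def greeting
  (vec->element [:p {:class "greet"} [:i "Ladies & gentlemen"]]))
```

Here greeting is a map whose :tag is :p, whose :attrs is {:class "greet"}, and whose :content holds the nested :i element.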
First, let's define some data to turn into XML.
(def data
  #{{:title "Clojure" :link "http://clojure.org" :description "Clojure Homepage"}
    {:title "Java" :link "http://java.sun.com" :description "JVM Homepage"}
    {:title "Debian" :link "http://debian.org" :description "Debian Homepage"}})
By default, the prxml function prints to standard output. If you want the XML as a string instead, wrap the call to prxml in with-out-str.
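As a minimal sketch of how that capture works, the snippet below uses only clojure.core. The print-greeting function is a hypothetical stand-in for any function that, like prxml, writes to *out* rather than returning a value.

```clojure
;; Hypothetical stand-in for a function that prints markup to *out*.
(defn print-greeting
  []
  (print "<p>hello</p>"))

;; with-out-str rebinds *out* to a string writer while its body runs
;; and returns everything that was printed as a single string.
(def captured (with-out-str (print-greeting)))
```

Nothing reaches the screen here; the printed text ends up in captured instead.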
Next, convert each entry in data into an :item vector:
(defn articles []
  (reduce (fn [feed v]
            (conj feed [:item
                        [:title (:title v)]
                        [:link (:link v)]
                        [:description (:description v)]]))
          ()
          data))
This builds a list with one vector per item in the XML, such as:
[:item [:title "Clojure"] [:link "http://clojure.org"] [:description "Clojure Homepage"]]
[:item [:title "Java"] [:link "http://java.sun.com"] [:description "JVM Homepage"]]
Putting it all together, it takes fewer than 20 lines of code to produce an RSS feed.
(defn xml-data []
  (with-out-str
    (prxml [:decl! {:version "1.0"}]
           [:rss {:version "2.0"}
            [:channel
             [:title "The Site"]
             [:link "http://site.com"]
             [:description "The Site"]
             (articles)]])))
Parsing XML is even easier, because Clojure core has built-in support for it. clojure.xml/parse accepts a File, an InputStream, or a String naming a URI, and returns a tree of xml/element struct-maps, each with :tag, :attrs, and :content keys. Pass that tree to xml-seq and you can treat it like any other sequence.
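To see the shape of that tree, here is a small sketch that parses a hand-written document. The sample string and the parse-str helper are assumptions made for illustration; only clojure.xml/parse and java.io.ByteArrayInputStream come from the standard libraries.

```clojure
(require '[clojure.xml :as xml])
(import 'java.io.ByteArrayInputStream)

;; A tiny hand-written document, used only for illustration.
(def sample
  "<rss version=\"2.0\"><channel><title>The Site</title></channel></rss>")

;; Hypothetical helper: wraps the string's bytes in an InputStream,
;; because a bare String argument is treated as a URI by xml/parse.
(defn parse-str
  [s]
  (xml/parse (ByteArrayInputStream. (.getBytes s "UTF-8"))))

(def tree (parse-str sample))

(:tag tree)     ; the root element's tag, as a keyword
(:attrs tree)   ; the root element's attributes, keyed by keyword
(:content tree) ; a vector of child elements and strings
```

Each child in :content is either another element map of the same shape or a plain string, which is why ordinary sequence functions work on the result.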
For example, to iterate over all the titles in the XML file:
(require '[clojure.xml :refer [parse]])
(import 'java.io.ByteArrayInputStream)

(let [input-stream (ByteArrayInputStream. (.getBytes (xml-data) "UTF-8"))]
  (for [x (xml-seq (parse input-stream))
        :when (= :title (:tag x))]
    (:content x)))
; => (["The Site"] ["Clojure"] ["Java"] ["Debian"])
; (item order may vary, because data is a set)