CommonMark from Clojure + EDN front matter

09-Mar-2018

Lightweight, plain-text markup languages like Markdown are great for writing simple content intended to be displayed as HTML. It's just easier to write, view and maintain than vanilla HTML.

Sure, Markdown has disadvantages, but so does nearly any tool when it's used for something it's not meant for. Yes, it's had a rocky past in terms of varying implementations and ambiguities. CommonMark is a recent effort to address those issues. Anyway, enough background. Let's process some Markdown with Clojure!

Using commonmark-java

Guess what? In Clojure we've got access to the extensive Java ecosystem, and it just so happens there is an excellent CommonMark implementation for Java, commonmark-java. Let's add it to our dependencies file, deps.edn:

{:deps {org.clojure/clojure                 {:mvn/version "1.9.0"}
        com.atlassian.commonmark/commonmark {:mvn/version "0.11.0"}}}

Here is a simple Clojure program, src/demo.clj, that grabs some input, parses it, renders it and prints out the result.

(ns demo
  (:import
    (org.commonmark.parser Parser)
    (org.commonmark.renderer.html HtmlRenderer)))

(def parser (.build (Parser/builder)))
(def renderer (.build (HtmlRenderer/builder)))

(defn -main [input]
  (->> (slurp input)
       (.parse parser)
       (.render renderer)
       (println)))
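To see the parse and render steps without fetching anything, here is a minimal REPL sketch that runs the same pipeline on a literal string. The namespace name demo-repl and the md->html helper are made up for illustration and are not part of the program above.

```clojure
(ns demo-repl
  (:import
    (org.commonmark.parser Parser)
    (org.commonmark.renderer.html HtmlRenderer)))

(def parser (.build (Parser/builder)))
(def renderer (.build (HtmlRenderer/builder)))

;; Hypothetical helper: parse a Markdown string and render it to an HTML string.
(defn md->html [s]
  (.render renderer (.parse parser s)))

(md->html "Hello *world*")
;; => "<p>Hello <em>world</em></p>\n"
```

Both the parser and the renderer are built once and reused here, just as in the program above.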

We can run it using the Clojure CLI tools. Let's point it at the awesome-clojure README.md and see what happens.

$ clj -m demo https://raw.githubusercontent.com/razum2um/awesome-clojure/master/README.md

<h1>Awesome Clojure <a href="https://github.com/sindresorhus/awesome"><img src="https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg" alt="Awesome" /></a></h1>
<ul>
<li>
<p><a href="#awesome-products-in-clojure">Awesome products in Clojure</a></p>
<ul>
<li><a href="http://lighttable.com/">LightTable (IDE)</a></li>
<li><a href="https://sekao.net/nightcode/">Nightcode (IDE)</a></li>

...

Well that's awesome.

EDN front matter trick

Often it's useful to define metadata that lives alongside Markdown documents. You could argue that Markdown should support metadata directly (as AsciiDoc does), and you'll see static site generators sidestep this limitation by adding front matter to the start of Markdown files in formats like JSON, YAML or TOML. But we are Clojure folk and we prefer EDN, right?

OK, so we want to shove a block of EDN at the start of the Markdown file that we can extract and not have rendered. CommonMark has fenced code blocks. Could we do some sidestepping of our own and say that if the first thing in the Markdown document is a clojure code block, we extract it as EDN?

Here is an example Markdown file with EDN front matter, edn.md:

```clojure
{:title "EDN Frontmatter Example"
 :published #inst "2018-03-08"}
```
 
This is the edn frontmatter example

And a Clojure program, src/demo2.clj. The only significant difference from the last program is the extract-meta! function, which looks at the first child node and, if it's a clojure fenced code block, unlinks it and returns its contents read as EDN.

(ns demo2
  (:require
    [clojure.edn :as edn]
    [clojure.pprint :refer [pprint]])
  (:import
    (org.commonmark.parser Parser)
    (org.commonmark.node FencedCodeBlock)
    (org.commonmark.renderer.html HtmlRenderer)))

(defn extract-meta! [doc]
  (let [node (.getFirstChild doc)]
    (when (and (instance? FencedCodeBlock node)
               (= (.getInfo node) "clojure"))
      (.unlink node)
      (edn/read-string (.getLiteral node)))))

(def parser (.build (Parser/builder)))
(def renderer (.build (HtmlRenderer/builder)))

(defn -main [input]
  (let [doc      (.parse parser (slurp input))
        ;; extract first: this unlinks the front matter so it isn't rendered
        metadata (extract-meta! doc)
        html     (.render renderer doc)]
    (pprint (assoc metadata :html html))))

When executed, it pretty-prints a map of the metadata that includes the rendered HTML (minus our EDN front matter).

$ clj -m demo2 edn.md 
{:title "EDN Frontmatter Example",
 :published #inst "2018-03-08T00:00:00.000-00:00",
 :html "<p>This is the edn frontmatter example</p>\n"}
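If you'd rather poke at this from a REPL than from a file, here is a rough sketch that runs the same extraction on in-memory strings, one with front matter and one without. The namespace demo2-repl, the process helper and the sample strings are assumptions made for illustration; extract-meta! is copied from the program above.

```clojure
(ns demo2-repl
  (:require
    [clojure.edn :as edn])
  (:import
    (org.commonmark.parser Parser)
    (org.commonmark.node FencedCodeBlock)
    (org.commonmark.renderer.html HtmlRenderer)))

(def parser (.build (Parser/builder)))
(def renderer (.build (HtmlRenderer/builder)))

;; Same logic as extract-meta! in src/demo2.clj.
(defn extract-meta! [doc]
  (let [node (.getFirstChild doc)]
    (when (and (instance? FencedCodeBlock node)
               (= (.getInfo node) "clojure"))
      (.unlink node)
      (edn/read-string (.getLiteral node)))))

;; Hypothetical helper: parse a Markdown string, pull out the front matter
;; (if any), then render whatever is left of the document.
(defn process [markdown]
  (let [doc      (.parse parser markdown)
        metadata (extract-meta! doc)]
    (assoc metadata :html (.render renderer doc))))

(def with-front-matter
  (str "```clojure\n"
       "{:title \"Hello\"}\n"
       "```\n"
       "\n"
       "Some *body* text\n"))

(process with-front-matter)
;; => {:title "Hello", :html "<p>Some <em>body</em> text</p>\n"}

;; No leading clojure block: extract-meta! returns nil and assoc on nil
;; simply builds a fresh map.
(process "Just a paragraph\n")
;; => {:html "<p>Just a paragraph</p>\n"}
```

A document without front matter still works, it just produces a map containing only :html.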

Nice. This is pretty flexible. For example, you could imagine the metadata taking the form of a Datomic or DataScript transaction.
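As a rough sketch of that idea, the front matter could be a vector of entity maps rather than a single map. Nothing here talks to Datomic or DataScript; it only shows the shape of the data coming out of the extraction. The namespace demo3 and the :post/title and :post/published attribute names are invented for illustration.

```clojure
(ns demo3
  (:require
    [clojure.edn :as edn])
  (:import
    (org.commonmark.parser Parser)
    (org.commonmark.node FencedCodeBlock)))

(def parser (.build (Parser/builder)))

;; Same logic as extract-meta! in src/demo2.clj.
(defn extract-meta! [doc]
  (let [node (.getFirstChild doc)]
    (when (and (instance? FencedCodeBlock node)
               (= (.getInfo node) "clojure"))
      (.unlink node)
      (edn/read-string (.getLiteral node)))))

;; Front matter written as transaction-style data: a vector of entity maps.
;; The attribute names are hypothetical.
(def source
  (str "```clojure\n"
       "[{:post/title \"EDN Frontmatter Example\"\n"
       "  :post/published #inst \"2018-03-08\"}]\n"
       "```\n"
       "\n"
       "Body text\n"))

;; tx-data is a plain vector containing one map, ready to be handed to
;; whatever transact function your database library provides.
(def tx-data (extract-meta! (.parse parser source)))

(:post/title (first tx-data))
;; => "EDN Frontmatter Example"
```

Note that with a vector as front matter, the (assoc ... :html html) step from demo2 no longer applies, so the rendered HTML would need to be attached some other way.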

This post has an associated Gist with runnable code examples. Please leave comments there if you have any.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.