Monad Protocols

I’ve used Konrad Hinson’s clojure.algo.monads library for several years now and I’ve found it very useful for working with monads. Konrad did an excellent job implementing the monad paradigm in Clojure.

However, there have a been a couple of things that were less than ideal; performance, lack of laziness with m-plus and the need to specify the monad you’re using for domonad/with-monad.

I’ve often thought about how to do an implementation of monads in Clojure using protocols but never was successful at it. This discussion came up recently among my compatriots at Lonocloud and I came to same dead end as before. Chris Houser made an interesting comment about how protocols could be used, but I didn’t immediately see how that insight would help.

Yesterday, inspiration struck and here are the results.

The Monad Protocol

Clojure protocols let you specify an interface that a type has to implement. For a monad, the most basic interface a monad type must implement are the functions result and bind. That protocol looks like:

(defprotocol Monad
  (do-result [_ v])
  (bind [mv f]))

The do-result method is only for use in monad comprehension, so we’ll skip it for now. But the bind method is what makes a type a monad. Chris’s insight was that for bind (and plus), the first argument is always a monadic value. So bind could be a polymorphic function. Which is all well and good, but my hangup had always been with the result function. For a monad, result and bind must work together. But result can’t be polymorphic because it takes a value of any type and generates a monadic value. So I couldn’t figure out a way to write it.

One thing that makes monads in Haskell so easy is Haskell’s type system. Clojure doesn’t have a type system, but Java does, so we can leverage Java’s type system through Clojure protocols. The inspiration that struck me yesterday is that, if you consider a type that implements the monad protocol as a container, result is just a form of constructor for that type. (Brilliant, I know, but it had never occurred to me. Seems obvious now.) So you don’t really need a polymorphic result function.

With that piece of the puzzle, it’s child’s play to extend PersistentVector to be a monad.

(extend-type clojure.lang.PersistentVector
  (do-result [_ v]
  (bind [mv f]
        (vec (mapcat f mv))))

So now, the following works. (Assuming you’ve required the monad namespace as ‘m’.)

(m/bind [1 2 4] (fn [x] [(inc x)]))
=> [2 3 5]

It’s just as easy to extend PersistentList and PersistentHashMap to be monads as well.

Implementing Comprehension

Most people don’t use bind directly. Instead, they use monad comprehension. Monad comprehension is built up using bind and result. (In Haskell, it’s referred to as ‘do notation’, so I named the comprehension macro ‘do’ even though it conflicts with Clojure’s ‘do’.)

(defmacro do [bindings expr]
  (let [steps (reverse (partition 2 bindings))
        example (->> steps
    (reduce (fn [expr [sym mv]]
                `(bind ~mv (fn [~sym] ~expr)))
            `(do-result ~example ~expr) 

This looks complicated, but it’s not as bad as it looks. example is just an example of the monad values this comprehension is being applied to. This is passed to do-result to polymorphically create a monadic value of the proper type as the return value. do-result is typically only used here.

It’s easier to see what this all does using macroexpand.

  [x [1 2 3]
   y [10 11]]
  (+ x y))

expands to this.

(m/bind [1 2 3] (fn [x]
                  (m/bind [10 11] (fn [y]
                                    (m/do-result [10 11]
                                                 (+ x y))))))

and gives the output.

[11 12 12 13 13 14]

And polymorphism in action:

  [x #{1 2 3}
   y #{10 11}]
  (+ x y))
=> #{11 12 13 14}

Function containers

All the above is fine if the monad types are distinct. But both the state and continuation monads use a function as their container types. So we need to define a new type for each of those. And that’s a little more complicated. Here’s code:

(declare state)

(deftype state-binder [mv f]
  (invoke [_ s]
          (let [[v ss] (mv s)]
            ((f v) ss)))
  (do-result [_ v]
          (state v))
  (bind [mv f]
        (state-binder. mv f)))

(deftype state-monad [v]
  (invoke [_ s]
          [v s])
  (do-result [_ v]
          (state v))
  (bind [mv f]
        (state-binder. mv f)))

(defn state [v]
  (state-monad. v))

Starting at the bottom, state is just a function that will return a state monadic value given any value v. It does this by calling the constructor for the state-monad type. state-monad is a class define using Clojure’s deftype and it implements two interfaces, IFn and Monad. Instances of the state-monad class act like containers for the value v and return a vector when they’re invoked.

(def state-mv (m/state 6))

(state-mv :state-value)
=> [6 :state-value]

Notice that it doesn’t matter what type the state value is.

Implementing bind is where it get’s tricky. What bind returns also has to the Ifn and Monad protocols, but it needs to contain both a monadic value and a function. So another class is needed, state-binder. The only difference is state-binder’s invoke method does the proper thing. You can check out the state monad blog post for an explanation. The problem with state binder is that it needs to be able to construct instances of state-monad in its do-result method, so we forward declare state and then define it later. Simple. :)

Comprehension of the state monad looks like:

(def state-mv (m/do  
                 [x (m/state 9)
                  y (m/state 2)]
                 (+ x y)))

(state-mv :state-val)
=> [11 :state-val]

The continuation monad is defined in much the same way.

One more thing

Some monads have the concept of zero. There’s a protocol for that.

(defprotocol MonadZero
  (zero [_])
  (plus-step [mv mvs]))

Then plus can be defined like:

(defn plus [[mv & mvs]]
  (plus-step mv mvs))

It’s polymorphic and the implementations can be as lazy as you like.

The code

You can get the current code from github. The core namespace is intended to be required/as so the public functions can be distinguished from core Clojure functions. I’ve purposely made this namespace a little incompatible with algo.monads. You can’t just import it and expect it to work. On the other hand, converting a name space that uses algo.monads to this one, should be straight forward.

I haven’t finished implementing all the functionality from algo.monads yet. I’ll be working on that in the future. In the meantime, I’d love to hear your feedback. Another dilema I have is what name space to use. monads2.core is just temporary. So any suggestions would be appreciated.

Jim Duey 03 June 2012
blog comments powered by Disqus