# Building Trees

After all the posts on the logic arrow, we’re finally at the point of implementing a rudimentary core.logic. Way back at the start of this series, I said that this is only a proof of concept, and that hasn’t changed. I’ve got other things to work on that I find more interesting, but there’s a huge potential here for someone to make core.logic the best logic programming system out there. Well, it’s already that, but someone could make it even better. So, if you’re inclined in that direction and want to really expand your abilities with monads, arrows and core.logic, let me know and I’ll be glad to assist. I would love to see someone take this and run with it.

This is quite long, but I wanted to finish this series and move on to something else. And if you think it’s too much trouble, keep in mind that we are implementing a way to automatically eliminate many of the errors that plague logic programming, thereby saving enormous amounts of programmer time and frustration.

### The magic

To start with, take a look at this code and corresponding tree:

```clojure
(defno v [v] (== v 15))
(defno w [w] (== w 10))
(defno x [x] (== x :x))
(defno y [y] (== y :y))
(defno z [z] (== :z z))

(def p (fresh [a b c q]
         (z a)
         (conde
           [(x b)]
           [(y b)])
         (w q)
         (v q)))
```

This is nonsense code used strictly as a means to produce the tree. Expression `p` can be read as the call to `z` followed by the calls to `x` and `y` in parallel with both results being fed to `w` and then to `v`.

Or, you can visualize an empty substitution dropping into the top of the tree and falling through each node as it moves towards the leaves, picking up unifications on the way. At the branching node, a copy is passed to each branch.

So even though `w` and `v` only appear once in the code, they appear twice in the tree because they follow the `conde` expression which has two clauses. And our task in evaluating the code is to build a representation of the tree using the logic arrow.

To do this, notice that each of the functions `v` through `z` is created by `defno`. I’ll explain that later, but for now it will suffice to say that they each take their argument and (sort of) return an arrow value. So `fresh` will somehow use `a/seq` to combine those arrow values sequentially. But it has to work from the bottom up, so it knows that `v` follows `w`, which follows the `conde` expression, and so on. This is a classic use of the continuation monad. So what functions defined by `defno` actually return are arrow values wrapped in the continuation monad. Simple, no?
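As a quick illustration of the continuation-passing pattern (the names here are toy stand-ins, not part of the actual implementation), a CPS value is a function that receives "the rest of the computation" as an argument:

```clojure
;; A CPS value takes a continuation `k` and calls it with a result
;; instead of returning the result directly.
(defn cps-return [v]
  (fn [k] (k v)))

;; Sequencing: run `mv`, then feed its result to `f`, which
;; produces the next CPS value. `f` sees everything downstream
;; of it packaged as the single continuation `k`.
(defn cps-bind [mv f]
  (fn [k] (mv (fn [v] ((f v) k)))))

(def pipeline
  (cps-bind (cps-return 1)
            (fn [x] (cps-return (inc x)))))

;; Run by supplying `identity` as the final continuation.
(pipeline identity) ;; => 2
```

Each step gets handed everything that follows it as a function, which is exactly the "bottom up" knowledge `fresh` needs.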

### Except …

It’s not that simple because we have another requirement and it’s actually the point of this whole exercise.

Consider this code and the resulting tree:

```clojure
(declare top)

(defno bottom [x]
  (conde
    [(top x)]
    [(v x)]))

(defno top [x]
  (bottom x))
```

`top` and `bottom` are mutually recursive and `v` is the same as above. In this case, `bottom` is the continuation of `top`, but `top` is used in the definition of `bottom`. To eliminate divergence, you need to detect this recursion. And you can only do this by building the tree from the root downward and keeping track of which functions have been called, but not completed, before being called again. That’s easy to do by using the state monad to keep track of which functions are ‘active’. And when you see an active function being called again, you know it’s a recursive call, so you can tag the resulting arrow with a `:recursive` metadata tag. But how to do this?

### The Magic

And here is why I love monads: we have two requirements, each with its own form of complexity, and each is solved by a different monad. It would be possible to brute-force our way through using some kind of mutable state and a stack, hacking a path through the untracked jungle on our own.

Or we can use a monad, which gives a map (no pun intended :)) of how to proceed. And even though the path may be difficult, it is there and there are trail markers to tell us when we’ve lost our way.

So, we create a monad that combines both the state and continuation monad:

```clojure
(def tree-m (m/state-t m/cont))
```

This just uses the state monad transformer to generate the needed monad.
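Concretely (a sketch using an illustrative `tree-return`, not the real constructor), a value in this stacked monad is a function of the state that returns a CPS computation producing a `[value state]` pair:

```clojure
;; Shape of a (state-t cont) value:
;;   state -> (continuation -> result)
;; where the continuation receives a [value state] pair.
(defn tree-return [v]
  (fn [state]
    (fn [k]
      (k [v state]))))

;; Run with an initial state and `identity` as the final
;; continuation; `first` extracts the value from the pair.
(first (((tree-return :p) #{}) identity)) ;; => :p
```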

With this in hand, we can revisit our basic goals. First failure:

```clojure
(def fail-p (logic-arr. (constantly logic-zero)
                        {:results 0}))

(def fail (tree-m fail-p))
```

We assign the logic arrow value to `fail-p` so we can use it later. And our fail goal is just that value wrapped in our tree monad.

And our unification goal creator is just:

```clojure
(defn == [u v]
  (tree-m
    (logic-arr. (fn [a]
                  (if-let [ap (unify a u v)]
                    (logic-m ap)
                    logic-zero))
                {})))
```

### Some complexity

Here’s where the path gets a little steep. Any time you use a continuation monad, you’ve got to have a way to get at the continuation when you need to, or else there’s no point in using the monad in the first place. For us, this is complicated by the fact that our continuation monad is wrapped in the state monad. But it’s fairly straightforward what needs to be written.

```clojure
(defn tree-cc [f]
  (state-transformer. m/cont nil
                      (tree-m nil)
                      (fn [_]
                        (fn [m]
                          (fn [c]
                            (f m c))))
                      nil))
```

There are two pieces of context our tree monad is managing for us: the state `m` and the continuation `c`. So `tree-cc` simply takes a function `f` that accepts both `m` and `c` as parameters and returns a `tree-m` monadic value. By supplying various kinds of functions to `tree-cc`, we can create values in our tree monad that have access to `m` and `c`, so we can do whatever the heck we want. This is where the power of monads really shows up.
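To see the shape in isolation (again with toy names; the real `tree-cc` also has to satisfy the state-transformer protocol), a helper exposing both `m` and `c` is just two layers of currying:

```clojure
;; Expose both the state `m` and the continuation `c` to `f`.
(defn cc [f]
  (fn [m]
    (fn [c]
      (f m c))))

;; Example: a value that passes the current state through as
;; both the returned value and the new state.
(def get-state (cc (fn [m c] (c [m m]))))

(first ((get-state #{'top}) identity)) ;; => #{top}
```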

One thing to keep at the forefront of your thinking is that `c` is a function that builds the rest of the tree from the present node down to the leaves, while `m` is just whatever value we want to pass through as our state. But if `c` is a function, what is its argument? Since it is a continuation, it must take whatever value we would ordinarily return from whatever function we’re in. But since our tree monad also has the state context, we need to pass that on as well. As a result, `c` is a function that takes a vector of our current return value and the current state:

```clojure
(c [value state])
```

And it will proceed to build out the rest of the tree.

Given that, how do we actually build the tree for a given logic program?

At the top level, a logic program is created by a call to `defno`. And `defno` returns a value in our tree monad. So given something like

```clojure
(defno z [q]
  (== q 10))
```

We could build our tree with

```clojure
(first ((z beginning-state) identity))
```

We’ll discuss which value to use for `beginning-state` in a bit. But calling `z` with it returns a function that expects to be called with a continuation. By passing `identity` as our continuation, we’re indicating we just want whatever the final result is. And since, for us, continuations always accept a vector, that’s what is going to be returned to us. But we’re not interested in what the final state value was, so we use `first` to get what we are interested in, which is a logic arrow value. All of this follows pretty mechanically from the definition of our tree monad:

```clojure
(def tree-m (m/state-t m/cont))
```

Since we’re going to be building trees in several different places, we’ll make a function to do that for us.

```clojure
(defn build-tree
  ([tree-fn] (build-tree tree-fn #{}))
  ([tree-fn m] (build-tree tree-fn m identity))
  ([tree-fn m c] ((tree-fn m) c)))
```

If we’re given just a tree building function, we’re going to use an empty hash-set as our initial state value.

Remember that in building our tree, we want to keep track of ‘active’ functions. That is, functions that have been called, but whose continuations haven’t been called yet. Whether a function is active or not is a binary condition, so a function will only appear in our state at most once. And we’ll need to check whether a given function is in our state, so a set is the natural choice.

### Glue

Given our tree monad, `fail` and `==`, we need a way to compose them to build logic programs. miniKanren/core.logic provides a set of operators to do that.

First off, we want to combine goals sequentially.

```clojure
(defn all [& steps]
  (m/bind (first steps)
          (m/chain (map (fn [step]
                          (fn [p]
                            (tree-cc
                              (fn [m c]
                                (let [[step-p new-m] (build-tree step m c)]
                                  [(a/seq p step-p) new-m])))))
                        (rest steps)))))
```

Oh boy. That looks complicated, so let’s break it up. We know that `all` needs to accept a set of goals or `steps`. These goals are values in our tree monad and `all` needs to return a `tree-m` value as well. Also, we want to compose these goals sequentially and that is done using `m/bind` which always returns a monadic value.

While the first parameter to `m/bind` is a monadic value, the second is a function that takes a logic arrow value (since those are the only kinds of values we’ll be wrapping with our tree monad) and returns a monadic value. So all of the goals except the first one have to somehow be scrunched into a single function.

`m/chain` will do what we want because it takes a list of functions that each take a plain value and return a monadic value and produces a function that has the effect of chaining them together sequentially.
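If `m/chain`’s behavior isn’t familiar, here is a sketch of the idea using the sequence monad, where `bind` is `mapcat` (this `chain` is illustrative, not the library’s implementation):

```clojure
;; (chain [f g]) acts like (fn [x] (bind (f x) g)):
;; each monadic function is bound onto the result of the last.
(defn chain [fs]
  (fn [x]
    (reduce (fn [mv f] (mapcat f mv))
            ((first fs) x)
            (rest fs))))

(def f (fn [x] [x (inc x)]))
(def g (fn [x] [(* 10 x)]))

((chain [f g]) 1) ;; => (10 20)
```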

So we have to create such a sequence out of the rest of the `steps` and the call to `map` does this. For reference, here’s the function we’re mapping with.

```clojure
(fn [step]
  (fn [p]
    (tree-cc
      (fn [m c]
        (let [[step-p new-m] (build-tree step m c)]
          [(a/seq p step-p) new-m])))))
```

This function has to take a step and return a function. That returned function has to take a logic arrow value `p` and return a tree monadic value. Hence the call to `tree-cc`.

`tree-cc` takes a function that accepts a state `m` and continuation `c`. And now we finally come to heart of the matter.

We have a logic arrow value `p`, which is all the preceding steps composed sequentially. And we have a continuation with which to build a logic value that is the rest of the tree from this point on. And we have the current `step`. So we build `step-p`, which is the current step plus the rest of the tree, using the current state `m` and the continuation. Then we use the sequential arrow combinator `a/seq` to compose those two arrow values and return that value as the first element of a pair.

### Branches

The next major core.logic combinator is `conde`, for specifying alternatives, or branches, in our search tree:

```clojure
(defn conde [& clauses]
  (let [clauses (map #(apply all %) clauses)]
    (tree-cc
      (fn [m c]
        (let [clauses-p (->> clauses
                             (map #(build-tree % m c))
                             (map first)
                             (apply a/all))]
          [clauses-p nil])))))
```

It’s a little shorter. `conde` accepts a list of clauses where each clause is a list of goals. Like `all` it needs to return a logic arrow value wrapped in the tree monad and so it uses `tree-cc`.

The first thing we need to do is combine each `clause` into a single goal, which we can do immediately with `all` we just defined. Then we can call `tree-cc` with a function that closes over `clauses` which waits until we’re given a state `m` and continuation `c` at tree building time.

When that happens, we build the sub tree corresponding to each branch of the `conde` using the same state and continuation for each. Since each call to `build-tree` returns a pair, we use `first` to extract the logic arrow values.

And finally, we use the arrow combinator `a/all` to compose the clause arrow values into a single logic arrow value, which we return in a pair. That final arrow value will see to it that each branch of the search tree is traversed and all the answers collected for a final answer. Magic!
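As a toy model of what `a/seq` and `a/all` accomplish, treat a goal as a plain function from a substitution to a sequence of substitutions (this glosses over the real logic arrow’s metadata and optimizations):

```clojure
;; Sequential: thread every answer of `p` through `q`.
(defn g-seq [p q]
  (fn [a] (mapcat q (p a))))

;; Parallel: hand a copy of the substitution to every branch
;; and collect all of the answers.
(defn g-all [& ps]
  (fn [a] (mapcat #(% a) ps)))

(def assoc-x (fn [a] [(assoc a :x 1)]))
(def branch  (g-all (fn [a] [(assoc a :b :left)])
                    (fn [a] [(assoc a :b :right)])))

((g-seq assoc-x branch) {}) ;; => ({:x 1, :b :left} {:x 1, :b :right})
```

This mirrors the picture from the start of the post: a substitution falls through the sequential nodes and is copied at each branching node.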

`fresh` is not all that interesting, so I’ll just include it for completeness:

```clojure
(defmacro fresh [[& lvars] & goals]
  `(let [~@(lvar-binds lvars)]
     (clojure.core.logic/all ~@goals)))
```

### The final piece

And the final bit of magic is building these logic functions that return goals using `defno`. Buckle up.

Since this is just a proof of concept, instead of doing the right thing when we hit a recursive call, we’re going to punt and do nothing since all we really care about is the meta data on the arrow. So every recursive logic arrow value is going to be:

```clojure
(defn recursive-p [sym]
  (logic-arr. (constantly logic-zero)
              {:recursive #{sym}}))
```

And finally, `defno` itself:

```clojure
(defn embed-tree [sym expr m c]
  (let [[p new-m] (build-tree expr
                              (conj m sym)
                              identity)]
    (if (and (= (get (meta p) :results 0) 0)
             (= (get (meta p) :recursive) #{sym}))
      [fail-p m]
      (c [p (disj new-m sym)]))))

(defn immediate-recursive [sym]
  (constantly (tree-cc
                (fn [m c]
                  [(recursive-p sym) m]))))

(defmacro defno [sym args expr]
  `(defn ~sym ~args
     (let [~sym (clojure.core.logic/immediate-recursive '~sym)]
       (clojure.core.logic/tree-cc
         (fn [m# c#]
           (if (contains? m# '~sym)
             [(clojure.core.logic/recursive-p '~sym) m#]
             (clojure.core.logic/embed-tree '~sym ~expr m# c#)))))))
```

Starting with `defno`: it has to be a macro because it’s going to be generating definitions in the dictionary. It takes a symbol `sym` for the function to define, the `args` to the function, and finally the goal expression `expr`.

The only real trick here is to define a value for `sym` inside the function definition for `sym`, so that any immediately recursive call to `sym` in `expr` will not recur but will instead return the value from `immediate-recursive`. And that value is just a tree monadic value that terminates the tree building process at that point, returning the `recursive-p` logic arrow value.

With that taken care of, we define a function for `sym` that uses `tree-cc` to return a tree monadic value any time it is called. At tree building time, when given `m#` and `c#`, we check to see if `sym` is in `m#`, indicating that this is a recursive call to `sym`, and if so, we terminate the building.

But if it’s not a recursive call, then we need to build the tree from this point on using `m#` and `c#` which is done in `embed-tree`.

`embed-tree` adds `sym` to the state `m` signifying that this function is now ‘active’. And then it builds the tree for `expr` using the updated state.

If the resulting logic arrow value doesn’t return any results and it only has recursive calls to `sym`, then we know that there is no point searching that part of the tree because it will never produce any results.

But if the tree fragment for `expr` (that is `p`) can produce results, we take the resulting state `new-m`, remove `sym` from it and pass it along with `p` to the continuation `c` to build the rest of the tree.
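Stripped of the arrow and continuation machinery, the active-set bookkeeping in `embed-tree` and `defno` amounts to this (an illustrative sketch, not the real code):

```clojure
;; Refuse to re-enter a function already marked active; otherwise
;; mark it active, build the subtree, and unmark it before
;; continuing.
(defn build-goal [sym build m]
  (if (contains? m sym)
    [:recursive m]
    (let [[p new-m] (build (conj m sym))]
      [p (disj new-m sym)])))

(build-goal 'top (fn [m] [:tree m]) #{})     ;; => [:tree #{}]
(build-goal 'top (fn [m] [:tree m]) #{'top}) ;; => [:recursive #{top}]
```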

And we’re done. Finally.

### Coda

Since this is just a proof of concept, there are so many things that could be done.

You could add tabling to goals that are called recursively, and only those goals. That would eliminate a class of divergence without incurring the penalty of tabling every goal.

If substitutions are combinable, then tree fragments that have no recursion could be reduced down to a list of sub-substitutions at compile time. When the search hits that point, instead of having to walk the tree, you just combine each of the sub-substitutions with the current one to see if you get any results.

Also, any paths from the root to a leaf that contains mutually exclusive substitutions could be eliminated at compile time.

Or you could convert every path from the root to each leaf into a set of unifications. Then rearrange them so that common prefixes of unifications are done just once, improving performance.

And there are probably other optimizations that are possible that I haven’t thought of.

What I most appreciate about monads and the other abstractions from category theory is that they let you isolate the various pieces of essential complexity into independent pieces. Here, the logic monad dealt with backtracking, the logic arrow dealt with composing expressions and static optimizations, while the state and continuation monads each dealt with their piece of building the arrow values. It certainly wasn’t easy figuring all that out, but it was very satisfying to see all the pieces come together.

### Next time …

I’m going to go back to the basics with monoids.

24 October 2013