Meta Machinations

Well, after a little detour I’m back to talking about arrows. We’d just seen how to implement the sequential and parallel logic arrow combinators before I got distracted. So if we have combinators, we need things to combine.

Goals

In core.logic, the basic building blocks are called ‘goals’. Practically speaking, goals are just another name for monadic functions or arrow values. They’re just functions that take a substitution and return a monadic value in the logic monad.

There are three basic goals that everything else is built from. The first is fail:

(def fail (logic-arr. (constantly logic-zero)
                      {:fail true}))

This is a simplified version that we’ll add to later. But there are two things to see. The functional portion of fail is just a function that always returns the zero value for the logic monad. This signifies that no solutions are possible and that the search should end here. We also see the first bit of meta data for our logic arrow. The only way you could tell that the functional portion was failure is to execute it and see what the result was. By adding a :fail key with a true value to the meta data, we can determine whether any arrow value is a fail goal or not with out having to execute it. This will come in handy later.

For completeness, the succeed goal is:

(def succeed (logic-arr. logic-m {}))

But it’s not very useful, so I’ll not spend any time on it.

The final goal is the one that does all the work. It’s created by the unification operator:

(defn == [u v]
  (logic-arr. (fn [a]
                (if-let [ap (unify a u v)]
                  (logic-m ap)
                  logic-zero))
              {}))

The == operator takes two values and produces a unification goal. The goal tries to establish an equality relationship between the two values in a given substitution (the parameter ap). If the relationship is possible, the new substitution is returned wrapped in a logic monad value. If the relationship is not possible, the logic zero value is returned and the search stops for this branch.

Getting interesting

So we’ve added our first piece of meta data to an arrow value. How can we make use of that? Remember that we defined sequential arrow composition as:

  (arrow-seq [p ps]
    (logic-arr. (m/chain (cons p ps))
                {}))

Now, if any of the arrow values passed into arrow-seq is a failure goal, the whole sequence can be replaced with a failure goal. Which leads us to something like this:

(arrow-seq [p ps]
  (let [ps (cons p ps)]
    (if (some #(get (meta %) :fail) ps)
      (logic-arr. (constantly logic-zero) {:fail true})
      (logic-arr. (m/chain ps) {}))))

More complicated, but we’ve eliminated a bunch of unnecessary computation if we see a failure in a sequence of arrow values. And if this sequence is embedded in a larger sequence, that one also gets collapsed and so forth.

And what about parallel branches? arrow-par was defined like this:

  (arrow-par [p ps]
    (logic-arr. (fn [ss]
                  (m/plus (map #(%1 %2) ps ss)))
                {}))

So we’re given a collection of arrow values that we’re going to search down in parallel. But we know that any of them that are failure aren’t going to return a result so we don’t have to go down them. Even better, if all of them are failures, we can just return a failure goal as the result. Like this:

(arrow-par [p ps]
    (let [ps (cons p ps)
          ps (remove #(get (meta %) :fail) ps)]
      (if (empty? ps)
        (logic-arr. (constantly logic-zero)
                    {:fail true})
        (logic-arr. (fn [ss]
                      (m/plus (map #(%1 %2) ps ss)))
                    {}))))

More interesting

Since adding :fail to the meta data let us do some interesting, what else could we add? What if we could somehow count the number of results a particular arrow value could return? For instance, fail and == would look like this:

(def fail (logic-arr. (constantly logic-zero) {:results 0}))

(defn == [u v]
  (logic-arr. (fn [a]
                (if-let [ap (unify a u v)]
                  (logic-m ap)
                  logic-zero))
              {:results 1}))

Even though == might return no results if unify returns nil, we can’t know that at until the function is run, so we indicate it will return one result.

And how do we use this meta data in our combinators. Parallel is easy, just add up the number of results from each of the branches and you’re good. Also, remove any branches that don’t return any results as an optimization.

(arrow-par [p ps]
    (let [ps (remove #(= 0 (get (meta %) :results)) (cons p ps))]
      (if (empty? ps)
        (logic-arr. (constantly logic-zero) {:results 0})
        (logic-arr. (fn [ss]
                      (m/plus (map #(%1 %2) ps ss)))
                    {:results (->> ps
                                   (map #(get (meta %) :results))
                                   (reduce #(+ %1 %2)))}))))

Sequential composition requires a little explanation. Say you have two arrow values. The first alone could possibly return 2 results while the second alone could return 3. The way the composed arrow value would work is, an empty substitution would be passed into the first arrow value as its parameter. That would produce (possibly) two results. Each of those results would then become an input to the second arrow value. And that arrow value could produce up to three results for each input it was called with. So the maximum number of results the composed arrow value could produce would be 2 * 3 = 6. Also, if either of the arrow values was a failure, the composed arrow value would produce 0 results. With that in mind, our sequential combinator looks like this:

(arrow-seq [p ps]
  (let [ps (cons p ps)
        results (apply * (map #(get (meta %) :results) ps))]
    (if (= results 0)
      (logic-arr. (constantly logic-zero)
                  {:results 0})
      (logic-arr. (m/chain ps)
                  {:results results}))))

Even more interesting

Now what would be the most useful piece of meta data we could add to our arrow? That is essentially the question that set me off on this whole tangent in the first place.

The promise of logic programming is that can declare the rules the solution to your problem should satisfy and the computer should search all the possible solutions to find the ones you want. Ideally and conceptually, the order you write those rules shouldn’t matter. But practically it does because of divergence.

Divergence is when the search space (or tree in our case) is infinite and the search goes down these infinite paths when it could have gone down some finite paths first and given you some useful answers. And infinite search trees are created by recursion.

So the most useful piece of meta data we could have about an arrow value would be if it could possibly generate an infinite search tree because it contained a recursive call in it somewhere. But just because an arrow value contains a recursive call, doesn’t mean it’s guaranteed to be infinite. You can’t know that until you run it. The best you can do is be on you guard when you’re dealing with recursive arrow values.

Divergence comes in several flavors depending on type of recursion present. There are techniques to deal with it depending on the particular kind you’re dealing with. One of the simplest is to put any recursive arrow values at the end of a sequence so that the non-recursive values have a chance to fail and prevent the search from ever going into an infinite loop.

I’ll explain the magic of how to find recursion in logic programs later, but assuming we could and marked it with a :recursive meta tag, our sequential combinator would look like this:

(arrow-seq [p ps]
  (let [ps (cons p ps)
        ps (sort-by meta
                    (comparator (fn [p-meta _]
                                  (not (contains? p-meta :recursive))))
                    ps)
        recursive (apply set/union (map #(get (meta %) :recursive) ps))
        results (apply * (map #(get (meta %) :results) ps))]
    (if (= results 0)
      (logic-arr. (constantly logic-zero)
                  {:results 0})
      (logic-arr. (m/chain ps)
                  {:results results
                   :recursive recursive}))))

Notice that the new :recursive value is derived by doing a union of all the component :recursive values. That means that the :recursive meta data is a set, which will enable some magic optimizations later. But for now, the only thing we care about is if the meta data of each arrow value contains :recursive or not and if so, put them at the end of the sequence.

The only change we need to make to our parallel combinator is to compute the combined :recursive value. Which is easily done:

(arrow-par [p ps]
  (let [ps (remove #(= 0 (get (meta %) :results)) (cons p ps))
        recursive (apply set/union (map #(get (meta %) :recursive) ps))]
      (if (empty? ps)
        (logic-arr. (constantly logic-zero) {:results 0})
        (logic-arr. (fn [ss]
                      (m/plus (map #(%1 %2) ps ss)))
                    {:recursive recursive
                     :results (->> ps
                                   (map #(get (meta %) :results))
                                   (reduce #(+ %1 %2)))}))))

Wrapping up

And that’s about all we need as far as our logic arrow goes. But we still are a long ways from having anything close to a workable logic programming system. We don’t have the basic core.logic operators defined in terms of these arrow combinators. The other forms of divergence haven’t been addressed. And we don’t know how compute the :recursive meta data, which was actually the most interesting part of this whole project. It really taxed my capabilities with monads and was a blast to figure out.

But as far as our arrow goes, we’re pretty much done. I do want to make a couple of points about arrows. When you need them, they’re hugely useful and let you do some amazing things at the cost of slightly more cumbersome notation. But an amazing number of problems map well to arrow solutions. For instance, anything where data flows through various stages maps directly to the arrow combinators. Stream processing like Storm, simulating electrical circuits, audio processing pipelines. Also, Flow-Based-Programming like noflojs is arrow based. And arrows can be used to eliminate callback hell in Javascript.

When I worked at Sonian, we started to go through the book Enterprise Integration Patterns. I was quite amused to see this thick book basically describe arrows and realize it could all be boiled down to a few pages.

So arrows have been a hugely beneficial abstraction to learn about and work with and they’ve made some seemingly impossible problems actually solvable to my great satisfaction. Hopefully this one example has generated some interest in you to dig into them.

Next time, I’m going to talk about the magic involved in building a logic programming system using this arrow and suggest some additional optimizations

Jim Duey 20 October 2013
blog comments powered by Disqus