In my work on the Lonocloud team at ViaSat, I recently had the occasion to use generative testing on an internal project. This project involved a combinatorial explosion of features, so it was impossible to get good coverage writing test cases by hand. I had a suite of test cases from an earlier version and the new implementation passed all of them, so I thought I was pretty close to being done. Then I started testing it with core.test.check.

Literally months later and more bugs fixed than I can count, I’m totally sold on generative testing (aka property based testing). However, that’s not the whole story.

In testing this code base, the normal generators supplied by test.check were insufficient. I had to come up with an extension to allow me to do what I needed to do. This write-up explains what I did.

The code snippets are intended to be pasted into a REPL that has my effects library as a dependency. I’ve not polished that library too much yet, so I’ll leave you to your own devices on getting the repo cloned and installed.

For our motivating example, we’ll take a simple tree datatype.

```
(deftype Tree [children])
(defn tree [& children]
(Tree. children))
(defmethod print-method Tree [t ^java.io.Writer w]
(.write w (str "<Tree " (.children t) ">")))
```

A tree is a value that has a number of children which are also trees or plain values (in this case, keywords) at the leaves.

We’ll also define a protocol for a tree function

```
(defprotocol TreeFns
(count-leaves [_ v]))
```

`count-leaves`

walks the tree to count the number of leaves that are equal to `v`

. We’ll implement those in a second.

We’re going to write some test.check generators to test our implementations of `count-leaves`

. But first a small digression. In test.check, generators are monads. That is, they define `return`

and `bind`

functions that satisfy the 3 monad laws. In solving my problem, I found test.check to be missing some functionality, so I extended the `Generator`

record type to implement the `wrap`

, `fapply*`

and `flat-map`

protocol functions from my effects library. These are my names for `return`

, `fapply`

, and `bind`

, respectively. You’ll have to clone this repo since I haven’t put it on Clojars yet.

```
(require '[effects :as e])
(require '[clojure.test :refer :all])
(require '[clojure.test.check :refer :all])
(require '[clojure.test.check.generators :as gen])
(require '[clojure.test.check.properties :as props])
(require '[clojure.test.check.clojure-test :refer :all])
(import '[clojure.test.check.generators Generator])
(extend-type Generator
e/Applicative
(e/wrap [_ v]
(gen/return v))
(e/fapply* [gf gs]
(gen/bind (apply gen/tuple gs)
(fn [args]
(gen/fmap (fn [f] (apply f args)) gf))))
e/Monad
(e/flat-map [gen f]
(gen/bind gen f)))
```

Now, on to generating trees.

```
(defn make-tree [max-height]
(e/for [child-count (gen/choose 0 3)
tree (if (or (< max-height 1) (= child-count 0))
(gen/elements [:a :b :c :d :e])
(let [child-generators (repeat child-count (make-tree (dec max-height)))]
(e/for [children (e/fapply* (gen/return list) child-generators)]
(Tree. children))))]
tree))
```

The `e/for`

macro implements general monad comprehension rather than just list comprehension like Clojure’s standard `for`

macro. So `child-count`

is the number chosen by `(gen/choose 0 3)`

This defines a generator that produces trees by recursively generating a number of sub-trees as the children of a tree. The number of sub-trees can be 0, 1, or 2. If 0 sub-trees are to be generated, or the tree has reached the maximum desired height, make a leaf (holding a keyword value) instead.

And here’s a spec to use that generator to test `count-leaves`

. For illustrative purposes, I chose to print the number of each leaf value in a hash-map rather than actually test anything.

```
(defspec test-count-leaves 20
(props/for-all [tree (e/for [max-height (gen/choose 2 10)
tree (make-tree max-height)]
tree)]
(do
(println (into {} (->> [:a :b :c :d :e]
(map #(vector % (count-leaves tree %)))
(filter (fn [[leaf count]]
(< 0 count))))))
true)))
```

I’ll break this down for those not familiar with Clojure or test.check. `defspec`

is is how test.check specifies a test that tests a function a fixed number of times (in this case, 20) using a property. A property uses a generator(s) to make input values for the function being tested. So the `e/for`

creates the generator and `props/for-all`

uses it to make multiple values for `tree`

that get passed to `count-leaves`

. `props/for-all`

is sort of like monad comprehension, but not quite.

So let’s implement `count-leaves`

and test it.

```
(extend-type clojure.lang.Keyword
TreeFns
(count-leaves [t v]
1))
(extend-type Tree
TreeFns
(count-leaves [t v]
(->> (.children t)
(map #(count-leaves % v))
(reduce +))))
(test-count-leaves)
```

Hopefully, some hash-maps were printed showing the count of the leaf values in the generated trees.

But how do we know if the hash-map being printed actually has the correct counts?

The standard property based testing answer is to implement a simpler version of the function under test and compare the results. Except our implementation is already pretty simple. I suppose we could implement a breadth-first search instead of depth-first like above.

Another option would be to determine the respective count values beforehand. In my original problem, that wasn’t an option. The trees had to be randomly generated.

So, I came up with a way to keep track of the count of each leaf value as the tree was being generated and a hash-map with the final counts was returned along with each generated tree. But it required a little monad magic.

I did this by combining the generator monad above with the State monad. This allows a state value (a hash-map) to updated with the leaf counts as the tree was being generated. Here it is in all it’s monadic glory

```
(deftype StateGen [invoke-fn]
clojure.lang.IFn
(invoke [_ s]
(invoke-fn s))
(applyTo [_ [v]]
(invoke-fn v))
e/Applicative
(e/wrap [_ v]
(StateGen. (fn [s]
(gen/return (list v s)))))
(e/fapply* [wrapped-f args]
(StateGen. (fn [s]
((e/flat-map (StateGen. (fn [s]
(gen/return [(fn [wrapped-f & args]
(apply wrapped-f args))
s])))
(fn [s]
(e/comprehend s (cons wrapped-f args))))
s))))
e/Monad
(e/flat-map [ev f]
(StateGen. (fn [s]
(gen/bind (ev s)
(fn [[v ss]]
((f v) ss)))))))
;; create a stateful generator that always returns 'x'
;; and leaves the state value 's' unchanged
(defn state-gen [x]
(StateGen.
(fn [s]
(gen/return [x s]))))
;; convert a plain generator into a stateful generator
(defn liftGen [g]
(StateGen.
(fn [s]
(gen/fmap #(vector % s) g))))
```

She’s a beaut, isn’t she?

I’ll spare you a deep dive into the details of how it works, but feel free to hit me up if you’re curious. Instead, I’ll just show you it works and how to use it.

```
(println (state-gen 8))
;; some indecipherable text get's printed
```

Well that didn’t help much. But notice that the StateGen type has an implementation for `invoke`

so that it can be called as a function.

```
(println ((state-gen 8) :state-value))
```

Now we’re getting somewhere. (That should have printed the text representation of a generator) So when a StateGen function is called with a state value, it returns a generator. Let’s put that in a spec and generate a value.

```
;; How to create and use the simplest state-gen value
(deftest simplest
(quick-check 1
(props/for-all [[value state-value] ((state-gen 8) :state-value)]
(and (= value 8)
(= state-value :state-value)))))
(simplest)
```

There’s a bit going on in that snippet. `(state-gen 8)`

returns a function that gets called with `:state-value`

. The result of that call is a generator. `props/for-all`

uses that generator to produce a vector that contains a `value`

and a `state-value`

which are equal to `8`

and `:state-value`

respectively. `quick-check 1`

runs the property one time.

`props/for-all`

and `quick-check`

are from test.check.

Now that we’ve seen how to use a stateful generator, let’s take a look at generating trees using them.

```
(defn make-stateful-tree [max-height]
(e/for [child-count (liftGen (gen/choose 0 3))
tree (if (or (< max-height 1) (= child-count 0))
(liftGen (gen/elements [:a :b :c :d :e]))
(let [child-generators (repeat child-count (make-stateful-tree (dec max-height)))]
(e/for [children (e/fapply* (state-gen list) child-generators)]
(Tree. children))))]
tree))
```

This looks almost the same as `make-tree`

. The only apparent difference is that the plain generators from the `gen`

namespace are wrapped using `liftGen`

. This converts them to stateful generators.

And how is this used?

```
(defspec test-stateful-count-leaves 20
(props/for-all [[tree state] ((e/for [max-height (liftGen (gen/choose 2 10))
tree (make-stateful-tree max-height)]
tree) {})]
(do
(println (into {} (->> [:a :b :c :d :e]
(map #(vector % (count-leaves tree %)))
(filter (fn [[leaf count]]
(< 0 count))))))
true)))
(test-stateful-count-leaves)
```

Again, it’s almost exactly what we had before with the addition of calling the stateful tree generator with a state value of `{}`

. And then destructuring the result into the `tree`

and `state`

values. Now let’s turn it up a notch. Take a look at `make-stateful-tree`

. What is not obvious is that the state value is being passed around behind the scenes. The stateful generators produced by `liftGen`

can’t make use of this value. They don’t have anyway to access it and just pass it through unchanged. To use this value, we need a couple of additional functions.

```
;; get a value associated with 'key' from the state value 's'
(defn get-sval [key & [default]]
(StateGen.
(fn [s]
(gen/return [(get s key default) s]))))
;; associate the 'val' with 'key' in the state value
;; returns the value (if there was one) that was associated with 'key'
;; in the state value 's'
(defn set-sval [key val]
(StateGen.
(fn [s]
(gen/return [(get s key) (assoc s key val)]))))
```

These two functions allow you to read and write values to/from a state value if it’s a hash-map. So using them, let’s update `make-stateful-tree`

to keep track of the leaf value counts as the tree is being generated.

```
(defn make-stateful-tree [max-height]
(e/for [child-count (liftGen (gen/choose 0 3))
tree (if (or (< max-height 1) (= child-count 0))
(e/for [leaf (liftGen (gen/elements [:a :b :c :d :e]))
curr-count (get-sval leaf 0)
_ (set-sval leaf (inc curr-count))]
leaf)
(let [child-generators (repeat child-count (make-stateful-tree (dec max-height)))]
(e/for [children (e/fapply* (state-gen list) child-generators)]
(Tree. children))))]
tree))
```

What we added was getting the `curr-count`

for the generated `leaf`

value and then writing the incremented value back to the state. Now, as each leaf is generated for the new tree, the corresponding count in the state value hash-map is incremented. Here’s how to use that state value.

```
(defspec test-stateful-count-leaves 20
(props/for-all [[tree state] ((e/for [max-height (liftGen (gen/choose 2 10))
tree (make-stateful-tree max-height)]
tree) {})]
(do
(println :leaf-counts state)
(= state (into {} (->> [:a :b :c :d :e]
(map #(vector % (count-leaves tree %)))
(filter (fn [[leaf count]]
(< 0 count)))))))))
(test-stateful-count-leaves)
```

Wait! What?! It seems there’s a bug in our implementation of `count-leaves`

. And indeed, if you didn’t catch it earlier, when we extended the Keyword type to implement `count-leaves`

, we ignored the parameter `v`

and just always returned 1. Here’s the correct version.

```
(extend-type clojure.lang.Keyword
TreeFns
(count-leaves [t v]
(if (= t v)
1
0)))
(test-stateful-count-leaves)
```

When we pass an empty hash-map in as the initial state, we get a hash-map that contains the leaf counts back as the value in `state`

. Then, we can compare that to the hash-map generated using `count-leaves`

. And voila’, we’re done.

Now you’re probably asking “What’s the point of all this when I could just use an atom.” (Thanks to Joel Martin for the idea) And you’d be right. This is the hard way of doing this if that was all you wanted. The original problem was much more complex, so this solution allowed a bunch of that complexity to be wrapped in an abstraction and hidden from view. It’s not justified in simpler cases. Another benefit is that there are no issues if the various instances of the test are run in parallel, which is nice.

One problem with this is shrinking the test data. test.check binds the implementation of generators to an internal rose-tree monad. Implementing a custom shrinking method would basically entail re-implementing test.check and I didn’t have that kind of time. I did seriously think about it, though.

There’s another capability we have with the StateGen monad. We can use the information in the state value accumulated from earlier in the generation process to determine how we want to generate the rest of the tree. For example;

```
(defn make-stateful-tree [max-height]
(e/for [child-count (liftGen (gen/choose 0 3))
tree (if (or (< max-height 1) (= child-count 0))
(e/for [max-a (get-sval :max-a 0)
curr-a (get-sval :a 0)
leaf (if (< curr-a max-a)
(liftGen (gen/elements [:a :b :c :d :e]))
(liftGen (gen/elements [:b :c :d :e])))
curr-count (get-sval leaf 0)
_ (set-sval leaf (inc curr-count))]
leaf)
(e/for [children (e/fapply* (state-gen list)
(repeat child-count (make-stateful-tree (dec max-height))))]
(Tree. children)))]
tree))
(defspec test-stateful-count-leaves 20
(props/for-all [[tree state] ((e/for [max-height (liftGen (gen/choose 2 10))
tree (make-stateful-tree max-height)]
tree) {:max-a 4})]
(let [state (dissoc state :max-a)]
(println :leaf-counts state)
(= state (into {} (->> [:a :b :c :d :e]
(map #(vector % (count-leaves tree %)))
(filter (fn [[leaf count]]
(< 0 count)))))))))
(test-stateful-count-leaves)
```

Now in `make-stateful-tree`

, we read the value of `:max-a`

and `:a`

, and as long as the count of `:a`

’s is less than `max-a`

, we proceed as usual. But if not, we drop `:a`

from the possible leaf values to be generated. So all of the tree’s generated should have no more than 4 leaves of value `:a`

.

And that’s how you do it!

Jim Duey 11 September 2015