Functors

If you take a look at the archives, you’ll see that I started this blog explaining monads. Monads have the most buzz in functional programming circles, but there are plenty of other abstractions that are just as useful in their own ways. If I was starting this blog today, this is the first post I would write rather than just jumping straight to monads.

Collections

Every programming language I’m aware of has a set of primitive data types; functions, integers, floats, strings, etc. And you start learning a language by figuring out what you can do with the primitive types. But very quickly, you move to working with collections of these primitive types. Lists, arrays, sets and hash maps are examples of collection types you’ll find.

So given a collection of values, there are a couple of obvious things you can do with them. One is to take two collections of the same type and combine them together. In category theory, a collection type along with a function to combine two collections and a special collection value is called a monoid. That special collection value is such that when it is combined with any other collection value, the result is the other collection value unchanged. I’ll come back to monoids in the future, but I’ll note that they show up in the clojure.core.reducers namespace.

Functors

Another obvious operation on a collection is to transform a collection by applying a function to every item in that collection, producing a new collection of the results. This operation is commonly referred to as ‘mapping’. So a functor is a collection type (list, tree, etc.) and a function (typically called fmap) that knows how to apply a function to every item in the collection. This type of abstraction over collections is already part of Clojure in at least two ways. First is conj which does the right thing when you want to add a value to various collection types. And also map in clojure.core, which does exactly what fmap of a functor does in the case of any collection that is seqable. One issue with Clojure’s map is that it always returns a list, where fmap is defined to always return a collection of the same type as the original collection. I’ve created a minimal functor library for Clojure here. It just defines a protocol with an fmap function and gives a couple of simple example implementations. Because of the way Clojure’s protocols work, I switched the order of the parameters to fmap as compared to Clojure’s map function. Since functors are so general and Clojure already has map and clojure.core.reducers/map, I don’t intend to add to the functors library unless I come across some collection types that aren’t easily handled by the tools Clojure already provides.

To properly be called a functor, the fmap function must also obey certain laws:

(fmap coll identity) <=> coll
(fmap coll (comp g h)) <=> (fmap (fmap coll h) g)

The first expression can be interpreted to say that, for any functor, mapping the identity function over a collection just returns the collection itself.

The second expression can be interpreted to say that mapping a function ‘h’ over a collection and then mapping another function ‘g’ over the resulting collection is equivalent to composing ‘h’ and ‘g’ and then mapping that function over the original collection.

Applicative Functors

Functors are so general, that there’s not much use for a functor library in Clojure. But things get a little more interesting when you start looking at a subset of functors called applicative functors. Applicatives have a few more requirements put on them than functors and as a result are more useful. I’ve created an applicative library here.

Since applicatives are functors, they have a collection type and an fmap function that follows the functor laws. But they also have a couple of other functions that must be defined; pure and what I call map (in Haskell land named <*>). The applicative map can be distinguished from clojure.core.map by requiring the applicative namespace under an alias.

The pure function takes any value and puts it into the collection type. In Clojure, you can view list, vector and hash-set as implementations of pure. Since Clojure doesn’t have a type system like Haskell’s with its type inferencing, it’s not really possible to define a protocol containing a pure function, so I just extended the various collection types to implement a protocol that defines a map function. For new collection types that aren’t extension of Clojure’s collections, the type constructor will serve as the pure function.

The map function of the applicative takes a collection like fmap does. Where it differs is in the other parameter is accepts. Instead of a simple function, applicative’s map takes a collection of functions that are contained in the collection type of the applicative. For example:

(require [net.clojure.applicative :as a])
(a/map [fn1 fn2 fn3] [1 2 3 4])

In this case, the collection type is a vector and the result of the call to a/map would be a vector of the results returned by applying every function fn1, fn2 and fn3, to every value 1 2 3 4. So it would be a vector of 12 results. Notice that the order of args is different than that of fmap in that the list of functions comes first and then the parameters. This is again because of the way Clojure’s protocols work. They dispatch to the correct protocol function according to the type of the first parameter. With fmap, the first parameter couldn’t be the function, because the proper implementation of fmap couldn’t be selected. So the collection had to be first. In the case of a/map, the collection holding the functions to apply can be first and select the proper implementation to execute.

So why not just keep the same order as fmap? Aside from being the same parameter ordering as clojure.core/map, there’s also another reason. fmap only allows a function that accepts a single argument to be mapped over the collection. a/map will allow functions of any number of args to be mapped over a matching number of collections. So, for example:

(a/map (list +) [1 2 3] [1 2 3])
=> (2 3 4 3 4 5 4 5 6)

But you’re not limited to a list with a single function in it:

(a/map (list + vector) [1 2] [1 2])
=> (2 3 3 4 [1 1] [1 2] [2 1] [2 2])

And that leads to an interesting observation. There’s actually another valid way of interpreting two (or more) lists of parameters and that is with pairwise matching like the behavior of clojure.core/map when given multiple lists.

(map vector [1 2 3] [:a :b :c])
=> ([1 :a] [2 :b] [3 :c])

To account for this, I added a deftype to applicative called ZipWith along with a helper function called zipwith which takes either a single value or a sequence. If it’s given a sequence (vector, list or lazy seq), it just wraps it in the ZipWith type. But if you give it a single value, it wraps an infinite, lazy sequence of that value. And then the implementation of map combines all the sequences pairwise and hands that list off to r/map to apply them in parallel. This is one way to address Rich’s comment in this blog post about needing a way to pass multiple lists to r/map.

Here it is in action:

@(a/map (zipwith +) [1 2 3] (range 2000))
=> (1 3 5)

The dereference is necessary to unwrap the ZipWith collection returned from a/map. Also notice that only 3 results were returned because that was the length of the shortest sequence given as a parameter.

To Be Continued

That’s enough for one blog post. Next time, I intend to explore some reasons to use applicatives and their relation to monads.

Jim Duey 19 January 2013
blog comments powered by Disqus