If you take a look at the archives, you’ll see that I started this blog explaining monads. Monads have the most buzz in functional programming circles, but there are plenty of other abstractions that are just as useful in their own ways. If I was starting this blog today, this is the first post I would write rather than just jumping straight to monads.
Every programming language I’m aware of has a set of primitive data types; functions, integers, floats, strings, etc. And you start learning a language by figuring out what you can do with the primitive types. But very quickly, you move to working with collections of these primitive types. Lists, arrays, sets and hash maps are examples of collection types you’ll find.
So given a collection of values, there are a couple of obvious things you can do with them. One is to take two collections of the same type and combine them together. In category theory, a collection type along with a function to combine two collections and a special collection value is called a monoid. That special collection value is such that when it is combined with any other collection value, the result is the other collection value unchanged. I’ll come back to monoids in the future, but I’ll note that they show up in the clojure.core.reducers
namespace.
Another obvious operation on a collection is to transform a collection by applying a function to every item in that collection, producing a new collection of the results. This operation is commonly referred to as ‘mapping’. So a functor is a collection type (list, tree, etc.) and a function (typically called fmap
) that knows how to apply a function to every item in the collection. This type of abstraction over collections is already part of Clojure in at least two ways. First is conj
which does the right thing when you want to add a value to various collection types. And also map
in clojure.core, which does exactly what fmap
of a functor does in the case of any collection that is seqable. One issue with Clojure’s map
is that it always returns a list, where fmap
is defined to always return a collection of the same type as the original collection. I’ve created a minimal functor library for Clojure here. It just defines a protocol with an fmap
function and gives a couple of simple example implementations. Because of the way Clojure’s protocols work, I switched the order of the parameters to fmap
as compared to Clojure’s map
function. Since functors are so general and Clojure already has map
and clojure.core.reducers/map
, I don’t intend to add to the functors library unless I come across some collection types that aren’t easily handled by the tools Clojure already provides.
To properly be called a functor, the fmap
function must also obey certain laws:
(fmap coll identity) <=> coll
(fmap coll (comp g h)) <=> (fmap (fmap coll h) g)
The first expression can be interpreted to say that, for any functor, mapping the identity function over a collection just returns the collection itself.
The second expression can be interpreted to say that mapping a function ‘h’ over a collection and then mapping another function ‘g’ over the resulting collection is equivalent to composing ‘h’ and ‘g’ and then mapping that function over the original collection.
Functors are so general, that there’s not much use for a functor library in Clojure. But things get a little more interesting when you start looking at a subset of functors called applicative functors. Applicatives have a few more requirements put on them than functors and as a result are more useful. I’ve created an applicative library here.
Since applicatives are functors, they have a collection type and an fmap
function that follows the functor laws. But they also have a couple of other functions that must be defined; pure
and what I call map
(in Haskell land named <*>
). The applicative map
can be distinguished from clojure.core.map
by requiring the applicative namespace under an alias.
The pure
function takes any value and puts it into the collection type. In Clojure, you can view list
, vector
and hash-set
as implementations of pure
. Since Clojure doesn’t have a type system like Haskell’s with its type inferencing, it’s not really possible to define a protocol containing a pure
function, so I just extended the various collection types to implement a protocol that defines a map
function. For new collection types that aren’t extension of Clojure’s collections, the type constructor will serve as the pure
function.
The map
function of the applicative takes a collection like fmap
does. Where it differs is in the other parameter is accepts. Instead of a simple function, applicative’s map
takes a collection of functions that are contained in the collection type of the applicative. For example:
(require [net.clojure.applicative :as a])
(a/map [fn1 fn2 fn3] [1 2 3 4])
In this case, the collection type is a vector and the result of the call to a/map would be a vector of the results returned by applying every function fn1, fn2 and fn3, to every value 1 2 3 4. So it would be a vector of 12 results. Notice that the order of args is different than that of fmap
in that the list of functions comes first and then the parameters. This is again because of the way Clojure’s protocols work. They dispatch to the correct protocol function according to the type of the first parameter. With fmap
, the first parameter couldn’t be the function, because the proper implementation of fmap
couldn’t be selected. So the collection had to be first. In the case of a/map
, the collection holding the functions to apply can be first and select the proper implementation to execute.
So why not just keep the same order as fmap
? Aside from being the same parameter ordering as clojure.core/map
, there’s also another reason. fmap
only allows a function that accepts a single argument to be mapped over the collection. a/map
will allow functions of any number of args to be mapped over a matching number of collections. So, for example:
(a/map (list +) [1 2 3] [1 2 3])
=> (2 3 4 3 4 5 4 5 6)
But you’re not limited to a list with a single function in it:
(a/map (list + vector) [1 2] [1 2])
=> (2 3 3 4 [1 1] [1 2] [2 1] [2 2])
And that leads to an interesting observation. There’s actually another valid way of interpreting two (or more) lists of parameters and that is with pairwise matching like the behavior of clojure.core/map
when given multiple lists.
(map vector [1 2 3] [:a :b :c])
=> ([1 :a] [2 :b] [3 :c])
To account for this, I added a deftype to applicative called ZipWith along with a helper function called zipwith
which takes either a single value or a sequence. If it’s given a sequence (vector, list or lazy seq), it just wraps it in the ZipWith type. But if you give it a single value, it wraps an infinite, lazy sequence of that value. And then the implementation of map
combines all the sequences pairwise and hands that list off to r/map
to apply them in parallel. This is one way to address Rich’s comment in this blog post about needing a way to pass multiple lists to r/map
.
Here it is in action:
@(a/map (zipwith +) [1 2 3] (range 2000))
=> (1 3 5)
The dereference is necessary to unwrap the ZipWith collection returned from a/map. Also notice that only 3 results were returned because that was the length of the shortest sequence given as a parameter.
That’s enough for one blog post. Next time, I intend to explore some reasons to use applicatives and their relation to monads.
Jim Duey 19 January 2013