What is a Monad, Really?

Ever since I started learning mathematics I wondered what the actual fuck is a monad. I’ve been programming since high-school, and I started learning Haskell about six months before my first semester at university. Learning Haskell was quite a paradigm shift from the more traditional, imperative languages I was used too, but I eventually able to understand it enough that I could be productive with it.

I learned how to work with typical monads in Haskell, such as [a] and Maybe a, yet I wasn’t able to understand what that all meant in mathematical terms. Being a student of pure mathematics, that bothered me for a while. Well, I’m happy to say I was finally able to understand it. I’ll try to summarize my knowledge in this article, in the hopes that it helps other people understand what in God’s green earth is a Monad.

This is a sentimental journey trough Category Theory. It’s mostly meant for programmers who already know how to work with monads and for mathematicians curious about the applications of Category Theory in the field of Computer Science. This is not a Haskell tutorial and you don’t need to understand any of this to be productive in Haskell!

The Formal Definition

I suppose you could look it up in Wikipedia, but I’ll write it down in here anyways. Monads are a very special type of endofunctors — functors from a category to itself. Let C\mathscr{\text{C}} be a category and let M:CCM : \mathscr{\text{C}} \rightarrow \mathscr{\text{C}} be a functor. Let η:1CM\eta : 1_{\mathscr{\text{C}}} \Rightarrow M and μ:M2M\mu : M^2 \Rightarrow M be natural transformations such that the following diagrams commute:

Commutative diagram
Commutative diagram
Commutative diagram
Commutative diagram

Then the triple (M,η,μ)\left ( M , \eta , \mu \right ) — often denoted by simply MM — is said to be a monad. Ok, but what does this all mean?

What is a Category, Really?

Category theory can be though-of as a generalization of Set Theory: it’s a uniform language for describing sets with a given structure — those are called objects — and functions that preserves structure between such sets — those are called morphisms or homomorphisms. A category is nothing more than that: a collection of sets with a given structure together with a collection of transformations between such sets.

This is a lie! The objects of a category need not to be sets, and the morphisms of a category need not to be functions. However, they are typically thought-of as so.

For example, the category k-Vectk \text{-} \text{Vect} of vector spaces over kk is the collection of all vector-spaces over kk, together with the collection of all linear transformations between such vector-spaces.

Note that the identity function idV:VVi d_V : V \rightarrow V is a linear transformation. Furthermore, if T:V1V2T : V_1 \rightarrow V_2 and S:V2V3S : V_2 \rightarrow V_3 are linear transformations, then the composition ST:V1V3S \circ T : V_1 \rightarrow V_3 is a linear transformation. This is a key concept in Category Theory! The identity function and the composition of two morphisms must be morphisms for a suitable collection to be called a category!

In Haskell there’s only one category we care about: the category of Haskell types, whose objects are Haskell types and whose morphisms are Haskell functions.

What is a Functor, Really?

A functor is a map between categories: it takes objects and morphisms as input and returns objects and morphism from another category. In other words, a functor F:CDF : \mathscr{C} \rightarrow \mathscr{D} is a map that associates an object F(X):DF ( X ) : \mathscr{D} for each object X:CX : \mathscr{C} and a marphism F(f):F(X)F(Y)F ( f ) : F ( X ) \rightarrow F ( Y ) for each morphism f:XYf : X \rightarrow Y in C\mathscr{C}. For FF to be a functor it must also preserve identity — i.e. F(1X)=1F(X)F ( 1_X ) = 1_{F ( X )} — and composition:

Commutative diagram in the domain of F
Commutative diagram in the domain of F
Corresponding commutative diagram in the codomain of F
Corresponding commutative diagram in the codomain of F

The idea behind this is that FF not only maps objects to objects — like a regular function or a morphism — it also gives us a consistent way to turn every morphism in C\mathscr{C} into a morphism in D\mathscr{D}.

This mechanism for converting morphisms in C\mathscr{C} to morphisms in D\mathscr{D} is sometimes referred-to as penetration, such as in Haskell’s standard library:

class Functor f where
  -- `fmap g fa` "penetrates" `g` in `fa`.
  -- A implementation of `fmap` is expected to
  -- respect the composition of functions, i.e.
  -- `fmap (h . g) == (fmap h) . (fmap g)`.
  fmap :: (a -> b) -> f a -> fb

Note that an instance of Functor is an endofunctor of the category of Haskell types.

Example

Consider the map P:SetSet\mathscr{P} : \mathbf{\text{Set}} \rightarrow \mathbf{\text{Set}} where P(X)\mathscr{P} ( X ) is the power set of XX and P(f)\mathscr{P} ( f ) — where f:XYf : X \rightarrow Y — is the function that takes a subset of XX to it’s image under ff.

Here Set\mathbf{\text{Set}} denotes the category of sets, where objects are sets and morphisms are regular functions. We not only know how to map a set to it’s power set, we also know to how turn every function between sets into a function between their corresponding power sets. This is the power set functor.

What is a Natural Transformation, Really?

Natural transformations are maps between functors: it takes objects as inputs and returns morphisms between the images of those objects under different functors. If F,G:CDF , G : \mathscr{C} \rightarrow \mathscr{D} are functors a natural transformation η:FG\eta : F \Rightarrow G from FF and GG takes X:CX : \mathscr{C} to a morphism ηX:F(X)G(X)\eta_X : F ( X ) \rightarrow G ( X ). For η\eta to be considered a natural transformation the following diagram must commute for each X,Y:CX , Y : \mathscr{C} and f:XYf : X \rightarrow Y:

The commutative diagram of a natural transformation
The commutative diagram of a natural transformation

The idea in here is that you have natural way to convert the action of FF to the action of GG. In other words, η:FG\eta : F \Rightarrow G is a morphism between FF and GG in the category of functors between C\mathscr{C} and D\mathscr{D} — where the objects are functors between C\mathscr{C} and D\mathscr{D}.

Example

Consider the maps η:1SetP\eta : 1_{\mathbf{\text{Set}}} \Rightarrow \mathscr{P} and μ:P2P\mu : {\mathscr{P}}^2 \Rightarrow \mathscr{P} given by ηX(x)={x}\eta_X ( x ) = \{ x \} and μX(S)=sSs\mu_X ( S ) = \cup_{s \in S} s, respectively. Here 1Set1_{\mathbf{\text{Set}}} denotes the identity functor on the category of sets, i.e. 1Set:SetSet1_{\mathbf{\text{Set}}} : \mathbf{\text{Set}} \rightarrow \mathbf{\text{Set}} is such that 1Set(X)=X1_{\mathbf{\text{Set}}} ( X ) = X and 1Set(f)=f1_{\mathbf{\text{Set}}} ( f ) = f. On the other hand, P2{\mathscr{P}}^2 denotes the functor that takes a set to the power set of it’s power set. This are both natural transformation.

We not only know how to take a set XX to it’s power set P\mathscr{P} and a function f:XYf : X \rightarrow Y to the corresponding P(f):P(X)P(Y)\mathscr{P} ( f ) : \mathscr{P} ( X ) \rightarrow \mathscr{P} ( Y ), we also know how to convert each element of XX into an element of P(X)\mathscr{P} ( X ) — using the map ηX\eta_X given by x{x}x \mapsto \{ x \} — and how to flatten an element of P(P(X))\mathscr{P} \left ( \mathscr{P} ( X ) \right ) into an element of P(X)\mathscr{P} ( X ) — using the map μX\mu_X given by SsSsS \mapsto \cup_{s \in S} s. Can you see where I’m going?

What is a Monad, Really?

A monad is simply an endofunctor MM — a functor from a category to itself — equipped with two natural transformations η:1CM\eta : 1_{\mathscr{C}} \Rightarrow M and μ:M2M\mu : M^2 \Rightarrow M such that the following diagrams commute:

Commutative diagram
Commutative diagram
Commutative diagram
Commutative diagram

The idea is that we have natural and consistent ways of converting elements of an object X:CX : \mathscr{C} into elements of M(X)M ( X ) and elements of M(M(X))M \left ( M ( X ) \right ) into elements of M(X)M ( X ) — this is sometimes called flattening.

For the haskellers out there, η\eta is return and μ\mu is (>>= id). For the mathematicians out there, return is η\eta and (>>= f) is μYM(f)\mu_Y \circ M ( f ).

Examples

The power set functor P\mathscr{P} equipped with the natural transformations η:1SetP\eta : 1_{\mathbf{\text{Set}}} \Rightarrow \mathscr{P} and μ:P2P\mu : {\mathscr{P}}^2 \Rightarrow \mathscr{P} given by ηX(x)={x}\eta_X ( x ) = \{ x \} e μX(S)=sSs\mu_X ( S ) = \cup_{s \in S} s is a monad! In fact, all of the following diagrams commute:

Commutative diagram of the power set monad
Commutative diagram of the power set monad
Commutative diagram of the power set monad
Commutative diagram of the power set monad

Conclusion

I hope this helps someone to understand monads. Happy haskelling!