# What is a Monad, Really? (Oct 21st, 2020)

Ever since I started learning mathematics I wondered what the actual fuck is a monad. I’ve been programming since high-school, and I started learning Haskell about six months before my first semester at university. Learning Haskell was quite a paradigm shift from the more traditional, imperative languages I was used too, but I eventually able to understand it enough that I could be productive with it.

I learned how to work with typical monads in Haskell, such as [a] and Maybe a, yet I wasn’t able to understand what that all meant in mathematical terms. Being a student of pure mathematics, that bothered me for a while. Well, I’m happy to say I was finally able to understand it. I’ll try to summarize my knowledge in this article, in the hopes that it helps other people understand what in God’s green earth is a Monad.

This is a sentimental journey trough Category Theory. It’s mostly meant for programmers who already know how to work with monads and for mathematicians curious about the applications of Category Theory in the field of Computer Science. This is not a Haskell tutorial and you don’t need to understand any of this to be productive in Haskell!

## The Formal Definition

I suppose you could look it up in Wikipedia, but I’ll write it down in here anyways. Monads are a very special type of endofunctors — functors from a category to itself. Let $\mathscr{\text{C}}$ be a category and let $M : \mathscr{\text{C}} \rightarrow \mathscr{\text{C}}$ be a functor. Let $\eta : 1_{\mathscr{\text{C}}} \Rightarrow M$ and $\mu : M^2 \Rightarrow M$ be natural transformations such that the following diagrams commute:

Then the triple $\left ( M , \eta , \mu \right )$ — often denoted by simply $M$ — is said to be a monad. Ok, but what does this all mean?

## What is a Category, Really?

Category theory can be though-of as a generalization of Set Theory: it’s a uniform language for describing sets with a given structure — those are called objects — and functions that preserves structure between such sets — those are called morphisms or homomorphisms. A category is nothing more than that: a collection of sets with a given structure together with a collection of transformations between such sets.

This is a lie! The objects of a category need not to be sets, and the morphisms of a category need not to be functions. However, they are typically thought-of as so.

For example, the category $k \text{-} \text{Vect}$ of vector spaces over $k$ is the collection of all vector-spaces over $k$, together with the collection of all linear transformations between such vector-spaces.

Note that the identity function $i d_V : V \rightarrow V$ is a linear transformation. Furthermore, if $T : V_1 \rightarrow V_2$ and $S : V_2 \rightarrow V_3$ are linear transformations, then the composition $S \circ T : V_1 \rightarrow V_3$ is a linear transformation. This is a key concept in Category Theory! The identity function and the composition of two morphisms must be morphisms for a suitable collection to be called a category!

## What is a Functor, Really?

A functor is a map between categories: it takes objects and morphisms as input and returns objects and morphism from another category. In other words, a functor $F : \mathscr{C} \rightarrow \mathscr{D}$ is a map that associates an object $F ( X ) : \mathscr{D}$ for each object $X : \mathscr{C}$ and a marphism $F ( f ) : F ( X ) \rightarrow F ( Y )$ for each morphism $f : X \rightarrow Y$ in $\mathscr{C}$. For $F$ to be a functor it must also preserve identity — i.e. $F ( 1_X ) = 1_{F ( X )}$ — and composition:

The idea behind this is that $F$ not only maps objects to objects — like a regular function or a morphism — it also gives us a consistent way to turn every morphism in $\mathscr{C}$ into a morphism in $\mathscr{D}$.

This mechanism for converting morphisms in $\mathscr{C}$ to morphisms in $\mathscr{D}$ is sometimes referred-to as penetration, such as in Haskell’s standard library:

class Functor f where
-- fmap g fa "penetrates" g in fa.
-- A implementation of fmap is expected to
-- respect the composition of functions, i.e.
-- fmap (h . g) == (fmap h) . (fmap g).
fmap :: (a -> b) -> f a -> fb

Note that an instance of Functor is an endofunctor of the category of Haskell types.

### Example

Consider the map $\mathscr{P} : \mathbf{\text{Set}} \rightarrow \mathbf{\text{Set}}$ where $\mathscr{P} ( X )$ is the power set of $X$ and $\mathscr{P} ( f )$ — where $f : X \rightarrow Y$ — is the function that takes a subset of $X$ to it’s image under $f$.

Here $\mathbf{\text{Set}}$ denotes the category of sets, where objects are sets and morphisms are regular functions. We not only know how to map a set to it’s power set, we also know to how turn every function between sets into a function between their corresponding power sets. This is the power set functor.

## What is a Natural Transformation, Really?

Natural transformations are maps between functors: it takes objects as inputs and returns morphisms between the images of those objects under different functors. If $F , G : \mathscr{C} \rightarrow \mathscr{D}$ are functors a natural transformation $\eta : F \Rightarrow G$ from $F$ and $G$ takes $X : \mathscr{C}$ to a morphism $\eta_X : F ( X ) \rightarrow G ( X )$. For $\eta$ to be considered a natural transformation the following diagram must commute for each $X , Y : \mathscr{C}$ and $f : X \rightarrow Y$:

The idea in here is that you have natural way to convert the action of $F$ to the action of $G$. In other words, $\eta : F \Rightarrow G$ is a morphism between $F$ and $G$ in the category of functors between $\mathscr{C}$ and $\mathscr{D}$ — where the objects are functors between $\mathscr{C}$ and $\mathscr{D}$.

### Example

Consider the maps $\eta : 1_{\mathbf{\text{Set}}} \Rightarrow \mathscr{P}$ and $\mu : {\mathscr{P}}^2 \Rightarrow \mathscr{P}$ given by $\eta_X ( x ) = \{ x \}$ and $\mu_X ( S ) = \cup_{s \in S} s$, respectively. Here $1_{\mathbf{\text{Set}}}$ denotes the identity functor on the category of sets, i.e. $1_{\mathbf{\text{Set}}} : \mathbf{\text{Set}} \rightarrow \mathbf{\text{Set}}$ is such that $1_{\mathbf{\text{Set}}} ( X ) = X$ and $1_{\mathbf{\text{Set}}} ( f ) = f$. On the other hand, ${\mathscr{P}}^2$ denotes the functor that takes a set to the power set of it’s power set. This are both natural transformation.

We not only know how to take a set $X$ to it’s power set $\mathscr{P}$ and a function $f : X \rightarrow Y$ to the corresponding $\mathscr{P} ( f ) : \mathscr{P} ( X ) \rightarrow \mathscr{P} ( Y )$, we also know how to convert each element of $X$ into an element of $\mathscr{P} ( X )$ — using the map $\eta_X$ given by $x \mapsto \{ x \}$ — and how to flatten an element of $\mathscr{P} \left ( \mathscr{P} ( X ) \right )$ into an element of $\mathscr{P} ( X )$ — using the map $\mu_X$ given by $S \mapsto \cup_{s \in S} s$. Can you see where I’m going?

## What is a Monad, Really?

A monad is simply an endofunctor $M$ — a functor from a category to itself — equipped with two natural transformations $\eta : 1_{\mathscr{C}} \Rightarrow M$ and $\mu : M^2 \Rightarrow M$ such that the following diagrams commute:

The idea is that we have natural and consistent ways of converting elements of an object $X : \mathscr{C}$ into elements of $M ( X )$ and elements of $M \left ( M ( X ) \right )$ into elements of $M ( X )$ — this is sometimes called flattening.

For the haskellers out there, $\eta$ is return and $\mu$ is (>>= id). For the mathematicians out there, return is $\eta$ and (>>= f) is $\mu_Y \circ M ( f )$.

### Examples

The power set functor $\mathscr{P}$ equipped with the natural transformations $\eta : 1_{\mathbf{\text{Set}}} \Rightarrow \mathscr{P}$ and $\mu : {\mathscr{P}}^2 \Rightarrow \mathscr{P}$ given by $\eta_X ( x ) = \{ x \}$ e $\mu_X ( S ) = \cup_{s \in S} s$ is a monad! In fact, all of the following diagrams commute: