The Yoneda lemma is a not at all obvious, but totally fundamental, result in category theory that makes frequent appearances in programming, too. Luckily, it’s not actually hard to prove and makes intuitive sense once you see how it works. In programming terms, it constructs an isomorphism between a datatype and a related datatype of functions which gives us ways to optimize our programs by doing a kind of map and flatmap fusion: we turn iterated maps or flatmaps into function composition, compose, then evaluate the function we’ve built up once at the end.
In this post, we’ll prove(-ish) the Yoneda lemma, do a quick example of map fusion, then look at currying of functions and functors in terms of Yoneda. The punchline will be a derivation of the right Kan extension of a functor using Yoneda and currying. The Codensity monad is a special case of this.
Fortunately, this all requires only basic category theory, nothing fancy. We won’t work out every detail, but we will go through all the steps and point out what is left to show at each stage.
I’ll assume familiarity with categories, functors, and natural transformations,
but start with a dictionary between math and
programming and will give the programming analogues of the math throughout.
Code will be in aspirational Scala:
Scala with the
forall keyword borrowed from Haskell.
Math <> Programming
Here is everything we’ll need. Feel free to skip this on a first pass and refer to it as needed.
- we’ll write composition as juxtaposition gf or with a dot g ⋅ f
if things are getting cluttered. In programming, well write it as
g fwith a space
- we’ll write Set for the category of sets: the objects are sets, the arrows are functions, composition is composition of functions. In programming terms, these are types, functions between types, and composition of functions
- we’ll write C for a general category, A, B for objects of C, f : A → B for an arrow from A to B and C(A,B) for the set of arrows from A to B. The typography is a little subtle here: C is not italicized whereas A and B are. It will be clear in context when we mean a whole category and when we mean a particular object of that category
- an arrow f : A → B corresponds to a function
f: A => B
- the set C(A,B) corresponds to the set or type
A => B
- we’ll write F : C → D for a functor from a category C to a category D, F(A) for F applied to A and F(f) : F(A) → F(B) for F applied to f : A → B
- functors F : Set → Set correspond to functors
F[_]in programming; F(A) to
F[A]and F(f) to
F.map(f): F[A] => F[B]. As we know,
F.map(f).map(g) == F.map(gf)
- given a category C and an object A, we’ll write C(A,−) for the hom functor C → Set which sends an object B to the set of arrows C(A,B) and an arrow f : B → C to the function f* : C(A,B) → C(A,C) given by composing with f
- C(A,−) corresponds to
forall B . A => B, which is the Reader functor
Reader[A, B]for a fixed type
Aand variable type
- f* : C(A,B) → C(A,C) corresponds to
f: B => Cwhich is indeed defined to be composing with
- given two functors F, G : C → D, we’ll write η : F ⇒ G for a natural transformation from F to G and Hom(F,G) for the set of natural transformations from F to G. A natural transformation has one arrow for every object A and we’ll write ηA : F(A) → G(A) for the A component of η
- a natural transformation η : F ⇒ G corresponds to
a polymorphic function
forall A . η[A]: F[A] => G[A]which satisfies
η[B] F.map(f) == G.map(f) η[A], or equivalently, makes a square commute. For example,
headOptionis a natural transformation from
- the set Hom(F,G) corresponds to the type
forall A . F[A] => G[A]of polymorphic functions which similarly make a square commute
- we’ll write DC for the category whose objects are functors C → D and arrows are natural transformations. In this case DC(F,G) is Hom(F,G) and we’ll write either.
Navigating between corresponding ideas in math and programming is a fraught endeavor: definitions and assumptions may subtley differ and everyone has strong feelings about notation. I hope that these choices make this material comprehensible to the largest possible audience. The reason not to just write everything in code is that the math concepts are often more general and math has explicit and precise definitions where programming has laws that can go unmentioned or unchecked. As we’ll see, we’ll need this precision ahead.
The Yoneda Lemma
The setup is:
we have a category C, an object A, and we make the hom functor C(A,−).
Given another set-valued functor G : C → Set,
let’s write YG(A) = Hom(C(A,−),G)
for the set of natural transformations C(A,−) ⇒ G.
In programming notation, we can write YG(A) as
forall B . (A => B) => G[B].
It may not be a huge surprise that this is exactly
The Yoneda lemma first constructs a bijection of sets ΦA : YG(A) → G(A). It then says that, letting A vary, Φ : YG ⇒ G defines a natural isomorphism of functors. We’ll unravel these statements as we go.
First, we construct ΦA and its inverse ΨA. The function ΦA : YG(A) → G(A) is defined by ΦA(η) = ηA(1A). Let’s unpack this: a natural transformation η : C(A,−) ⇒ G has one component ηB : C(A,B) → G(B) for every object B; choosing B to be A, there is a canonical element inside C(A,A), the identity 1A; applying ηA to that, we get an element of G(A).
Next, ΦA is a bijection of sets. This sounds bonkers: a natural transformation is a function ηB : C(A,B) → G(B) for every B. This looks like much more information than the value of one of those functions on a single input. It turns out that being a natural transformation is hard: the commutative diagram that a natural transformation has to satisfy means that the components ηB for different B have to fit together in the right way. The Yoneda lemma says that, in fact, for η to be a natural tranformation, these requirements are so restrictive that all of the ηB are entirely determined by the single value ηA(1A) in G(A).
Let’s construct the inverse ΨA of ΦA. Since A is fixed for this part and for x ∈ G(A), ΨA(x) is a natural tranformation which itself has a component for every B, we’ll drop the A subscript for now. So, given x = Φ(η) ∈ G(A), for Ψ(x) to be inverse to Φ, Ψ(x)A(1A) must be x. This tells the value of one component of Ψ(x) on a single input. We can define the entire rest of Ψ(x) in one fell swoop.
The trick is that we can write g : A → B as g*(1A) for any g because g*(1A) = g ⋅ 1A = g. Now, for Ψ(x) to be a natural transformation C(A,−) ⇒ G, it must satisfy
Ψ(x)B ⋅ g* = G(g) ⋅ Ψ(x)A
(Drawing the diagram for yourself really helps here.) Evaluating both sides on 1A, we get
Ψ(x)B(g*1A) = Ψ(x)B(g)
on the left and
G(g)(Ψ(x)A(1A)) = G(g)(x)
on the right, since Ψ(x)A(1A) = x.
So,we have no choice but to define Ψ(x)B(g) = G(g)(x).
In programming: for
Ψ(x) is the polymorphic function
g => x.map(g) for
g: A => B.
The remaining work is to show that Ψ(x) is a natural transformation, that is, for any g : B → C, that
Ψ(x)C ⋅ g* = G(g) ⋅ Ψ(x)B
not only for arrows g whose domain is A and not only when we evaluate on a specific input. Explicitly, we have to show that given any f : A → B, we have Ψ(x)C(g*(f)) = G(g)(Ψ(x)B(f)).
As claimed, once we know that Ψ(x)A sends 1A to x, everything else is entirely determined by the fact that it has to be a natural transformation.
Brining back the subscripts, we defined ΨA, but still have to show that ΨA and ΦA really are inverses. This is what tells us that both are bijections.
So far, we have a functor G and, for each A, a set YG(A) and a bijection ΦA : YG(A) → G(A). Next, we’ll see that YG is a functor C → Set. This means that the sets YG(A) for varying A are not unrelated, we also get, for every arrow f : A → B, a function YG(A) → YG(B) in a way that preserves composition and identity arrows.
Once we understand YG as a functor, we’ll get to the second part of the Yoneda lemma: that ΦA is natural in A. This means that the ΦA are the components of a natural transformation Φ : YG ⇒ G. Since we’ve already shown that every ΦA is an isomorphism, this says that Φ is a natural isomorphism. So it is not just an isomorphism of sets for each A, it is an isomorphism of functors.
First, that YG is a functor: given an arrow g : A → B,
we get a function YG(g) : YG(A) → YG(B) by precomposing with g.
More explicitly: given a natural transformation η : C(A,−) ⇒ G,
we get a natural transformation C(B,−) ⇒ G whose C component
C(B,C) → G(C) sends f ∈ C(B,C) to ηC(fg).
That is, we take f, precompose with g to get an element of C(A,C),
and apply ηC to that.
In programming terms, we just defined
The work here is first to show that YG(g)(η) really does define a natural transformation and then that, with this definition of YG on arrows, YG really is a functor: sends identities to identities and composition to composition.
Finally: Φ is a natural transformation. Now that we know what YG(g) does on an arrow g : A → B, the work here is to show that G(g) ⋅ ΦA = ΦB ⋅ YG(g), or in diagrams, that the square with these four functions as sides commutes. The same thing in terms of Ψ is ΨB ⋅ G(g) = YG(g) ⋅ ΨA which, in code, is
Ψ[B] G.map(g) == Yoneda[G].map(g) Ψ[A]
So we can
_.map(g) and apply
Ψ in either order.
So, having hand-waved ourselves to the end: ΦA is a bijection for every A and Φ : YG ⇒ G is a natural isomorphism. That is!
Let’s see how to use Yoneda to do map fusion.
G to be
Yoneda datatype is
Yoneda[List, A] = forall B. (A => B) => List[B]
Ψ[A]: List[A] => Yoneda[List, A]
sends a list
ls: List[A] to the function
f => ls.map(f) for any
f: A => B.
Going the other way,
Φ[A]: Yoneda[List, A] => List[A]
evaluates a polymorphic function on
It should again be immediately clear that
Φ[A]Ψ[A]: List[A] => List[A]
is the identity function and not at all clear that the composition
the other way is the identity, but that’s we just showed.
Now, given a list
f: A => B, g: B => C, h: C => D,
we know that
Ψ[D](ls.map(f).map(g).map(h)) == Ψ[A](ls).map(f).map(g).map(h)
because the lemma tells us that we can map functions and apply
in either order.
Φ is inverse to
Ψ, we have
ls.map(f).map(g).map(h) == Φ[D] Ψ[A](ls).map(f).map(g).map(h)
So, to compute
we can apply
Ψ, map on the
Yoneda side, then apply
Φ to get back.
Ψ, we get
(Ψ[A](ls)).map(f).map(g).map(h) = (k => ls.map(khgf))
because mapping on the
Yoneda side is precomposing.
Φ to this, we get
Φ is evaluation on the identity function.
Lo and behold, our maps are fused.
Currying is the isomorphism
Set(C×A,B) ≅ Set(C,BA)
between sets of functions defined by curry(f)(c) = f(c,−)
where BA is the set of functions A → B.
In programming notation,
curry: ((C,A) => B) => (C => A => B)
So far, we haven’t done any category theory yet. We can verify that this is a bijection by writing down a function in the other direction and checking that they are inverses to one another.
The Yoneda view of this is that Set(−,BA) is a hom-functor Set → Set, Set(−×A,B) is another Set-valued functor, and uncurrying defines an element
uncurry ∈ Hom(Set(−,BA),Set(−×A,B))
which is a natural isomorphism. We are using the contravariant version of Yoneda here: both functors turn functions A → B into functions F(B) → F(A). Everything else works the same.
As before, there is work left to do: we have to verify that both of these really are functors and the currying really is a natural transformation.
The Yoneda bijection sends the uncurrying natural isomorphism to an element uncurry(1BA) ∈ Set(BA×A,B): the evaluation map eval(f,a) = f(a). Moreover, the evaluation map satisfies a universal property: every function f : C × A → B can be written as the composition of curry(f) × 1A : C × A → BA × A with eval : BA × A → B. This is exercise 2.3.iii in CTIC.
We can also curry in the category of functors Set → Set. Given two functors F, G : Set → Set, functor composition G ⋅ F plays the role of the cartesian product A × B of sets. There is a functor RanGF which corresponds to the arrows in SetSet from F to G so plays the role of BA, but it’s not immediately obvious what it is: the natural transformations F ⇒ G are not themselves a functor Set → Set in any obvious way. Nonetheless, we can use the Yoneda lemma to get a handle on it.
Accepting that the functor RanGF : Set → Set exists, currying is a natural isomorphism Hom(−⋅F,G) ≅ Hom(−,RanFG). Here again we have a hom-functor Hom(−,RanFG) from SetSet → Set which sends a functor H to the set of natural transformations H ⇒ RanFG and another Set-valued functor Hom(−⋅F,G) which sends a functor H to the set of natural transformations HF ⇒ G.
With a judicious choice of H, we can use the Yoneda lemma to get our hands on RanGF: let’s choose H : Set → Set to be a hom-functor Set(A,−). Now, currying gives us an isomorphism
Hom(Set(A,F(−)),G) ≅ Hom(Set(A,−),RanGF)
where Set(A,F(−)) is exactly the functor Set(A,−) composed with F.
In programming this is
forall B. A => F[B] which is
ReaderT or Kleisli in Cats.
Now we are in good shape: we have natural bijections
RanGF(A) ≅ Hom(Set(A,−),RanGF) ≅ Hom(Set(A,F(−)),G)
where the first is the Yoneda bijection and the second is uncurrying.
So this gives us a description of RanGF: whatever it is,
by Yoneda, it is naturally isomorphic to the functor Set → Set
which sends a set A to the set of natural transformations Set(A,F(−)) ⇒ G.
In programming notation:
forall B. (A => F[B]) => G[B]
which is exactly
Ran in Haskell.
There are two special cases of this construction:
when F is the identity functor and when F is G.
In the first case, we get back our
Yoneda datatype from before;
in the second case, we get the Codensity monad
forall B. (A => F[B]) => F[B].
F is already a monad, then the Codensity monad is a monad transformer,
but it is defined and is a monad even if
F is only a functor.
Similar to our
List example from before,
F by its Codensity monad lets us replace iterated
flatMaps with function composition so does a kind of
We won’t work out an example here, but see the links below.
That’s it! The Yoneda lemma, currying, and fusion. Here is a list of comments and things we left out.
A few more pieces of Yoneda that we didn’t mention:
- Φ is also natural in G
- the Yoneda embedding
- some other nice corollaries, like, that A ≅ B if and only if C(A,−) ≅ C(B,−)
If you made it this far, you’ll love this talk by Emily Riehl on categorifying cardinal arithmetic (pdf, slides).
We can talk about currying in these cases because SetSet has a bunch of extra structure and Set has even more.
RanGF is a general construction called the right Kan extension of F along G that, given F : C → E and G : C → D produces a functor RanGF : D → E which extends F to D along G. There is a left Kan extension as well. Exercise 6.1.iii in CTIC has you work out examples of both by hand. A good followup is to work out what the limit and colimit formulas from the next section give for those examples.
I gave a talk, Difference Lists and the Codensity Monad, a while ago at the Bay Area Haskell Meetup: video and slides and code. In it, I go through an example of replacing a monad by its Codensity monad to turn an O(n2) program into an O(n) program. There is a complete list of references in those slides, but the highlights are below.
These ideas appear elsewhere, too: whatever profunctor optics are, they use the Yoneda lemma to efficiently traverse nested datastructures. That hopefully sounds not totally unreasonable now.
Some fair questions at this point are: What actually is a Kan extension? And what about Coyoneda? These are great questions!….for a another time!
- Category Theory in Context (pdf) by Riehl
- Notions of Computation as Monoids by Rivas & Jaskelioff
- Kan Extensions for Program Optimisation Or: Art and Dan Explain an Old Trick (pdf) by Hinze