What does the curvature of a surface have to do with tensor products?

The question I would like to address in this article is: what is a tensor? This question has two answers. If you ask an algebraist, he (she) will tell you it is an element of the tensor product of two modules. If you ask a geometer, she (he) will ramble about “global constructions that only depend on point-wise values” for hours on end. We should note that we will primarily focus on what a tensor is from the perspective of a geometer, the intuition behind it and how we get from that to the usual formalism.

Frankly, I feel like there isn’t much to explain, yet I never had this explained to me and I always felt it was difficult to reconcile my intuition with the formalism most commonly adopted. This is the primary reason I wrote this article: I would have loved to read it in the past.

In differential geometry and related fields, information can often be obtained by passing from the non-linear to the linear via infinitesimal approximations. Oftentimes this comes in the form of $C^\infty$-linear functions between the spaces of smooth sections of two fiber bundles. Specifically, if $M$ is a smooth manifold and $E \rightarrow M$, $F \rightarrow M$ are vector bundles over $M$, then the sets $\Gamma(E)$ and $\Gamma(F)$ of smooth (global) sections of $E$ and $F$, respectively, have a natural structure of $C^\infty(M)$-modules, and sometimes linear maps $\Gamma(E) \rightarrow \Gamma(F)$ show up. If such a map $\tau : \Gamma(E) \rightarrow \Gamma(F)$ satisfies the condition that $\tau(\xi)_p$ depends only on $\xi_p$, and not on $\xi$ in its entirety, then $\tau$ is called a tensor.
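For contrast, it may help to see a map between sections that fails the point-wise condition. A minimal sketch, with concrete choices that are mine and not part of the discussion above: take $E = M \times \mathbb{R}$, so that $\Gamma(E) \cong C^\infty(M)$, take $F = T^* M$, and let the map be the exterior derivative $d$. It is $\mathbb{R}$-linear, but the Leibniz rule shows it is not $C^\infty(M)$-linear:

```latex
% Assumptions: f and h are smooth functions on M, and p is a point of M.
\begin{align*}
  d(f h)_p &= f(p) \, dh_p + h(p) \, df_p
    && \text{(extra term } h(p) \, df_p \text{, so } d(fh) \neq f \, dh \text{ in general)} \\
  h(p) = 0 &\;\not\Rightarrow\; dh_p = 0
    && \text{(the value } h(p) \text{ does not determine } dh_p \text{)}
\end{align*}
```

So $dh_p$ depends on the behaviour of $h$ near $p$ and not only on $h(p)$, which is exactly what a tensor is not allowed to do.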

Oftentimes it is convenient to also consider multilinear maps $\tau : \Gamma(E_1) \times \Gamma(E_2) \times \cdots \times \Gamma(E_n) \rightarrow \Gamma(F)$, i.e. maps that are linear in each coordinate. Again, if $\tau(\xi^1, \xi^2, \ldots, \xi^n)_p$ is determined by the values $\xi_p^i$ then $\tau$ is called a tensor. The classical examples of tensors are differential forms. A perhaps more interesting example is a Riemannian metric: for each point $p \in M$ we fix a positive-definite bilinear form $\mathrm{g}_p : T_p M \times T_p M \rightarrow \mathbb{R}$ which “varies smoothly with $p$”. This construction induces a tensor

$$\mathrm{g} : \mathfrak{X}(M) \times \mathfrak{X}(M) \to C^\infty(M) \cong \Gamma(M \times \mathbb{R})$$

where $\left( \mathrm{g}(V^1, V^2) \right)(p) = \mathrm{g}_p(V_p^1, V_p^2)$.
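To see the point-wise dependence concretely, here is a sketch of the same formula in a coordinate chart. The coordinates $(x^1, \ldots, x^m)$, the component functions $v^i$, $w^j$ and the smooth function $f$ are my own notation, introduced only for illustration:

```latex
% Assumptions: (x^1, ..., x^m) are local coordinates around p, and in that chart
% V^1 = \sum_i v^i \partial_i and V^2 = \sum_j w^j \partial_j.
\begin{align*}
  \mathrm{g}(V^1, V^2)(p) &= \sum_{i,j} \mathrm{g}_{ij}(p) \, v^i(p) \, w^j(p),
    \qquad \mathrm{g}_{ij}(p) = \mathrm{g}_p(\partial_i|_p, \partial_j|_p), \\
  \mathrm{g}(f V^1, V^2)(p) &= f(p) \, \mathrm{g}(V^1, V^2)(p).
\end{align*}
```

Only the values $v^i(p)$ and $w^j(p)$ enter the right-hand side, never their derivatives. In this particular example, one common way of reading “varies smoothly with $p$” is that the functions $\mathrm{g}_{ij}$ are smooth in every chart.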

This is what a tensor is supposed to be: for each $p \in M$ we fix some multilinear function between the $p$-fibers of some vector bundles that “varies smoothly with $p$”. The meaning of “varies smoothly with $p$” is still imprecise, not to say unclear. We should point out that oftentimes it is more convenient to define tensors in terms of global sections rather than defining the fiber-wise transformations, such as in the case of the curvature tensor $R(X, Y) Z = \nabla_X \nabla_Y Z - \nabla_Y \nabla_X Z - \nabla_{[X, Y]} Z$ of a connection $\nabla$ or the Nijenhuis tensor $N(X, Y) = [X, Y] + J[JX, Y] + J[X, JY] - [JX, JY]$ of an almost complex structure $J$.
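As a sanity check on the curvature example, one can verify directly that $R(X, Y) Z$ is $C^\infty(M)$-linear in $Z$, which is the usual way of checking that an expression defined on global sections is a tensor. A sketch, assuming only the Leibniz rule $\nabla_X (f Z) = (X f) Z + f \nabla_X Z$ and the identity $[X, Y] f = X(Y f) - Y(X f)$ for a smooth function $f$ (the name $f$ is mine):

```latex
% Assumptions: f is a smooth function on M; \nabla satisfies the Leibniz rule.
\begin{align*}
  \nabla_X \nabla_Y (f Z)
    &= (X Y f) Z + (Y f) \nabla_X Z + (X f) \nabla_Y Z + f \nabla_X \nabla_Y Z, \\
  \nabla_Y \nabla_X (f Z)
    &= (Y X f) Z + (X f) \nabla_Y Z + (Y f) \nabla_X Z + f \nabla_Y \nabla_X Z, \\
  \nabla_{[X, Y]} (f Z)
    &= ([X, Y] f) Z + f \nabla_{[X, Y]} Z, \\
  R(X, Y)(f Z)
    &= \left( X Y f - Y X f - [X, Y] f \right) Z + f R(X, Y) Z
     = f R(X, Y) Z.
\end{align*}
```

The derivative terms cancel in pairs, so $f$ passes through as if it were a constant. Linearity in $X$ and $Y$ can be checked in the same way.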

Hence the need to consider tensors in geometry. Working with multilinear maps can be a bit of an annoyance, however. It would be convenient if we could somehow look at a tensor as a straight linear map instead of a multilinear map. This brings us to the algebraic answer to our initial question. Given a commutative ring $R$ and two $R$-modules $M$ and $N$, their tensor product $M \otimes_R N$ is the $R$-module which enjoys the universal property that, for every $R$-module $L$,

$$\text{Hom}_R \left( M \otimes_R N, L \right) \cong \text{Bil}_R \left( M \times N, L \right),$$

where $\text{Bil}_R \left( M \times N, L \right)$ is the module of $R$-bilinear maps $M \times N \rightarrow L$.
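A small worked instance of this correspondence may help. The choices below ($R = \mathbb{R}$, the modules $\mathbb{R}^m$ and $\mathbb{R}^n$, the matrix $A$, and the basis names) are assumptions made purely for illustration:

```latex
% Assumptions: R = \mathbb{R}, M = \mathbb{R}^m, N = \mathbb{R}^n, L = \mathbb{R},
% with standard bases e_1, ..., e_m and f_1, ..., f_n, and A a real m-by-n matrix.
\begin{align*}
  B(x, y) &= x^{\mathsf{T}} A \, y = \sum_{i,j} A_{ij} \, x_i \, y_j
    && \text{(a bilinear map } \mathbb{R}^m \times \mathbb{R}^n \to \mathbb{R}\text{)} \\
  \tilde{B}(e_i \otimes f_j) &= A_{ij}
    && \text{(the linear map } \mathbb{R}^m \otimes \mathbb{R}^n \to \mathbb{R}\text{)} \\
  \tilde{B}(x \otimes y) &= \sum_{i,j} x_i \, y_j \, \tilde{B}(e_i \otimes f_j) = B(x, y).
\end{align*}
```

The bilinear map $B$ and the linear map $\tilde{B}$ carry the same data, namely the entries of $A$.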

In other words, $R$-multilinear maps $M_1 \times M_2 \times \cdots \times M_n \rightarrow N$ naturally correspond to $R$-linear maps $M_1 \otimes M_2 \otimes \cdots \otimes M_n \rightarrow N$. We should point out that the tensor product of modules can always be shown to exist by means of an explicit construction, whose elements are usually called tensors. If we fix $R = \mathbb{R}$, this construction induces a construction in the category of vector bundles over some fixed manifold $M$: if $E_i \rightarrow M$ are bundles over $M$, there is a vector bundle $E_1 \otimes E_2 \otimes \cdots \otimes E_n \rightarrow M$ whose fibers are

$$\left( E_1 \otimes E_2 \otimes \cdots \otimes E_n \right)|_p = E_1|_p \otimes E_2|_p \otimes \cdots \otimes E_n|_p$$

The relationship between these two notions of tensor should now be clear: tensors $\Gamma(E_1) \times \Gamma(E_2) \times \cdots \times \Gamma(E_n) \rightarrow \Gamma(F)$ are called tensors because they correspond to $C^\infty(M)$-linear maps

$$\Gamma(E_1) \otimes_{C^\infty(M)} \Gamma(E_2) \otimes_{C^\infty(M)} \cdots \otimes_{C^\infty(M)} \Gamma(E_n) \rightarrow \Gamma(F),$$

which are in turn canonically identified with $C^\infty(M)$-linear maps

$$\Gamma \left( E_1 \otimes E_2 \otimes \cdots \otimes E_n \right) \rightarrow \Gamma(F)$$

In fact, there’s a natural isomorphism of sheaves of $C^\infty$-modules $\Gamma(-, E_1) \otimes_{C^\infty} \Gamma(-, E_2) \otimes_{C^\infty} \cdots \otimes_{C^\infty} \Gamma(-, E_n) \cong \Gamma(-, E_1 \otimes E_2 \otimes \cdots \otimes E_n)$ 🤡

To recap: we’ve just shown that a tensor $\tau : \Gamma(E_1) \times \cdots \times \Gamma(E_n) \rightarrow \Gamma(F)$ can be naturally identified with some $\tau \in \text{Hom}_{C^\infty(M)} \left( \Gamma \left( E_1 \otimes \cdots \otimes E_n \right), \Gamma(F) \right)$. A natural question to ask ourselves at this point is: does $\tau$ correspond to some $\tau \in \Gamma \left( \text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right) \right)$? First of all, why does this make sense? Recall that given two vector spaces $V$ and $W$, the set $\text{Hom}(V, W)$ of linear transformations $V \rightarrow W$ is again a vector space. Hence we can consider the vector bundle $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right) \rightarrow M$ whose fibers are

$$\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right)|_p = \text{Hom} \left( E_1|_p \otimes \cdots \otimes E_n|_p, F|_p \right)$$

The previously mentioned example of Riemannian metrics does hint at an inclusion

$$i : \Gamma \left( \text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right) \right) \rightarrow \text{Hom}_{C^\infty(M)} \left( \Gamma \left( E_1 \otimes \cdots \otimes E_n \right), \Gamma(F) \right),$$

which takes $\eta \in \Gamma \left( \text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right) \right)$ to $i \eta : \Gamma \left( E_1 \otimes \cdots \otimes E_n \right) \rightarrow \Gamma(F)$ with $i \eta (\xi)_p = \eta_p(\xi_p)$. Notice this is precisely what we did to get from “a bilinear form in $T_p M$ for each $p \in M$” to a Riemannian metric seen as a tensor. The meaning of “a transformation at each fiber $p$ that varies smoothly with $p$” is now much clearer too: this is a smooth section of $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right)$. The inclusion $i$ is not surjective. This is because in general if $\varphi : \Gamma \left( E_1 \otimes \cdots \otimes E_n \right) \rightarrow \Gamma(F)$ is a homomorphism the value of $\varphi(\xi^1, \ldots, \xi^n)_p$ may very well depend on $\xi^i$ in their entirety, and not only on $\xi_p^i$.

This last statement is actually false! See the errata on this post.

We claim, however, that the image of $i$ consists precisely of the multilinear functions $E_1 \times \cdots \times E_n \rightarrow F$ that are tensors, i.e. such that $\tau(\xi^1, \ldots, \xi^n)_p$ is determined by $\xi_p^i$. Indeed, if we consider the map

$$s : \mathcal{T}(E_1 \times \cdots \times E_n, F) \to \Gamma(\operatorname{Hom}(E_1 \otimes \cdots \otimes E_n, F))$$

given by $s \tau_p (v_1, \ldots, v_n) = \tau(\xi^1, \ldots, \xi^n)_p$, where $\mathcal{T}(E_1 \times \cdots \times E_n, F) \subset \operatorname{Hom}_{C^\infty(M)}(\Gamma(E_1 \otimes \cdots \otimes E_n), \Gamma(F))$ is the subspace of tensors and $\xi^i \in \Gamma(E_i)$ are such that $\xi_p^i = v_i$, we can very quickly check that $i = s^{-1}$, establishing an isomorphism of $C^\infty(M)$-modules

$$\Gamma(\operatorname{Hom}(E_1 \otimes \cdots \otimes E_n, F)) \cong \mathcal{T}(E_1 \times \cdots \times E_n, F)$$

The definition of $s \tau_p (v_1, \ldots, v_n)$ does not depend on the choice of $\xi^i$ precisely because the value of $\tau(\xi^1, \ldots, \xi^n)_p$ depends only on $\xi_p^i = v_i$! In conclusion, a tensor $\tau : E_1 \times \cdots \times E_n \rightarrow F$ is called a tensor because it corresponds to a smooth section of $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right)$. To finish things off, I would like to explain a small notational quirk the reader will probably encounter in the literature: most people refer to $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right)$ as $E_1^* \otimes \cdots \otimes E_n^* \otimes F$.

This is because given two finite-dimensional vector spaces $V$ and $W$, the space $\text{Hom}(V, W)$ is canonically isomorphic to $V^* \otimes W$. Taking $V = E_1|_p \otimes \cdots \otimes E_n|_p$ and $W = F|_p$, this translates to an isomorphism of vector bundles $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right) \rightarrow E_1^* \otimes \cdots \otimes E_n^* \otimes F$. In fact, usually the differential structure of $\text{Hom} \left( E_1 \otimes \cdots \otimes E_n, F \right)$ is defined via the identification with $E_1^* \otimes \cdots \otimes E_n^* \otimes F$. This is the formalism generally adopted, which is to say, when a geometer says “a tensor” in a formal sense he most likely means “some $\tau \in \Gamma \left( E_1^* \otimes \cdots \otimes E_n^* \otimes F \right)$”.
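For reference, the canonical isomorphism between $V^* \otimes W$ and $\text{Hom}(V, W)$ can be written down explicitly. A sketch, assuming $V$ and $W$ are finite-dimensional; the names $\Phi$, $\Psi$ and the choice of basis are mine:

```latex
% Assumptions: V and W are finite-dimensional, e_1, ..., e_k is a basis of V,
% and e^1, ..., e^k is the dual basis of V^*.
\begin{align*}
  \Phi : V^* \otimes W &\to \operatorname{Hom}(V, W), &
  \Phi(\varphi \otimes w)(v) &= \varphi(v) \, w, \\
  \Psi : \operatorname{Hom}(V, W) &\to V^* \otimes W, &
  \Psi(T) &= \sum_{i=1}^{k} e^i \otimes T(e_i), \\
  && \Phi(\Psi(T))(v) &= \sum_{i=1}^{k} e^i(v) \, T(e_i) = T(v).
\end{align*}
```

The map $\Phi$ is defined without reference to any basis, which is why the identification is called canonical; the basis only shows up when writing the inverse $\Psi$.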

Also, if $F \rightarrow M$ is the trivial line bundle $M \times \mathbb{R}$, one usually writes $E_1^* \otimes \cdots \otimes E_n^* \otimes F$ simply as $E_1^* \otimes \cdots \otimes E_n^*$, because tensoring by $M \times \mathbb{R}$ is the same as doing nothing. For instance, a Riemannian metric is most often defined as a tensor $\mathrm{g} \in \Gamma \left( T^* M \otimes T^* M \right)$ satisfying special conditions. That about wraps it up. I hope this helped someone 😛