# Kummer theory and the weak Mordell-Weil theorem

In this post we discuss the notion of Kummer theory in its general form, and how this leads to a proof of the (weak) Mordell-Weil theorem.

# Motivation

One first learns about Kummer theory in a first course in algebra. There, one learns that if $K$ is a field of characteristic $p\geqslant 0$ with $p\nmid n$, and $K$ contains a primitive $n^\text{th}$ root of unity, then all the cyclic extensions of $K$ of degree $n$ are of the form $K(\sqrt[n]{a})$ for some $a\in K^\times$. The Galois group is then naturally identified with $\mu_n(K)=\mu_n(\overline{K})$, with $\zeta\in\mu_n(K)$ corresponding to $\sigma_\zeta:\sqrt[n]{a}\mapsto \zeta\sqrt[n]{a}$.

Things probably stay pretty calm on the ‘Kummer theory’ front of one’s learning for a while after this initial encounter. But, eventually one then learns, at least if you are an algebraic geometer, about the notion of a finite Galois cover of a scheme. It is a theory which, in a very literal sense, is the generalization of ‘Galois theory’ to the general theory of schemes. One then is confronted with an obvious question: in this ‘generalized Galois theory’ is there a ‘generalized Kummer theory’?

Namely, how does one describe connected cyclic Galois covers of a scheme $X$? Is there a relationship with the ‘roots of functions on $X$’? Is it true that if $X'\to X$ is a finite cyclic Galois cover (of degree $n$) of $X$, then naturally $\mathrm{Gal}(X'/X)\cong \mu_n$? Of course, analogizing with the case above, one should assume that $n$ is invertible on $X$ (the replacement of $(n,\mathrm{char}(K))=1$).

One can think about this from a more cohomological viewpoint. Namely, the connected cyclic Galois covers of $X$ (which is connected) with Galois group contained in $\mathbb{Z}/n\mathbb{Z}$ correspond precisely to the group $\mathrm{Hom}_\mathrm{cont.}(\pi_1^\mathrm{\acute{e}t}(X,\overline{x}),\mathbb{Z}/n\mathbb{Z})$. But, as we have discussed before, we can describe this group equally well as $H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}/n\mathbb{Z})$. And thus, we see that we are trying to describe cohomology or, loosely, some sort of ‘abelianized’ Galois theory of $X$.

Of course, something slightly more subtle must be going on here. Indeed, in the case of fields we needed to assume that $K$ contained a primitive $n^\text{th}$ root of unity. What is the analogy here? What do we need to assume about $X$? Well, it turns out that the case of fields we learned in our youth was actually a bit misleading. The inclusion of the roots of unity in $K$ was a red herring which allowed us to ignore a technical detail. This detail, in terms of the étale cohomology perspective, is that Kummer theory is rightfully not about $H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}/n\mathbb{Z})$ but instead about $H^1_\mathrm{\acute{e}t}(X,\mu_n)$. Of course, if $X$ contains the $n^\text{th}$ roots of unity (in its global sections) we have an identification of $\mu_n$ and $\underline{\mathbb{Z}/n\mathbb{Z}}$, explaining the connection with classical Kummer theory.

Now then, once one has understood Kummer theory in this context, another natural question presents itself. Namely, what is $\mu_n$ but the $n$-torsion of $\mathbf{G}_m$? One may then wonder what one can say about the $n$-torsion of a general group scheme/abelian sheaf. This perspective, together with some elbow grease, will lead to a nice, conceptual, geometric proof of the weak Mordell-Weil theorem: if $K$ is a number field and $A/K$ an abelian variety, then $A(K)/nA(K)$ is finite for any $n\geqslant 1$. This is the hard step in the proof of the Mordell-Weil theorem that $A(K)$ is finitely generated.

Remark: The other part of the Mordell-Weil theorem is the so-called ‘theory of heights’. One shows (this is the Descent Theorem) that if an abelian group $A$ possesses a height function (something roughly measuring size), then $A/mA$ being finite for some $m\geqslant 2$ implies that $A$ is actually finitely generated (the converse is obvious). One then uses the fact that $A(K)$ has a natural height function associated to any ample line bundle $\mathscr{L}$ on $A$. In fact, there is a canonical height function called the Néron-Tate height function.

# Hilbert’s theorem 90 in the abstract

Before we begin our pursuit of Kummer theory in earnest, we must begin by discussing what can be called the geometric version of Hilbert’s theorem 90. Recall firstly that Hilbert’s theorem 90 says that if $L/K$ is a cyclic Galois extension of degree $n$ with Galois group generated by $\sigma$, then all $x\in L^\times$ of norm $N_{L/K}(x)=1$ are of the form $\displaystyle \frac{\sigma(y)}{y}$ for some $y\in L^\times$. This follows from the general Galois computation that if $L/K$ is any Galois extension then $H^1(\mathrm{Gal}(L/K),L^\times)=0$, computed in the sense of group cohomology (for a reminder on group cohomology, look here).

Now, the vast generalization of Hilbert’s theorem 90 comes in the form of a theorem about the étale cohomology of the multiplicative group $\mathbf{G}_m$ on the étale site of any scheme. Namely, one can show that for any scheme $X$ one has the following set of isomorphisms

$\begin{aligned}H^1_\mathrm{zar}(X,\mathbf{G}_m) & =\check{H}^1_\mathrm{zar}(X,\mathbf{G}_m)\\ &=\mathrm{Pic}(X_\mathrm{zar})\\ &=\mathrm{Pic}(X_\mathrm{\acute{e}t})\\ &=\check{H}^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m)\\ &=H^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m)\end{aligned}$

Here $\check{H}^1$ denotes Čech cohomology, the subscript $\mathrm{zar}$ denotes cohomology on the Zariski site of $X$ (i.e. ‘usual’ cohomology), $\mathrm{Pic}(X_\mathrm{zar})$ denotes the ‘usual’ Picard group of $X$, and $\mathrm{Pic}(X_\mathrm{\acute{e}t})$ denotes the group of line bundles on the étale site of $X$.

To clarify the last point, we call a sheaf of $\mathcal{O}_{X_\mathrm{\acute{e}t}}$-modules $\mathscr{L}$ on $X_\mathrm{\acute{e}t}$ a line bundle if there is an étale cover $\{U_i\to X\}$ such that $\mathscr{L}\mid_{U_i}\cong \mathcal{O}_{(U_i)_\mathrm{\acute{e}t}}$. Here, and for any scheme $Y$, the sheaf $\mathcal{O}_{Y_\mathrm{\acute{e}t}}$ denotes the sheaf of rings associating to any $U\to Y$ étale the ring $\mathcal{O}_U(U)$. In other words, it is the sheaf associated to the group scheme $\mathbf{G}_{a,Y}$.

More generally, we call a sheaf of $\mathcal{O}_{X_\mathrm{\acute{e}t}}$-modules $\mathcal{F}$ on $X_\mathrm{\acute{e}t}$ quasicoherent if locally on $X_\mathrm{\acute{e}t}$ it is the cokernel of a map between powers of $\mathcal{O}_{X_\mathrm{\acute{e}t}}$. More specifically, there is an étale cover $\{U_i\to X\}$ such that $\mathcal{F}\mid_{(U_i)_\mathrm{\acute{e}t}}$ is the cokernel of a map $\mathcal{O}_{(U_i)_\mathrm{\acute{e}t}}^{\oplus \kappa}\to\mathcal{O}_{(U_i)_\mathrm{\acute{e}t}}^{\oplus \lambda}$ for some cardinals $\kappa$ and $\lambda$. One of the basic theorems in descent theory says that there is an equivalence of categories

$\mathsf{Qcoh}(X_\mathrm{zar})\to \mathsf{Qcoh}(X_\mathrm{\acute{e}t})$

given by associating to any $\mathcal{F}$ its étalification, which is denoted $\mathcal{F}_\mathrm{\acute{e}t}$ and associates to any étale $\varphi:U\to X$ the group $(\varphi^\ast\mathcal{F})(U)$.

Now, it’s clear that this map restricts to a map

$\mathrm{Pic}(X_\mathrm{zar})\to\mathrm{Pic}(X_\mathrm{\acute{e}t})$

and it’s clear (since every Zariski open of $X$ is in particular étale over $X$, so that $\mathcal{F}$ can be recovered from $\mathcal{F}_\mathrm{\acute{e}t}$) that this map is injective. Moreover, since every étale line bundle $\mathscr{L}$ is clearly quasicoherent on $X_\mathrm{\acute{e}t}$, the above equivalence tells us that there is some quasicoherent Zariski sheaf $\mathcal{F}$ such that $\mathscr{L}=\mathcal{F}_\mathrm{\acute{e}t}$. That said, it’s not obvious that $\mathcal{F}$ is actually locally trivializable on $X_\mathrm{zar}$. This is the main content of the theorem.

Indeed, the other isomorphisms are entirely formal. Namely, the identification of $\mathrm{Pic}$ (Zariski or étale) with $\check{H}^1$ (Zariski or étale) follows from the general yoga of torsors. Then, the isomorphism of $H^1$ with $\check{H}^1$ holds true for sheaf cohomology on any site (see the Čech-to-derived spectral sequence, which holds on any site).

To prove that étalification is actually a surjective map $\mathrm{Pic}(X_\mathrm{zar})\to\mathrm{Pic}(X_\mathrm{\acute{e}t})$ one needs to perform a more in-depth analysis. I refer to Milne’s Lectures on Étale Cohomology. That said, it was pointed out to me by my friend Minseon Shin that there is actually a relatively easy way to prove this. Namely, suppose that $\mathcal{F}$ is a quasicoherent sheaf on $X_\mathrm{zar}$ which becomes a line bundle on $X_\mathrm{\acute{e}t}$. In particular, assume that there is an étale cover $\varphi:U\to X$ such that $\varphi^\ast\mathcal{F}$ is trivial. By standard arguments we can assume that $U$ and $X$ are both affine. Thus, we have a module $M$ on $\mathrm{Spec}(A)$ such that $M\otimes_A B$ (with $U=\mathrm{Spec}(B)$) is free of rank one. We want to show that $M$ is a line bundle. Thus, we want to show that $M$ is finitely presented and flat (its rank can then be read off after the faithfully flat base change to $B$). To see that it’s finitely presented, merely choose a presentation

$0\to K\to B^n\to M\otimes_A B\to 0$

with $K$ a finitely generated $B$-module. Note that we may assume that the generators of $B^n$ go to tensors of the form $m_i\otimes 1$. Consider then the map $A^n\to M$ defined by sending $e_i$ to $m_i$ and let $K'$ be the kernel of this map. Note firstly that since

$A^n\otimes_A B\to M\otimes_A B$

is surjective, it follows from faithful flatness that $A^n\to M$ is surjective. So, we need only show that $K'$ is also finitely generated. But, again, flatness shows that $K'\otimes_A B=K$. Now, we didn’t use $K$ in the above to show that $M$ was finitely generated: we showed in general that if $M\otimes_A B$ is finitely generated then $M$ is finitely generated. Thus, $K=K'\otimes_A B$ being finitely generated implies that $K'$ is finitely generated, and so $M$ is finitely presented. So, it remains to show that $M$ is flat. For this, we need to check that $\mathrm{Tor}^i_A(M,N)=0$ for any $A$-module $N$ and all $i>0$. But, again, by faithful flatness, it suffices to check that

$\mathrm{Tor}^i_A(M,N)\otimes_A B=\mathrm{Tor}^i_B(M\otimes_A B,N\otimes_A B)=0$

the last equality since $M\otimes_A B$ is flat.

Remark: Because of the above theorem, and since we’ll only be talking about the étale or Zariski topology unless stated otherwise, we unambiguously denote the Picard group by $\mathrm{Pic}$.

Note that these isomorphisms do actually prove the classical version of Hilbert’s theorem 90. Namely, let $K$ be any field and set $X=\mathrm{Spec}(K)$. Then, by the usual equivalence $\mathsf{Ab}(X_\mathrm{\acute{e}t})\to G_K\mathsf{-mod}$ (the latter being discrete $G_K$-modules), which preserves cohomology, we see that

$H^1\left(G_K,\left(K^\mathrm{sep}\right)^\times\right)=H^1_\mathrm{\acute{e}t}(\mathrm{Spec}(K),\mathbf{G}_m)$

But, by the above we know that the right hand side is equal to $\mathrm{Pic}\left(\mathrm{Spec}(K)\right)$ which is clearly $0$ (it’s a point!). Then, the usual injection $H^1(\mathrm{Gal}(L/K),L^\times)\to H^1(G_K,\left(K^\mathrm{sep}\right)^\times)$ proves the theorem in general.
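For completeness, here is a sketch of how the vanishing of $H^1$ gives back the classical cyclic statement. The cocycle $c$ is notation introduced only for this illustration; we assume $\mathrm{Gal}(L/K)$ is cyclic of order $n$ generated by $\sigma$ and that $x\in L^\times$ has norm $1$.

```latex
\begin{aligned}
c(\sigma^i) &:= x\,\sigma(x)\cdots\sigma^{i-1}(x), \qquad c(\sigma^n)=N_{L/K}(x)=1,\\
c(\sigma^{i+j}) &= c(\sigma^i)\,\sigma^i\!\left(c(\sigma^j)\right),\\
H^1(\mathrm{Gal}(L/K),L^\times)=0 &\implies c(\tau)=\frac{\tau(y)}{y}\ \text{ for some } y\in L^\times \text{ and all } \tau,\\
x=c(\sigma) &= \frac{\sigma(y)}{y}.
\end{aligned}
```

The first line is where the norm condition enters: it is exactly what makes $c$ well defined on the cyclic group.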

# Kummer theory in the abstract

Now that we have a general understanding of Hilbert’s theorem 90 we are able to deduce the ‘true nature’ of Kummer theory. Namely, the key will be the so-called Kummer sequence.

In particular, let us now suppose that $X$ is any scheme on which $n$ is invertible (i.e. $X$ is a scheme over $\mathbb{Z}[\frac{1}{n}]$). Then, we claim that we have an exact sequence of sheaves on $X_\mathrm{\acute{e}t}$:

$1\to \mu_n\to \mathbf{G}_m\xrightarrow{x\mapsto x^n}\mathbf{G}_m\to 1\qquad\qquad\left(\begin{matrix}\text{Kummer}\\\text{sequence}\end{matrix}\right)$

called the Kummer sequence on $X$.

Note that even though both $\mu_n$ and $\mathbf{G}_m$ are actually affine group schemes on $X$, this is a sequence of étale sheaves on $X_\mathrm{\acute{e}t}$. In particular, the $n^\text{th}$-power map $\mathbf{G}_m\to\mathbf{G}_m$ is not necessarily surjective on sections over a given $T$ (think about the map on $\mathbb{Q}$-points if $X=\mathrm{Spec}(\mathbb{Q})$).

Remark: One can show that the map $\mathbf{G}_m\to\mathbf{G}_m$ has trivial cokernel in the abelian category of, say, affine group schemes over $X$. But, note that this has no more content since, definitionally, the cokernel is the scheme representing the cokernel sheaf!

Now, it’s clear that the map $\mu_n\to\mathbf{G}_m$ is injective, and so we should say why the map $\mathbf{G}_m\to\mathbf{G}_m$ is actually surjective (again, on the level of sheaves). This means that for any $X$-scheme $T$, and any section $s\in\mathbf{G}_m(T)$ we can find some étale cover $T'\to T$ such that $s\mid_{T'}$ is an $n^\text{th}$-power. But, this is simple. Namely, the scheme $T'=\underline{\mathrm{Spec}}\left(\mathcal{O}_T[U]/(U^n-s)\right)$ (i.e. locally adjoin an $n^\text{th}$ root of $s$) suffices. Note here that we see why it is necessary to assume that $n$ is invertible on $X$: otherwise the scheme $T'$ just defined is not necessarily étale (i.e. $U^n-s$ is not a ‘separable polynomial’).
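The étaleness of $T'\to T$ can be checked by hand. The following is a minimal sketch using the Jacobian criterion, where the name $f$ for the defining polynomial is introduced only here.

```latex
\begin{aligned}
f(U) &= U^n - s, \qquad f'(U) = nU^{n-1},\\
U\cdot f'(U) - n\cdot f(U) &= nU^n - n\left(U^n - s\right) = ns \in \mathcal{O}_T(T)^\times .
\end{aligned}
```

Since $n$ and $s$ are both units, $f$ and $f'$ generate the unit ideal, so $\mathcal{O}_T[U]/(f)$ is étale over $\mathcal{O}_T$. It is also free of rank $n$, hence $T'\to T$ is surjective. If instead $n$ were zero on $T$, then $f'$ would vanish identically and the criterion would fail.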

So, now with this sequence, we can give the fundamental theorem of Kummer theory:

Theorem 1 (Fundamental theorem of Kummer theory): Let $X$ be a scheme over $\mathbb{Z}[\frac{1}{n}]$. Then, there is a short exact sequence

$1\to\mathbf{G}_m(X)/\mathbf{G}_m(X)^n\to H^1_\mathrm{\acute{e}t}(X,\mu_n)\to \mathrm{Pic}(X)[n]\to 1$

Here, for an abelian group $A$, $A[m]$ denotes the $m$-torsion of $A$.

The proof of this theorem is utterly trivial given the setup we have. Namely, taking the long exact sequence associated to the Kummer sequence, and using Hilbert’s theorem 90 to identify $H^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m)$ with $\mathrm{Pic}(X)$, immediately gives the result.
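Spelled out, the relevant part of the long exact sequence reads as follows; the symbol $\delta$ for the connecting map is notation introduced for this sketch.

```latex
\begin{aligned}
\mathbf{G}_m(X)\xrightarrow{\,x\mapsto x^n\,}\mathbf{G}_m(X)
&\xrightarrow{\ \delta\ } H^1_\mathrm{\acute{e}t}(X,\mu_n)
\to H^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m)
\xrightarrow{\ n\ } H^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m),\\
\mathrm{coker}\left(\mathbf{G}_m(X)\xrightarrow{x\mapsto x^n}\mathbf{G}_m(X)\right) &= \mathbf{G}_m(X)/\mathbf{G}_m(X)^n,\\
\ker\left(\mathrm{Pic}(X)\xrightarrow{\ n\ }\mathrm{Pic}(X)\right) &= \mathrm{Pic}(X)[n].
\end{aligned}
```

Exactness at $H^1_\mathrm{\acute{e}t}(X,\mu_n)$ then says precisely that the cokernel on the left injects into it, with quotient the kernel on the right.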

Remark: Using the monodromy perspective we can think about this as giving a short exact sequence describing the group cohomology $H^1(\pi_1(X,\overline{x}),(\mu_n)_{\overline{x}})$.

# A different interpretation: principal homogenous spaces

## Principal homogenous spaces in general

Before we go on to give some applications of Kummer theory, let us give another interpretation of the above result. Namely, we should be able to geometrically describe the group $H^1_\mathrm{\acute{e}t}(X,\mu_n)$, and this sequence in Theorem 1, without recourse to the Kummer sequence.

Why? Well, recall from the general yoga of torsors that $H^1_\mathrm{\acute{e}t}(X,\mu_n)$ is the group of isomorphism classes of $\mu_n$-torsors. This does not sound very appealing, and certainly not very geometric. Namely, a $\mu_n$-torsor is some sheaf of sets $\mathcal{F}$ on $X_\mathrm{\acute{e}t}$ with a simply transitive action of $\mu_n$ which is ‘locally trivial’ (i.e. locally has a section). That said, since $\mu_n$ is an affine group scheme over $X$, we actually have a nice alternative way of thinking about $\mu_n$-torsors.

Namely, let us fix a flat group scheme $G$ locally of finite type over $X$. Then, call an $X$-scheme $P$ equipped with an action

$a:G\times_X P\to P$

(meaning that for all $X$-schemes $T$ we have an action $G(T)\times P(T)\to P(T)$ which is functorial in $T$) a principal homogenous space for $G$ if there is an étale cover $\{U_i\to X\}$ such that $P\mid_{U_i}$ is isomorphic to $G\mid_{U_i}$ as a $G\mid_{U_i}$-space. Here we are thinking of $G$ as a $G$-space with left multiplication. Equivalently, one can require that $P$ is a faithfully flat $X$-scheme and the obvious map $G\times_X P\to P\times_X P$, given by $(g,p)\mapsto (gp,p)$, is an isomorphism. The phrase principal $G$-bundle is also used in place of principal homogenous space for $G$.

Let us give the key example of this construction. Namely, what are the principal homogenous spaces for the group $\mathbf{G}_{m,X}$? Well, recall that a geometric line bundle on $X$ is an $X$-scheme $\varphi:L\to X$ such that there exists a Zariski cover $\{U_i\}$ of $X$ such that $\varphi^{-1}(U_i)\to U_i$ is isomorphic, as a $U_i$ scheme, to $\mathbb{A}^1_{U_i}$.

We then define an $X$-scheme associated to a geometric line bundle $L$ called the frame bundle of $L$. We denote it by $\mathrm{Frame}_L$. Intuitively $\mathrm{Frame}_L$ parameterizes ordered bases of the fibers of $L$ over $X$. More explicitly, $\mathrm{Frame}_L$ represents the functor on $X_\mathrm{\acute{e}t}$ defined by

$(U\to X)\mapsto \mathrm{Isom}_U(\mathbb{A}^1_U,L\times_X U)$

(where these are isomorphisms in the category of geometric line bundles) and thus another reasonable name for $\mathrm{Frame}_L$ would be the isomorphism sheaf $\mathrm{Isom}(\mathbb{A}^1_X,L)$.

Note that there is a natural action of $\mathbf{G}_m$ on $\mathrm{Frame}_L$ given by taking a $g\in\mathbf{G}_m(U)$ and an isomorphism $\varphi:\mathbb{A}^1_U\to L\times_X U$ to the precomposition with the isomorphism $g:\mathbb{A}^1_U\to\mathbb{A}^1_U$. Moreover, note that this action is clearly simply transitive when $\mathrm{Frame}_L(U)$ is non-empty. Finally, since $L$ is locally trivializable on the Zariski site, $\mathrm{Frame}_L$ locally has a section. Thus, $\mathrm{Frame}_L$ is actually a $\mathbf{G}_m$-torsor.

Recall also that there is an equivalence of categories between the category of geometric line bundles (where the morphisms are, on trivializing covers, linear) and the category of line bundles on $X$. Namely, to a line bundle $\mathscr{L}$ on $X$ we can associate the relatively affine scheme $\underline{\mathrm{Spec}}(\mathrm{Sym}(\mathscr{L}))\to X$ where $\mathrm{Sym}(\mathscr{L})$ is the quasicoherent $\mathcal{O}_X$-algebra given by

$\mathrm{Sym}(\mathscr{L})=\mathcal{O}_X\oplus\mathscr{L}\oplus\mathscr{L}^{\otimes 2}\oplus\cdots$

with the usual multiplication. This construction yields the equivalence. In fact, the sheaf of sections of $\underline{\mathrm{Spec}}(\mathrm{Sym}(\mathscr{L}))\to X$ is precisely $\mathscr{L}^{-1}$. Under this equivalence we can identify the torsor $\mathrm{Frame}_L$ with the torsor $\mathrm{Isom}(\mathcal{O}_X,\mathscr{L}^{-1})$ which, again, to a $\varphi:U\to X$ associates the set of $\mathcal{O}_U$-isomorphisms $\mathcal{O}_U\xrightarrow{\approx}\varphi^\ast\mathscr{L}^{-1}$.

Remark: This is an important point which confused me for a long time. Namely, people will say that line bundles are $\mathbf{G}_m$-torsors. This is technically a lie. Namely, the most natural interpretation of this statement is that if $\mathscr{L}$ is a line bundle then $\mathscr{L}_\mathrm{\acute{e}t}$ is a $\mathbf{G}_m$-torsor on $X_\mathrm{\acute{e}t}$. Of course, this makes no sense–there is no $\mathbf{G}_m$ action. It is really the frame bundle naturally associated to $\mathscr{L}$ which is the $\mathbf{G}_m$-torsor.

So, now, in general, to a principal homogenous space $P$ for $G$ we can associate a $G$-torsor on $X_\mathrm{\acute{e}t}$ as, not shockingly, the isomorphism sheaf $\mathrm{Isom}(G,P)$ which associates to a $U\to X$ the set of isomorphisms of $G$-spaces $G_U\to P_U$. Now, in general there is no reason to believe that this should create a bijection between the set of principal homogenous spaces and the set of $G$-torsors; in particular, there is no reason it should be surjective.

That said, we have the following:

Theorem 2: Let $X$ be a scheme and $G$ a smooth affine group scheme over $X$. Then, there is an equivalence

$\left\{\begin{matrix}\text{Principal homogenous}\\ G\text{-spaces}\end{matrix}\right\}\leftrightarrow G\text{-torsors}$

given by $P\mapsto \mathrm{Isom}(G,P)$.

Here the categories are both equipped with $G$-equivariant morphisms.

Remark: For people that know about stacks, the above just says that if $G$ is smooth affine, then the stack $BG=[\mathrm{pt}/G]$ is equal to the stack of principal homogenous spaces for $G$. More generally, for any group scheme $G$, one has that $BG$ is equal to the stack of principal homogenous spaces if one expands the latter definition to include algebraic spaces with a locally trivial transitive $G$-action.

The proof of this theorem follows fairly immediately from descent for affine morphisms. Namely, a $G$-torsor is, definitionally, locally representable by a (relatively) affine scheme and thus, by descent, is globally representable. The smoothness condition is there for technical reasons.

## The case for the roots of unity

So, now if $n$ is invertible on $X$ we know that $\mu_n$ is a smooth (étale!) group scheme over $X$. Thus, every $\mu_n$-torsor is the isomorphism sheaf of some principal homogenous space for $\mu_n$. Therefore, we’d like to have a nice description of these principal homogenous spaces. Just as in the case of geometric line bundles, where it is the line bundle (the sheaf of sections) which is simpler to understand, we will first understand the principal homogenous spaces for $\mu_n$ in terms of ‘coherent data’.

To this end, we consider pairs of the form $(\mathscr{L},\iota)$ where $\mathscr{L}$ is a line bundle on $X$ and $\iota$ is an isomorphism of $\mathcal{O}_X$-modules:

$\iota:\mathscr{L}^{\otimes n}\xrightarrow{\approx}\mathcal{O}_X$

We shall say that two pairs $(\mathscr{L}_1,\iota_1)$ and $(\mathscr{L}_2,\iota_2)$ are equivalent if there is an isomorphism $\varphi:\mathscr{L}_1\to\mathscr{L}_2$ such that the induced isomorphism $\varphi^{\otimes n}:\mathscr{L}_1^{\otimes n}\to\mathscr{L}_2^{\otimes n}$ carries $\iota_1$ to $\iota_2$ (i.e. $\iota_2\circ\varphi^{\otimes n}=\iota_1$).

Note that the set of equivalence classes is naturally a group with operation given by

$(\mathscr{L}_1,\iota_1)\cdot(\mathscr{L}_2,\iota_2)=(\mathscr{L}_1\otimes\mathscr{L}_2,\iota_1\otimes\iota_2)$

where $\iota_1\otimes\iota_2$ is shorthand for the natural isomorphism

$\left(\mathscr{L}_1\otimes\mathscr{L}_2\right)^{\otimes n}\xrightarrow{\text{nat. isom.}}\mathscr{L}_1^{\otimes n}\otimes\mathscr{L}_2^{\otimes n}\xrightarrow{\iota_1\otimes\iota_2}\mathcal{O}_X\otimes\mathcal{O}_X\xrightarrow{\text{nat. isom.}}\mathcal{O}_X$

and where the inverse is given by $(\mathscr{L},\iota)^{-1}=(\mathscr{L}^{-1},\iota^{-1})$ where $\iota^{-1}$ is shorthand for the natural isomorphism

$\left(\mathscr{L}^{-1}\right)^{\otimes n}\xrightarrow{\text{nat. isom.}}\left(\mathscr{L}^{\otimes n}\right)^{-1}\xrightarrow{\iota^{-1}}\mathcal{O}_X^{-1}\xrightarrow{\text{nat. isom.}}\mathcal{O}_X$

which one can check really do define group operations. The identity of this group is obviously $\left(\mathcal{O}_X,\iota_\mathrm{nat.}\right)$. Let us denote this group by $\mathcal{G}$.

Let us note that for the identity element $(\mathcal{O}_X,\iota_\mathrm{nat.})$ we can compute its automorphism sheaf $\mathrm{Aut}\left(\left(\mathcal{O}_X,\iota_\mathrm{nat.}\right)\right)$ on $X_\mathrm{\acute{e}t}$ given by

$(U\to X)\mapsto\mathrm{Aut}\left(\left(\mathcal{O}_U,\iota_\mathrm{nat.}\right)\right)$

where the automorphisms are automorphisms of pairs as defined above. We claim then that there is a natural isomorphism between this sheaf and $\mu_n$. Indeed, any automorphism of $\mathcal{O}_U$ corresponds to a section $s\in\mathbf{G}_m(U)$. But, to respect the map $\iota_\mathrm{nat.}$ we need precisely that $s^n=1$. Thus, we see the result.
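The compatibility condition in this computation can be written out explicitly. In the sketch below, $s$ denotes both the unit and the multiplication-by-$s$ automorphism of $\mathcal{O}_U$, and the $a_i$ are local sections of $\mathcal{O}_U$ introduced for illustration.

```latex
\begin{aligned}
\left(\iota_\mathrm{nat.}\circ s^{\otimes n}\right)(a_1\otimes\cdots\otimes a_n) &= (sa_1)\cdots(sa_n) = s^n\, a_1\cdots a_n,\\
\iota_\mathrm{nat.}\circ s^{\otimes n} = \iota_\mathrm{nat.} &\iff s^n = 1 \iff s\in\mu_n(U).
\end{aligned}
```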

Thus, from the general yoga of torsors we know that the elements of $H^1_\mathrm{\acute{e}t}(X,\mu_n)$, the $\mu_n$-torsors, correspond precisely to the twists of $(\mathcal{O}_X,\iota_\mathrm{nat.})$. More specifically, the $\mu_n$-torsors should correspond to pairs $(\mathcal{F},\iota)$ of a line bundle on $X_\mathrm{\acute{e}t}$ and a morphism $\iota:\mathcal{F}^{\otimes n}\to\mathcal{O}_X$ which is étale locally an isomorphism. But, from the section on the Picard group we know that this implies $\mathcal{F}$ is really a line bundle on $X$, and since $\iota$ is defined on $X_\mathrm{zar}$ and is étale locally an isomorphism it is an isomorphism. Thus, we see that $H^1_\mathrm{\acute{e}t}(X,\mu_n)$ is in bijective correspondence with $\mathcal{G}$.

So, now, what are the geometric structures associated to a pair $(\mathscr{L},\iota)$? Associated to such a pair is an $X$-scheme which we will denote $\mathrm{Sp}(\mathscr{L},\iota)$. It will be a relatively affine $X$-scheme, and thus, to specify it, we need only specify a quasicoherent $\mathcal{O}_X$-algebra. This is the algebra defined by

$\mathcal{A}_{(\mathscr{L},\iota)}:=\mathcal{O}_X\oplus\mathscr{L}\oplus\cdots\oplus\mathscr{L}^{\otimes(n-1)}$

where the multiplication is defined in the usual way except we cycle powers at or past the $n^{\text{th}}$-power back into the range $[0,n-1]$ using the isomorphism $\iota$.

By construction, $\mathrm{Sp}(\mathscr{L},\iota)\to X$ is a finite map. We claim, in fact, that it is étale. Indeed, consider an affine open $U=\mathrm{Spec}(A)$ in $X$ with $\mathscr{L}\mid_U\cong \mathcal{O}_U$. Write $\mathscr{L}\mid_U=\widetilde{M}$ (for some $A$-module $M$), let $e\in M$ be a generator (the image of $1$ under an isomorphism $A\to M$), and let $\xi=\iota(e^{\otimes n})\in A^\times$. Then, one can check that $\mathcal{A}_{(\mathscr{L},\iota)}(U)\cong A[z]/(z^n-\xi)$, with $e$ corresponding to $z$, and thus is étale over $U$ since $n$ is invertible.

Moreover, note that the $T$-points of $\mathrm{Sp}(\mathscr{L},\iota)\mid_U\to U$ (for a $U$-scheme $T$) are just $\{r\in \mathcal{O}_T(T):r^n=\xi\}$. Thus, we have a natural action of $\mu_{n,U}$ on $\mathrm{Sp}(\mathscr{L},\iota)\mid_U$ by multiplication. One can check that this action glues over the various $U$, thus providing an action of $\mu_{n,X}$ on $\mathrm{Sp}(\mathscr{L},\iota)$. Moreover, one can check that this actually makes $\mathrm{Sp}(\mathscr{L},\iota)$ a principal homogenous space for $\mu_n$.

Then, one claims that $(\mathscr{L},\iota)\mapsto \mathrm{Sp}(\mathscr{L},\iota)$ defines a bijection between $\mathcal{G}$ and the principal homogenous spaces over $\mu_n$. This follows fairly formally from the observation we have already made that $\mathcal{G}$ is in bijective correspondence with $H^1_\mathrm{\acute{e}t}(X,\mu_n)$ together with Theorem 2.
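Locally the whole construction can be made completely explicit. The following sketch assumes $U=\mathrm{Spec}(A)$ trivializes $\mathscr{L}$, with a chosen generator $e$ of $\mathscr{L}(U)$ and the unit $\xi:=\iota(e^{\otimes n})\in A^\times$; both $e$ and $\xi$ are choices made for this illustration.

```latex
\begin{aligned}
e^{\otimes i}\cdot e^{\otimes j} &=
\begin{cases}
e^{\otimes (i+j)} & \text{if } i+j<n,\\
\xi\, e^{\otimes (i+j-n)} & \text{if } i+j\geqslant n,
\end{cases}
\qquad\text{so}\qquad
\mathcal{A}_{(\mathscr{L},\iota)}(U)\cong A[z]/(z^n-\xi),\quad e\mapsto z,\\
\zeta\cdot r &= \zeta r \quad\text{for } \zeta\in\mu_n(T),\ r\in\mathcal{O}_T(T) \text{ with } r^n=\xi,\\
r^n=r'^n=\xi &\implies \left(r'/r\right)^n=1 \implies r'=\zeta r \text{ for a unique } \zeta\in\mu_n(T).
\end{aligned}
```

The last line uses that $r$ is a unit (since $r^n=\xi$ is), and is the statement that the action is simply transitive wherever a root exists.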

Finally, let us remark that under the identification of $H^1_\mathrm{\acute{e}t}(X,\mu_n)$ with $\mathcal{G}$ we can naturally interpret the short exact sequence in Theorem 1. Namely, the map $\mathcal{G}\to\mathrm{Pic}(X)[n]$ is merely $(\mathscr{L},\iota)\mapsto\mathscr{L}$. It’s clear that this map is surjective. Now, suppose that $(\mathscr{L},\iota)$ is in the kernel of this map. Then, $\mathscr{L}\cong\mathcal{O}_X$. But, evidently all the maps $\iota$ on $\mathcal{O}_X$ correspond to multiplication by an element of $\mathcal{O}_X(X)^\times$, and two such maps define isomorphic pairs if and only if they differ by an $n^\text{th}$ power in $\mathcal{O}_X(X)^\times$. Thus we recover the sequence from Theorem 1 in this more geometric context:

$1\to \mathbf{G}_m(X)/\mathbf{G}_m(X)^n\xrightarrow{s\mapsto (\mathcal{O}_X,m_s)}\mathcal{G}\xrightarrow{(\mathscr{L},\iota)\mapsto\mathscr{L}}\mathrm{Pic}(X)[n]\to 1$

where $m_s$ is the multiplication by $s$ map.
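The description of the kernel can be checked directly. In this sketch $s,t\in\mathbf{G}_m(X)$ are two units and $u\in\mathbf{G}_m(X)$ is a candidate isomorphism $\mathcal{O}_X\to\mathcal{O}_X$; we identify $\mathcal{O}_X^{\otimes n}$ with $\mathcal{O}_X$ in the natural way.

```latex
\begin{aligned}
u:(\mathcal{O}_X,m_s)\xrightarrow{\ \approx\ }(\mathcal{O}_X,m_t)
&\iff m_t\circ u^{\otimes n}=m_s
\iff t\,u^n=s,\\
(\mathcal{O}_X,m_s)\cong(\mathcal{O}_X,m_t)
&\iff s/t\in\mathbf{G}_m(X)^n .
\end{aligned}
```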

# Some applications

Let us now discuss some nice cases in which this theorem can be applied. In particular, let us examine the extreme cases where one of the outer terms in the sequence in Theorem 1 is zero. We shall make the blanket assumption that $n$ is invertible on $X$, and that $X$ is connected.

## Case 1

First, let’s assume that $\mathbf{G}_m(X)/\mathbf{G}_m(X)^n=0$. The prototypical example we will consider is when $X$ is an integral proper variety over an algebraically closed field $k$ whose characteristic is coprime to $n$. Then,

$\mathbf{G}_m(X)/\mathbf{G}_m(X)^n=k^\times/(k^\times)^n=0$

Moreover, since $\mu_n\cong \mathbb{Z}/n\mathbb{Z}$ (after choosing a primitive $n^\text{th}$ root of unity in $k$) we may conclude from Theorem 1 that

$\mathrm{Hom}_\mathrm{cont.}(\pi_1(X,\overline{x}),\mathbb{Z}/n\mathbb{Z})=H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}/n\mathbb{Z})=\mathrm{Pic}(X)[n]$

Moreover, since the map $\mu_{\ell^{n+1}}\to \mu_{\ell^n}$ is multiplication by $\ell$ this corresponds on cohomology to the map $\mathrm{Pic}(X)[\ell^{n+1}]\to \mathrm{Pic}(X)[\ell^n]$, the multiplication by $\ell$ map. Thus, in fact, we see that

$H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}_\ell)=\varprojlim \mathrm{Pic}(X)[\ell^n]$

where the transition maps on the right are the multiplication by $\ell$ maps.
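To see where the multiplication by $\ell$ comes from, one can compare the two Kummer sequences. The diagram below is a sketch of this comparison; its commutativity is just the identity $(x^\ell)^{\ell^n}=x^{\ell^{n+1}}$.

```latex
\begin{array}{ccccccccc}
1 & \to & \mu_{\ell^{n+1}} & \to & \mathbf{G}_m & \xrightarrow{\ x\mapsto x^{\ell^{n+1}}\ } & \mathbf{G}_m & \to & 1\\
 & & \downarrow{\scriptstyle x\mapsto x^\ell} & & \downarrow{\scriptstyle x\mapsto x^\ell} & & \downarrow{\scriptstyle \mathrm{id}} & & \\
1 & \to & \mu_{\ell^{n}} & \to & \mathbf{G}_m & \xrightarrow{\ x\mapsto x^{\ell^{n}}\ } & \mathbf{G}_m & \to & 1
\end{array}
```

The map $H^1_\mathrm{\acute{e}t}(X,\mu_{\ell^k})\to\mathrm{Pic}(X)$ is induced by the inclusion into the left-hand copy of $\mathbf{G}_m$, on which the vertical map is $x\mapsto x^\ell$. On $\mathrm{Pic}(X)=H^1_\mathrm{\acute{e}t}(X,\mathbf{G}_m)$ this induces $\mathscr{L}\mapsto\mathscr{L}^{\otimes\ell}$, i.e. multiplication by $\ell$.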

Let us now consider some specific examples:

• Let $C/k$ be an integral smooth projective curve of genus $g$. We know that if $J$ is the Jacobian of $C$ then this is an abelian variety over $k$ of dimension $g$, and $\mathrm{Pic}(C)\cong \mathbb{Z}\times J(k)$. Thus,

$H^1_\mathrm{\acute{e}t}(C,\mathbb{Z}/\ell^n)=\mathrm{Pic}(C)[\ell^n]=J(k)[\ell^n]=(\mathbb{Z}/\ell^n)^{2g}$

and thus we see that

$H^1(C,\mathbb{Z}_\ell)=\varprojlim H^1_\mathrm{\acute{e}t}(C,\mathbb{Z}/\ell^n\mathbb{Z})=\varprojlim \left(\mathbb{Z}/\ell^n\mathbb{Z}\right)^{2g}=\mathbb{Z}_\ell^{2g}$

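• Similarly, if $A/k$ is an abelian variety, then we have, ignoring the part of the fundamental group at $\mathrm{char}(k)$ (which is invisible to $\mathbb{Z}/\ell^n\mathbb{Z}$-coefficients for $\ell\ne\mathrm{char}(k)$), the equality

$\displaystyle \pi_1(A,\overline{x})\cong \prod_{p\ne\mathrm{char}(k)}T_p A$

and thus we see that

$\displaystyle H^1(A,\mathbb{Z}/\ell^n\mathbb{Z})=\mathrm{Hom}_\mathrm{cont.}\left(\prod_{p\ne\mathrm{char}(k)}T_p A,\mathbb{Z}/\ell^n\mathbb{Z}\right)=\mathrm{Hom}_\mathrm{cont.}\left(T_\ell A,\mathbb{Z}/\ell^n\mathbb{Z}\right)\cong(\mathbb{Z}/\ell^n)^{2g}$

if $\dim A=g$. Thus, we can see that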

$H^1_\mathrm{\acute{e}t}(A,\mathbb{Z}_\ell)=\mathbb{Z}_\ell^{2g}$

Then, using the fact that $H^\ast(A,\mathbb{Z}_\ell)$, with the comultiplication coming from the multiplication on $A$, is a connected cocommutative Hopf algebra it follows from the Hopf-Borel theorem that

$H^i(A,\mathbb{Z}_\ell)=\mathbb{Z}_\ell^{\binom{2g}{i}}$

• As a final example, we can make an interesting observation about Fano varieties. Namely, let us call a smooth projective variety $X/k$, where now we assume that $k$ is of characteristic $0$, Fano if its canonical bundle is antiample. This means that if $\omega_X$ is the canonical bundle of $X$ over $k$, then $\omega_X^{-1}$ is ample. We claim that if $X$ is Fano, then $\mathrm{Pic}(X)$ is torsion-free.

Indeed, suppose that $\mathrm{Pic}(X)[m]$ is non-zero for some $m$. By standard group theory we may assume that $m=\ell$ is prime. Then, by the above we know that $H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}/\ell\mathbb{Z})$ is non-zero, so that there is a connected finite étale cover $X'\to X$ of degree $\ell$. Since the canonical bundle of $X'$ is the pullback of $\omega_X$, and the pullback of an ample bundle along a finite map is ample, $X'$ is again Fano. Now, it follows from the ampleness of $\omega_X^{-1}$ and the Kodaira-Nakano-Akizuki vanishing theorem that $H^i(X,\mathcal{O}_X)=0$ for all $i>0$, and similarly for $X'$, so that $\chi(\mathcal{O}_X)=\chi(\mathcal{O}_{X'})=1$. But, Euler characteristics are multiplicative in finite étale covers, so $\chi(\mathcal{O}_{X'})=\ell\chi(\mathcal{O}_X)=\ell$, which is a contradiction.

Note also that our assumptions ensure that all finite cyclic Galois covers $X'\to X$ of degree dividing $n$ are of the form $\mathrm{Sp}(\mathscr{L},\iota)$. Indeed, this follows from our explicit description of the map $\mathcal{G}\to\mathrm{Pic}(X)[n]$ from the last section (which is now an isomorphism!) and the fact that $\mathcal{G}\cong \mathrm{Hom}_\mathrm{cont.}(\pi_1(X,\overline{x}),\mathbb{Z}/n\mathbb{Z})$.

## Case 2

Let us now consider the case when $\mathrm{Pic}(X)[n]=0$.

Theorem 1 then implies that we have an isomorphism

$H^1_\mathrm{\acute{e}t}(X,\mu_n)\cong \mathbf{G}_m(X)/\mathbf{G}_m(X)^n$

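But, of course, we can’t make the connection between this result and $\ell$-adic cohomology unless we assume further that $X$ is actually a scheme over $\mathbb{Z}[\frac{1}{n}][\zeta]$ where $\zeta$ is a primitive $n^\text{th}$ root of unity, so that $\mu_n\cong\mathbb{Z}/n\mathbb{Z}$ on $X$. If this holds, together with $\mathrm{Pic}(X)[n]=0$, for every $n$ which is a power of $\ell$, then we have the isomorphism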

$H^1_\mathrm{\acute{e}t}(X,\mathbb{Z}_\ell)\cong \varprojlim \mathbf{G}_m(X)/\mathbf{G}_m(X)^{\ell^n}$

In particular, let’s think about the case when $X=\mathrm{Spec}(K)$ with $K$ a field. Then we know that

$H^1_\mathrm{\acute{e}t}(\mathrm{Spec}(K),\mu_n)=H^1(G_K,\mu_n(\overline{K}))=K^\times/(K^\times)^n$

So, if $K$ happens to have the $n^\text{th}$-roots of unity we see that

$\mathrm{Hom}_\mathrm{cont.}(G_K,\mathbb{Z}/n\mathbb{Z})=K^\times/(K^\times)^n$

But, of course, elements of $\mathrm{Hom}_\mathrm{cont.}(G_K,\mathbb{Z}/n\mathbb{Z})$ correspond to the cyclic Galois extensions of $K$ of degree $d\mid n$ (together with an embedding of the Galois group into $\mathbb{Z}/n\mathbb{Z}$). In fact, the association, following the general geometric picture of the last section, is just that $a\in K^\times/(K^\times)^n$ maps to $K(\sqrt[n]{a})/K$, whose Galois group is a subgroup of $\mu_n(K)\cong\mathbb{Z}/n\mathbb{Z}$.
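As a sanity check, here is the simplest instance, $K=\mathbb{Q}$ and $n=2$ (so that $\mu_2=\{\pm 1\}\subseteq\mathbb{Q}$). The character $\chi_d$ is notation introduced for this example, and $d$ runs over squarefree integers representing the classes.

```latex
\begin{aligned}
\mathbb{Q}^\times/(\mathbb{Q}^\times)^2 &\cong \{\pm 1\}\times\bigoplus_{p\ \text{prime}}\mathbb{Z}/2\mathbb{Z},\\
d &\longmapsto \mathbb{Q}(\sqrt{d})/\mathbb{Q},\\
\chi_d(\sigma) &= \frac{\sigma(\sqrt{d})}{\sqrt{d}}\in\mu_2(\mathbb{Q})=\{\pm 1\}\cong\mathbb{Z}/2\mathbb{Z}.
\end{aligned}
```

The class $d=1$ corresponds to the trivial character (and the trivial extension), while every other class gives a genuine quadratic field.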

More generally, assuming that $X$ is actually a scheme over $\mathbb{Z}[\frac{1}{n}][\zeta]$ we see again that all finite cyclic Galois covers of $X$ of degree dividing $n$ correspond to adjoining an $n^\text{th}$-root of a unit in $\mathcal{O}_X(X)$, generalizing ‘usual Kummer theory’.

# The weak Mordell-Weil theorem

We now begin moving towards the proof of the weak Mordell-Weil theorem: that if $K$ is a number field and $A/K$ is an abelian variety, then $A(K)/mA(K)$ is finite for any $m\geqslant 1$. But, before we actually get to the main meat of the proof, we’ll need to record some other facts of interest which will be useful in streamlining the exposition.

## The Hochschild-Serre spectral sequence

It will be useful for us to discuss the Hochschild-Serre spectral sequence. Roughly, this is a spectral sequence which allows us to compare the étale cohomology between the domain and codomain of a finite Galois cover $\pi:X'\to X$. Namely, we have the following:

Theorem 3 (Hochschild-Serre spectral sequence): Let $\pi:X'\to X$ be a finite Galois cover with Galois group $G$. Then, for any abelian sheaf $\mathcal{F}$ on $X_\mathrm{\acute{e}t}$ there is a spectral sequence

$H^p(G,H^q(X',\mathcal{F}))\implies H^{p+q}(X,\mathcal{F})$

There are a couple of things which should be clarified here. First, the cohomology on the left is group cohomology. Second, the action of $G$ on $H^q(X',\mathcal{F})$ is the one induced by the action of $G$ on $X'$. Finally, $\mathcal{F}$ on $X'_\mathrm{\acute{e}t}$ is just the restriction of $\mathcal{F}$ to $X'_\mathrm{\acute{e}t}$ which, since $X'$ is an étale cover of $X$, is nothing fancy.

This theorem will be useful in proving the weak Mordell-Weil theorem since, for example, the above shows that if $H^p(G,H^q(X_\mathrm{\acute{e}t}',\mathcal{F}))$ is finite for all $p+q=\ell$ then $H^\ell(X,\mathcal{F})$ is finite and, in fact,

$\displaystyle \left|H^\ell(X_\mathrm{\acute{e}t},\mathcal{F})\right|\leqslant \prod_{p+q=\ell}\left|H^p(G,H^q(X'_\mathrm{\acute{e}t},\mathcal{F}))\right|$

which follows from the general machinery of spectral sequences (in fact, the definition of convergence).

This will allow us to essentially move to arbitrary finite Galois covers in our pursuit of showing that any particular étale cohomology group is finite.
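For the case we will actually use, $\ell=1$, this can be made completely explicit. As a sketch, using only the standard exact sequence of low-degree terms of a first-quadrant spectral sequence, one has

$0\to H^1(G,\mathcal{F}(X'))\to H^1(X_\mathrm{\acute{e}t},\mathcal{F})\to H^1(X'_\mathrm{\acute{e}t},\mathcal{F})^G\to H^2(G,\mathcal{F}(X'))$

so that $H^1(X_\mathrm{\acute{e}t},\mathcal{F})$ is an extension of a subgroup of $H^1(X'_\mathrm{\acute{e}t},\mathcal{F})^G$ by $H^1(G,\mathcal{F}(X'))$. In particular,

$\displaystyle \left|H^1(X_\mathrm{\acute{e}t},\mathcal{F})\right|\leqslant \left|H^1(G,\mathcal{F}(X'))\right|\cdot\left|H^1(X'_\mathrm{\acute{e}t},\mathcal{F})^G\right|$

and the left hand side is finite as soon as both groups on the right are.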

Let us not prove that this sequence exists, but give an indication of how one might go about proving it. Namely, note that since $\pi:X'\to X$ is finite Galois, we have $\mathcal{F}(X')^G=\mathcal{F}(X)$ for any sheaf $\mathcal{F}$ on $X_\mathrm{\acute{e}t}$. Thus, we see that the global sections functor on $X_\mathrm{\acute{e}t}$ is nothing more than the composition

$\Gamma(X,-)=(-)^G\circ \Gamma(X',-)\circ \pi^{-1}$

where $\pi^{-1}$ is the usual pullback functor. But, note that since $\pi^{-1}$ is exact, in the derived world we can ignore it.

Thus, we see that if we can show that $H^p(G,\mathcal{I}(X'))=0$ for any injective sheaf $\mathcal{I}$ on $X_\mathrm{\acute{e}t}$ and any $p>0$, then from the general sledgehammer of the Grothendieck spectral sequence we’ll have

$R^p\left((-)^G\right)\left(R^q\Gamma(X',-)(\mathcal{F})\right)\implies R^{p+q}\Gamma(X,-)(\mathcal{F})$

which, upon rewriting using usual cohomological notation, is just the stated version of the Hochschild-Serre spectral sequence.

To prove that this condition holds, note that $H^p(G,\mathcal{I}(X'))$ is just the Čech cohomology of $\mathcal{I}$ for the étale cover $\{X'\to X\}$. But, the Čech cohomology of an injective sheaf with respect to any cover vanishes in positive degrees, giving the result.
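To see where this identification comes from, here is a sketch. Write $X'^{(k+1)}$ for the $(k+1)$-fold fiber product of $X'$ over $X$ (a shorthand used only in this paragraph). Since $X'\to X$ is Galois with group $G$, the map

$\displaystyle \coprod_{(g_1,\ldots,g_k)\in G^k}X'\to X'^{(k+1)},\qquad x\mapsto (x,g_1x,\ldots,g_kx)\ \text{on the component indexed by }(g_1,\ldots,g_k)$

is an isomorphism. Hence, for any sheaf $\mathcal{I}$,

$\mathcal{I}(X'^{(k+1)})=\mathrm{Map}(G^k,\mathcal{I}(X'))$

which is exactly the group of $k$-cochains of $G$ with values in $\mathcal{I}(X')$. With suitable conventions, the Čech differential corresponds under this identification to the usual differential computing group cohomology.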

## Grothendieck’s lifting lemma

Another fact which we will need is that any abelian variety $A/K$ lifts to an abelian scheme $\mathscr{A}/U$ for some dense open $U\subseteq\mathrm{Spec}(\mathcal{O}_K)$. One can proceed by the general theory of Néron models over Dedekind schemes, but this is much more than we need here. We mention here a method which bypasses this.

First, begin by noting that $A/K$ lifts to some smooth projective scheme $\mathscr{A}/U$, for some dense open $U\subseteq\mathrm{Spec}(\mathcal{O}_K)$. To see this, choose a projective embedding of $A$ into $\mathbb{P}^n_K$. Then, by examining the equations defining $A$ as a subscheme, we can clearly define $A$ in $\mathbb{P}^n_V$ where $V$ is the open subset obtained by removing the finitely many primes appearing in the denominators of the coefficients (or just clear the denominators!). This might not actually be smooth over $V$. But, the locus where the Jacobian criterion fails is closed and misses the generic fiber (since $A$ is smooth over $K$), so its image in $V$ is a finite set of primes. By removing these we can shrink $V$ even further to obtain a smooth projective scheme over some $U$ dense open in $\mathrm{Spec}(\mathcal{O}_K)$.
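To make the shrinking step concrete, consider the special case (chosen here purely for illustration) of an elliptic curve over $\mathbb{Q}$ given by $y^2=x^3+ax+b$ with $a,b\in\mathbb{Z}$. The same equation defines a projective scheme over $\mathbb{Z}$, and this particular model is smooth exactly away from the primes dividing $\Delta=-16(4a^3+27b^2)$. The sketch below lists the primes one would remove; the helper names `prime_factors` and `primes_to_remove` are hypothetical, and trial division is only sensible for small inputs.

```python
def prime_factors(n: int) -> list[int]:
    """Distinct prime factors of a nonzero integer, by trial division."""
    n = abs(n)
    primes = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def primes_to_remove(a: int, b: int) -> list[int]:
    """Primes at which the model y^2 = x^3 + a*x + b over Z is not smooth."""
    disc = -16 * (4 * a**3 + 27 * b**2)
    if disc == 0:
        raise ValueError("singular cubic: not an elliptic curve over Q")
    return prime_factors(disc)


print(primes_to_remove(-1, 0))  # the curve y^2 = x^3 - x
print(primes_to_remove(0, 1))   # the curve y^2 = x^3 + 1
```

For $y^2=x^3-x$ one has $\Delta=64$, so this model is smooth over $U=\mathrm{Spec}(\mathbb{Z}[\frac{1}{2}])$. Note that this only says where the chosen equation is smooth; a different model of the same curve may be smooth over a larger open set.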

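Moreover, note that the identity section $e:\mathrm{Spec}(K)\to A$ defines a point of $A(K)$ which, by the valuative criterion for properness (since $U$ is a Dedekind scheme!), lifts to a map $e:U\to \mathscr{A}$. We claim that this map is a section. But, this follows immediately from separatedness: the composite $U\to\mathscr{A}\to U$ and the identity of $U$ agree on the dense point $\mathrm{Spec}(K)$ of the reduced scheme $U$, and so they are equal.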

Thus, we see that for some dense open $U\subseteq\mathrm{Spec}(\mathcal{O}_K)$ we have a smooth projective scheme $\mathscr{A}\to U$ together with a section $e:U\to\mathscr{A}$ such that this pair restricted to $K$ is just $A/K$ with the identity section.

Thus, to finish we appeal to the following amazing theorem of Grothendieck:

Theorem 4 (Grothendieck): Let $X\to S$ be a smooth projective morphism together with a section $e:S\to X$, such that $S$ is a connected locally Noetherian scheme. Assume that for some geometric point $\overline{s}$ of $S$ the pair $X_{\overline{s}}\to \overline{s}$ and $e_{\overline{s}}:\overline{s}\to X_{\overline{s}}$ is an abelian variety. Then $X\to S$ has a unique structure of an abelian scheme with $e$ as its identity section.

The proof of this is surprisingly easy, and essentially proceeds by bootstrapping, first from the infinitesimal case (when $S$ is a nilpotent thickening of $\overline{s}$) and then from the case of a DVR, to the whole of $S$. A full proof can be found on page 124 of Mumford’s Geometric Invariant Theory.

Thus, we see that our projective scheme $\mathscr{A}\to U$ actually has the unique structure of an abelian scheme with $e:U\to\mathscr{A}$ as its identity section.

## The proof of the weak Mordell-Weil theorem

So, let us now actually show that if $A/K$ is an abelian variety, where $K$ is a number field, then $A(K)/mA(K)$ is finite.

We begin by noting that, as shown in the last section, we have a dense open $U\subseteq\mathrm{Spec}(\mathcal{O}_K)$ and an abelian scheme $\mathscr{A}\to U$ lifting $A/K$. Moreover, by passing to an even smaller open inside of $U$, we may as well assume that $m$ is invertible on $U$.

So, begin by considering the sequence

$0\to\mathscr{A}[m]\to\mathscr{A}\to\mathscr{A}\to 0$

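(which one can think about as a version of the Kummer sequence!) of sheaves on $U_\mathrm{\acute{e}t}$, where the second map is the multiplication by $m$ map. This second map is surjective as a map of étale sheaves since, $m$ being invertible on $U$, multiplication by $m$ on $\mathscr{A}$ is a finite étale surjection.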

So, take the long exact sequence in cohomology to obtain the piece

$0\to\mathscr{A}[m](U)\to\mathscr{A}(U)\to\mathscr{A}(U)\to H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])$

In particular, we obtain an injection $\mathscr{A}(U)/m\mathscr{A}(U)\hookrightarrow H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])$. But, since $\mathscr{A}\to U$ is proper, and $U$ is a Dedekind scheme, we know that $\mathscr{A}(U)=\mathscr{A}(K)=A(K)$ and thus we have an injection $A(K)/mA(K)\hookrightarrow H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])$. Thus, we are reduced to showing that $H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])$ is finite.
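It may help to see what this injection looks like after restricting to the generic point, where it becomes the classical Kummer map in Galois cohomology. As a sketch, write $G_K=\mathrm{Gal}(\overline{K}/K)$. Given $P\in A(K)$, choose $Q\in A(\overline{K})$ with $mQ=P$. Then

$\sigma\mapsto \sigma(Q)-Q$

is a $1$-cocycle on $G_K$ with values in $A[m](\overline{K})$, since $m(\sigma(Q)-Q)=\sigma(P)-P=0$. Its class in $H^1(G_K,A[m](\overline{K}))$ does not depend on the choice of $Q$, and it is trivial if and only if $P\in mA(K)$. The point of working over $U$ instead of $K$ is that $H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])$ has a chance of being finite, whereas $H^1(G_K,A[m](\overline{K}))$ is typically infinite.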

Next, we claim that it suffices to prove this result under the assumption that $U$ is a scheme over which $\mathscr{A}[m]$ becomes constant. Indeed, let us note that since $\mathscr{A}[m]\to U$ is a finite étale group scheme, there exists some finite Galois cover $U'\to U$, say with Galois group $G$, such that $\mathscr{A}[m]|_{U'}$ is constant. Note then that by the Hochschild-Serre spectral sequence

$\displaystyle \left|H^1_\mathrm{\acute{e}t}(U,\mathscr{A}[m])\right|\leqslant \left|H^1(G,H^0_\mathrm{\acute{e}t}(U',\mathscr{A}[m]))\right|\cdot\left|H^0(G,H^1_\mathrm{\acute{e}t}(U',\mathscr{A}[m]))\right|$

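But, $\mathscr{A}[m](U')=A[m](L)$ for some finite extension $L/K$, and this is a finite group. Thus the group cohomology $H^1(G,A[m](L))$ is finite. Moreover, $H^0(G,H^1_\mathrm{\acute{e}t}(U',\mathscr{A}[m]))$ is a subgroup of $H^1_\mathrm{\acute{e}t}(U',\mathscr{A}[m])$. Thus, it suffices to show that $H^1_\mathrm{\acute{e}t}(U',\mathscr{A}[m])$ is finite, as desired.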

By similar reasoning, appealing again to the Hochschild-Serre spectral sequence, we may assume that $U'$ is a scheme over $\mathbb{Z}[\frac{1}{m}][\zeta]$ where $\zeta$ is a primitive $m^\text{th}$ root of unity.

Now, $\mathscr{A}[m]|_{U'}$ is a constant group scheme of order $m^{2g}$, where $g=\dim A$. Thus, it is a product of constant cyclic group schemes of the form $\underline{\mathbb{Z}/d\mathbb{Z}}$ for $d\mid m$. Thus, it suffices to show that $H^1_\mathrm{\acute{e}t}(U',\mathbb{Z}/d\mathbb{Z})$ is finite. But, since $U'$ is a scheme over $\mathbb{Z}[\frac{1}{m}][\zeta]$, Theorem 1 implies that we have a short exact sequence

$1\to\mathbf{G}_m(U')/\mathbf{G}_m(U')^d\to H^1_\mathrm{\acute{e}t}(U',\mathbb{Z}/d\mathbb{Z})\to \mathrm{Pic}(U')[d]\to 1$

But, every connected finite étale cover of $U$ is the spectrum of a ring of $S$-integers in some finite extension $L$ of $K$. In particular, the general form of Dirichlet’s unit theorem and the finiteness of the class group imply that both of the outer terms of this sequence are finite (see here for a proof) and thus the middle term is finite as well.
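As an illustration of this last step in the simplest possible case (a sanity check, not part of the proof), take $U'=\mathrm{Spec}(\mathbb{Z}[\frac{1}{2}])$ and $d=2$, where the primitive second root of unity is just $-1$. Since $\mathbb{Z}[\frac{1}{2}]$ is a PID we have $\mathrm{Pic}(U')=0$, and since $\mathbb{Z}[\frac{1}{2}]^\times=\{\pm 2^k:k\in\mathbb{Z}\}$ the sequence gives

$H^1_\mathrm{\acute{e}t}(U',\mathbb{Z}/2\mathbb{Z})\cong \mathbb{Z}[\tfrac{1}{2}]^\times/(\mathbb{Z}[\tfrac{1}{2}]^\times)^2=\{1,-1,2,-2\}$

a group of order $4$. The three nontrivial classes correspond to $\mathbb{Q}(i)$, $\mathbb{Q}(\sqrt{2})$ and $\mathbb{Q}(\sqrt{-2})$, which are exactly the quadratic fields unramified outside $2$. In general, writing $r_1$ and $r_2$ for the number of real and complex places of $L$ and $S$ for the finite set of primes removed (notation introduced only for this remark), the unit theorem says

$\mathcal{O}_{L,S}^\times\cong \mu(L)\times \mathbb{Z}^{r_1+r_2+|S|-1}$

which is finitely generated, so its quotient by $d^\text{th}$ powers is finite.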