p-divisible groups, formal groups, and the Serre-Tate theorem

In this post we discuss the basic theory of p-divisible groups, their relationship to formal groups, and the Serre-Tate theorem.


The need/desire for p-divisible groups

When one studies the arithmetic of elliptic curves (or, more generally, abelian varieties) there is an object of paramount importance which rears its head in almost all aspects of the theory. I speak, of course, of the \ell-adic Tate module of an elliptic curve E over a field K, defined as follows:

T_\ell(E):=\varprojlim E[\ell^n](\overline{K})

where E[\ell^n] is the kernel (a group scheme) of the multiplication map

[\ell^n]:E\to E

and the transition maps are the multiplication by \ell maps

\ell:E[\ell^{n+1}]\to E[\ell^n]

Note that T_\ell(E) is a \mathbb{Z}_\ell-module but, more importantly, it is actually a \mathbb{Z}_\ell-representation of the absolute Galois group G_K, since G_K acts compatibly on the groups E[\ell^n](\overline{K}).

Now, the Tate module, despite its apparent ad hoc definition, contains a stupendous amount of information about E. As an example of this, we have the following amazing theorem of Néron-Ogg-Shafarevich (although perhaps more appropriately attributed to Serre and Tate):

Theorem(Néron-Ogg-Shafarevich): Let K be a p-adic local field, \ell\ne p a prime, and E/K an elliptic curve. Then, E has good reduction if and only if the G_K-representation T_\ell(E) is unramified.

Recall that E/K has good reduction if and only if there is an elliptic scheme \mathcal{E}/\mathcal{O}_K whose generic fiber is E. Thus, T_\ell(E) is able to detect the ability to lift E to an elliptic curve over the ring of integers!

Another incredible application of the Tate module is the following standard (but non-trivial) theorem of Tate:

Theorem(Tate’s isogeny theorem): Let E_1,E_2/\mathbb{F}_q be elliptic curves, and \ell a prime coprime to q. Then, E_1 and E_2 are isogenous if and only if V_\ell(E_1)\cong V_\ell(E_2) as G_{\mathbb{F}_q}-representations, where V_\ell(E_i):=T_\ell(E_i)\otimes_{\mathbb{Z}_\ell}\mathbb{Q}_\ell.

Thus, the (rational) Tate module captures the isogeny class of an elliptic curve over a finite field. Moreover, one can show that V_\ell(E_1)\cong V_\ell(E_2) if and only if \#E_1(\mathbb{F}_{q^r})=\#E_2(\mathbb{F}_{q^r}) for all r\geqslant 1, so the Tate module also captures the number of points of an elliptic curve over all finite extensions of \mathbb{F}_q.
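To see concretely why point counts and the Galois action on the Tate module determine one another, here is a sketch of the standard formula relating them. The symbols \alpha, \beta, and a are notation introduced only for this illustration.

```latex
% Assumed notation: \alpha, \beta denote the eigenvalues of the q-power Frobenius
% acting on T_\ell(E) \otimes_{\mathbb{Z}_\ell} \mathbb{Q}_\ell, for E/\mathbb{F}_q and \ell coprime to q.
\[
\alpha\beta = q, \qquad \alpha + \beta = a := q + 1 - \#E(\mathbb{F}_q),
\]
\[
\#E(\mathbb{F}_{q^r}) = (1-\alpha^r)(1-\beta^r) = q^r + 1 - (\alpha^r + \beta^r) \quad \text{for all } r \geqslant 1.
\]
```

So the point counts over all \mathbb{F}_{q^r} are governed by the pair \{\alpha,\beta\}, and already the single number \#E(\mathbb{F}_q) recovers the characteristic polynomial X^2-aX+q of Frobenius.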

Remark: The ad hocness of the definition of T_\ell(E) can be lessened greatly by the observation that it is equal to the \ell-adic homology H_1(E,\mathbb{Z}_\ell): the dual of the \ell-adic cohomology group H^1(E,\mathbb{Z}_\ell). See here for a proof of this. This makes it the spiritual replacement for the lattice \Lambda=H_1(E,\mathbb{Z}) for an elliptic curve E=\mathbb{C}/\Lambda.

This point of view also immediately gives one direction (the easy direction!) of the Néron-Ogg-Shafarevich theorem: if E/K has good reduction, then T_\ell(E) is unramified. This follows from the smooth proper base change theorem (an algebraic analogue of Ehresmann’s theorem).

Now, you will notice that in all of the above applications of the Tate module we assumed that \ell was different from some prime p. There is good reason for this, as illustrated by the following example. Let E/\mathbb{F}_q be a supersingular elliptic curve, where q=p^s for some s. Then, by the definition of supersingularity, E[p^n](\overline{\mathbb{F}_q})=0 for all n\geqslant 1, and consequently T_p(E)=0. Thus, the p-adic Tate module of E captures no information about E at all!
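As a small sanity check, one can exhibit a supersingular curve by hand. The sketch below assumes the standard criterion (not proved in this post) that E/\mathbb{F}_p is supersingular exactly when p divides a:=p+1-\#E(\mathbb{F}_p); the specific curve is chosen purely for illustration.

```latex
% Illustrative example: E : y^2 = x^3 - x over \mathbb{F}_3
% (the discriminant of x^3 - x is 4, which is non-zero in \mathbb{F}_3).
% Every x \in \mathbb{F}_3 satisfies x^3 - x = 0, so each x contributes exactly one point (x, 0).
\[
E(\mathbb{F}_3) = \{\infty, (0,0), (1,0), (2,0)\}, \qquad \#E(\mathbb{F}_3) = 4,
\]
\[
a = 3 + 1 - 4 = 0 \equiv 0 \pmod{3}.
\]
```

Under the assumed criterion this curve is supersingular, so E[3^n](\overline{\mathbb{F}_3})=0 for all n\geqslant 1 and T_3(E)=0, even though E certainly has interesting arithmetic.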

Remark: One can understand this issue in relation to the last remark. Namely, it’s still true that T_p(E) is the p-adic homology of E. But, as is often discussed, \ell-adic cohomology for varieties over k is only a reasonable notion when \ell differs from the characteristic of k!

In fact, for any elliptic curve E/\mathbb{F}_q, the p-adic Tate module is extremely simple. We saw above that if E is supersingular then T_p(E)=0, and if E is ordinary then T_p(E) is \mathbb{Z}_p with the action of some character. Thus, this was not just some peculiarity of supersingular elliptic curves—it really is a deficiency of the p-adic Tate module in characteristic p.

One may then begin to wonder whether or not there is a way to remedy this situation. Namely, the p-adic Tate module is clearly problematic in characteristic p, but perhaps there is another like-minded ‘p’-centric invariant which captures roughly the same information as T_\ell(E) for \ell\ne p.

But, before we go too far in the condemnation of the case \ell=p, let us note that for elliptic curves \mathcal{E}/\mathbb{Z}_p, the case \ell=p is actually capturing more information. Namely, note that for \ell\ne p (essentially as an application of smooth proper base change) one has that T_\ell(\mathcal{E}_{\mathbb{Q}_p})\cong T_\ell(\mathcal{E}_{\mathbb{F}_p}) in a way which intertwines the quotient map G_{\mathbb{Q}_p}\to G_{\mathbb{F}_p}. Thus, T_\ell is not strong enough to differentiate between (the generic fibers of) different lifts of an elliptic curve from \mathbb{F}_p to \mathbb{Z}_p. That said, note that by our above analysis, T_p(\mathcal{E}_{\mathbb{Q}_p}) is never isomorphic to T_p(\mathcal{E}_{\mathbb{F}_p}): the former has rank 2 over \mathbb{Z}_p while the latter has rank at most 1. Thus, we see that much more information is being retained in the case when \ell=p (with respect to lifts) than in the case when \ell\ne p.

So, one begins to wonder what is happening here. What about the case \ell=p causes this vast loss of information (resp. gain of information) in the case of elliptic curves over \mathbb{F}_p (resp. \mathbb{Z}_p)? The realization one makes is that the fatal inadequacy (resp. great boon) of T_p(E) for E an elliptic curve over \mathbb{F}_p (resp. \mathbb{Z}_p) is about the structure of the group scheme E[p^n]. Indeed, for \ell\ne p, the group scheme E[\ell^n] is étale, but E[p^n] is never étale. And, as a general rule of thumb (we will make this precise later), the geometric points of a non-étale finite flat group scheme do not capture much information, whereas étale group schemes are entirely determined by their geometric points (together with the Galois action on them).

Thus, we see that, perhaps, until we went and took geometric points the objects E[\ell^n] and E[p^n] were equally fertile invariants of E. This suggests that perhaps we should try and replace T_p(E) by the system \{E[p^n]\} of finite flat group schemes. Now, this system of group schemes forms a natural inductive system:

E[p^n]\hookrightarrow E[p^{n+1}]

such that, definitionally, E[p^m][p^n]=E[p^n] for any n\leqslant m, and the order of E[p^n], as a group scheme, is p^{2n}.

This is, in a very real sense, the prototypical example of a p-divisible group: an inductive system of finite flat group schemes in which each piece is the torsion of the next. As indicated, the goal of p-divisible groups is to capture what something like the p-adic Tate module misses.

The sort of amazing thing is that it’s the entire system \{E[p^n]\} that captures something deep. This is not isolated to the case of elliptic curves. Indeed, in general, p-divisible groups \{G(n)\} enjoy properties (as a system) never to be expected for the individual groups G(n). They are, in some sense, very rigid.

Formal groups

Another part of the story, which we hope to explain, is the theory of formal groups. Now, formal groups are actually extremely easy to motivate, and if it weren’t for the niceness of characteristic 0, they would most likely be a commonplace object in an undergraduate curriculum (and if one happened to learn Lie groups from Serre, they might already be!).

When one studies the theory of Lie groups, there is an uncanny amount of attention focused on the Lie algebra \mathfrak{g} of a Lie group G. This is for good reason. This linearization of G contains a startling amount of information about G. The most rigorous statement of this fact is the well-known equivalence of categories

\left\{\begin{matrix}\text{Simply connected real}\\\text{Lie groups}\end{matrix}\right\}\longleftrightarrow\left\{\text{real Lie algebras}\right\}

which allows one to completely reduce the study of (simply connected) Lie groups to the study of what is essentially linear algebra.

Now, one is tempted to try and translate this huge success to the theory of algebraic groups. This has measured success in characteristic 0 (e.g. unipotent groups are equivalent to finite-dimensional nilpotent Lie algebras) but falls flat on its face in positive characteristic. The reason stems from the tiny fact which gives characteristic p many of its technical woes:

\displaystyle \frac{d}{dx}x^p=0

which causes the Lie algebra to act in unexpected/undesirable ways. In short, it doesn’t contain as much information as it ‘ought to’.
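To illustrate how the vanishing of this derivative loses information, consider the Frobenius endomorphism of the additive group; this particular example is our choice and is not taken from the discussion above.

```latex
% Assumption for this sketch: k is a field of characteristic p and
% F : \mathbf{G}_a \to \mathbf{G}_a is the homomorphism x \mapsto x^p.
\[
F(x+y) = (x+y)^p = x^p + y^p = F(x) + F(y),
\]
\[
dF = d(x^p) = p\,x^{p-1}\,dx = 0 .
\]
```

So F is a non-zero homomorphism of algebraic groups whose induced map on Lie algebras is zero, the same map induced by the trivial homomorphism. The Lie algebra cannot tell the two apart.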

That said, we can remedy this situation by different means and obtain something which is the spiritual replacement for the Lie algebra. Namely, we think about the Lie algebra, the process of linearization, as ‘zooming in’ on the origin. But, in some sense, it ‘zooms in too much’, so that in characteristic p it gives up so much information that it ceases to be useful. Thus, an idea is to create some sort of intermediary step

G\leadsto ???\leadsto \mathfrak{g}

which should retain enough information about G as to warrant study, but which is sufficiently simpler so as to be studyable.

This missing object is, in some sense, the niche that formal groups fill. Namely, let’s now think about an algebraic group G/k (where k is some field). Let us proceed towards zooming in on the origin of G, but let us not yet specify what level of zoom we want to take. Namely, we might imagine taking some neighborhood U of the origin of G. Now, this is not going to be a subgroup in general, but if we multiply the elements of U together, we land inside of U^2, another neighborhood of the origin of G. Thus, if we don’t actually specify which neighborhood we are looking at, instead considering it as some sort of system, then we really can multiply.

But, we want to always be thinking towards the origin, and so we want to consider functions on this system to be equal if they are equal on a sufficiently small neighborhood of the origin. Unfortunately, just zooming in this sense will only produce the stalk \mathcal{O}_{G,e} which, due to the deficiency of the Zariski topology, is not far enough. What we want is to zoom in ‘analytically close’ meaning that we want the space which is describing the locality of the origin e whose functions just remember the Taylor series of a function on a Zariski open of e.

One easily sees that what we’re describing with this process of (analytically) ‘zooming in’ is just taking the ‘completion’ of G at the origin. Or, in other words, of considering the space \widehat{G}=\mathrm{Spec}(\widehat{\mathcal{O}_{G,e}}). And, as the above intuitive argument shows, this space/scheme is still an algebraic object (=group scheme). One which is arbitrarily zoomed in on e.
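For a concrete feel, here is a sketch of what this completion looks like for the two simplest algebraic groups. The coordinate T and the power series F(X,Y) are notation introduced for this illustration.

```latex
% Assumed setup: T is a coordinate vanishing at the identity e, so that
% \widehat{\mathcal{O}_{G,e}} = k[[T]], and the group law becomes a power series F(X,Y).
\[
G = \mathbf{G}_a:\quad T = x, \qquad F(X,Y) = X + Y,
\]
\[
G = \mathbf{G}_m:\quad T = x - 1, \qquad F(X,Y) = (1+X)(1+Y) - 1 = X + Y + XY .
\]
```

In both cases F(X,Y)=X+Y+(\text{higher order terms}), and the linear part X+Y is all that the Lie algebra sees; the remaining terms are the extra information kept by \widehat{G}.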

Remark: To get a more rigorous idea of what type of ‘local information’ \widehat{\mathcal{O}_{G,e}} is capturing, take a look at this post.

It turns out that this object is actually a sufficiently rich object to capture much of the properties of G itself while being, in some sense, simple enough to actually have interesting theory. The theory of formal groups, in some sense, is just the natural framework in which to discuss objects like \widehat{G}.

Note that much of the above description for our desire to fix some issue in characteristic p, to replace a linear algebra object which was not sufficient, parallels our discussion in the last section. This is not a coincidence. In some sense, the exact part of the data missing in T_p(E) which is made up for by the use of p-divisible groups can be described in terms of formal groups.

Remark: Thinking towards formal groups also gives one a reasonable sense as to why we care about inductive systems of finite flat group schemes, as opposed to projective systems. Namely, we think about the objects (G(n)) as capturing some amount of infinitesimal information near the identity of some object, and then the direct limit \varinjlim G(n) should be capturing ‘all infinitesimal information’ just like a formal group.

The Serre-Tate theorem

The last part of this post will be a discussion of the beautiful theorem of Serre and Tate. Before we state it rigorously, let us recall some basic terminology and goals from deformation theory.

The basic idea of deformation theory is to understand when an object ‘blah’ over some scheme lifts to an infinitesimal neighborhood of that scheme. When it doesn’t lift, one wants to understand the obstructions to lifting, and when it does lift, one wants to understand how the different liftings relate.

By ‘blah’, we could mean any sort of object, and by an infinitesimal neighborhood we mean something like an infinitesimal embedding S_0\hookrightarrow S. In other words, a closed embedding whose ideal of definition \mathcal{I} is nilpotent.

As an example, one might take a smooth scheme X_0/\mathrm{Spec}(\mathbb{F}_p) and ask if there is a sufficiently nice (=flat, or equivalently smooth) X/\mathrm{Spec}(\mathbb{Z}/p^2\mathbb{Z}) such that X_{\mathbb{F}_p}=X_0. We see that we’re asking if X_0, a scheme over the point \mathrm{Spec}(\mathbb{F}_p), can be lifted (in a sufficiently nice way) to the ‘thickened point’ (i.e. same point, but with more ‘fuzz’) \mathrm{Spec}(\mathbb{Z}/p^2\mathbb{Z}).

There are two main reasons that one might want to perform this sort of ‘lifting’. The first is one of practical importance. Namely, suppose that we have an elliptic curve E/\mathbb{F}_p. Then, we might wonder whether or not we can lift it to an elliptic scheme \mathcal{E}/\mathbb{Z}_p—for example, so we might compare it to \mathcal{E}_{\mathbb{Q}_p} bridging the gap between characteristic 0 and p. Now, intuitively, \mathrm{Spec}(\mathbb{Z}_p) is the ‘union’ of the ‘thickened points’ \mathrm{Spec}(\mathbb{Z}/p^n\mathbb{Z}), and thus a first question is whether we can lift E to elliptic schemes over these fat points. In other words, we want to understand the deformation theory of E/\mathbb{F}_p.

But, the second reason is much more conceptual. Namely, recall that if X/S is locally of finite presentation, then the infinitesimal lifting criterion of Grothendieck says that X/S is smooth if and only if for all S-schemes \mathrm{Spec}(A) and all ideals I of A with I^n=0 (for some n) the map

X_S(\mathrm{Spec}(A))\to X_S(\mathrm{Spec}(A/I))

is surjective.
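To see the criterion detect a singularity, here is a small worked example; the scheme X and the rings below are our own choice of illustration, taken over a field k.

```latex
% Illustrative choice: X = \mathrm{Spec}(k[x,y]/(xy)), A = k[\epsilon]/(\epsilon^3), I = (\epsilon^2), so I^2 = 0.
% The point (x,y) = (\epsilon,\epsilon) lies in X(A/I), since \epsilon\cdot\epsilon = 0 in A/I = k[\epsilon]/(\epsilon^2).
% Any lift to A has the form (\epsilon + a\epsilon^2, \epsilon + b\epsilon^2) with a, b \in k, and
\[
(\epsilon + a\epsilon^2)(\epsilon + b\epsilon^2) = \epsilon^2 + (a+b)\epsilon^3 + ab\,\epsilon^4 = \epsilon^2 \neq 0 \quad \text{in } k[\epsilon]/(\epsilon^3).
\]
```

So this point does not lift, and X(\mathrm{Spec}(A))\to X(\mathrm{Spec}(A/I)) is not surjective, matching the fact that the union of the two coordinate axes is singular at the origin.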

Suppose for a second that X/S is some sort of moduli space. So, for an S-scheme T we have that

X_S(T)=\left\{\text{`blah's over }T\right\}/\text{iso.}

Then, we see that asking whether

X_S(\mathrm{Spec}(A))\to X_S(\mathrm{Spec}(A/I))

is surjective, is precisely asking whether we can (at least in the affine case) always deform ‘blah’s over infinitesimal neighborhoods. Thus, this gives us the following slogan

Deformation theory for ‘blah’s is the study of smoothness for the moduli space of ‘blah’s

or, more correctly, it’s the study of the formal smoothness of the moduli space which, as we mentioned above, for things locally of finite presentation is the same thing as smoothness.

Now, some of the most important moduli spaces in number theory are moduli spaces of abelian varieties with extra structure (i.e. Shimura varieties of abelian type). As a well-known example of this, the modularity theorem of Wiles (Conrad-Diamond-Taylor-Breuil) is really a statement about elliptic curves over \mathbb{Q} being covered by X_0(N), the (coarse) moduli space of elliptic curves with a cyclic subgroup of order N. As another example, all known proofs of the local Langlands conjecture for \mathrm{GL}_n rely on a study of the PEL Shimura varieties of \mathbf{GU}(1,n-1)-type.

Thus, it seems prudent to understand the smoothness of these moduli spaces. As a first step, we’d want to understand the smoothness of the ‘moduli space of abelian varieties’ in general (i.e. without extra structure). In other words, we want to understand when abelian varieties deform, and how their deformations relate.

Remark: A slightly technical point, and the reason for the quotes around moduli space of abelian varieties, is that there is no such moduli space, at least insofar as there is no scheme representing the moduli problem. Rather, one is asking about the smoothness of a moduli stack. So, for example, understanding deformations of unadorned elliptic curves (i.e. elliptic curves with no extra data) amounts to proving that the stack \mathcal{M}_{1,1} is smooth.

Now, as we shall see, associated to any abelian variety A is a p-divisible group A[p^\infty]. Moreover, basically by functoriality, any deformation of some A_0/S_0 to an A/S is going to give a deformation of the p-divisible group A_0[p^\infty], namely the p-divisible group A[p^\infty]. The amazing theorem of Serre-Tate says that the deformation of just A_0[p^\infty] to A[p^\infty] is essentially the entire content of the deformation of A_0 to A. Said less cryptically, and in a form more conducive to remembering, one can summarize the theorem as

The deformation theory of an abelian scheme coincides with the deformation theory of its p-divisible group.

which is a handy slogan to keep in mind.

Not only is this theorem surprising (one only needs to care about deforming this tiny piece of A_0!), but it is also extremely useful, both practically and theoretically. The biggest reason for this is the famed theorem of Grothendieck-Messing which, although a bit complicated to state, essentially says that deforming p-divisible groups is purely a linear algebra problem (it’s the same thing as deforming the Lie algebra in the Dieudonné crystal of the p-divisible group). Thus, by the Serre-Tate theorem, deforming abelian varieties is also just a linear algebra problem!

Another highfalutin reason to care about the Serre-Tate theorem is that it says (at least intuitively) that the formal local geometry of Shimura varieties (roughly, moduli spaces of abelian varieties) is the same as the geometry of Rapoport-Zink spaces (moduli spaces of p-divisible groups). And, using the connection between connected p-divisible groups and p-divisible formal groups (discussed in the last section) we can even further the comparison to the geometry of Lubin-Tate spaces (the moduli space of formal groups).

Finally, we shall use the Serre-Tate theorem to explain how one’s linear algebra understanding of abelian varieties over \mathbb{C} (in terms of lattices/cohomology) might be extended to work over p-adic fields.

Finite flat group schemes

Definition and basic theorems

Before we begin in earnest we briefly recall some basic definitions and properties of finite flat group schemes. In particular, let S be a scheme, and G/S a group scheme. We call G/S a finite flat group scheme of order n if the structure morphism f:G\to S is locally free of rank n—recall this means that f_\ast\mathcal{O}_G is a vector bundle of rank n.

We call G/S commutative if it’s commutative as an abstract group scheme. All of our finite flat group schemes will be assumed commutative. We call G/S étale if the structure morphism is étale.

Remark: The apparent discord between the name ‘finite flat’ and the definition is resolved once one makes the following observation: G/S is finite flat if and only if f:G\to S is finite and flat in the case that S is Noetherian. Thus, this stronger-than-expected definition is only to take care of some annoying non-Noetherian situations. In particular, if S=\mathrm{Spec}(k) for a field k, then G/S is finite flat if and only if it’s finite in which case its order is \dim_k H^0(G,\mathcal{O}_G).

For a further discussion of this largely pedantic distinction, see this post.

Let us list some natural examples of finite flat group schemes.

  • Let S be any scheme, and let A/S be an abelian scheme of relative dimension g. Then, for any integer N\geqslant 1 the group scheme A[N]/S is finite flat (and locally of finite presentation) of order N^{2g}. Indeed, the morphism

    [N]:A\to A

    is classically shown to be finite. One way to show this is to observe that since A is proper, so is [N], and so it suffices to show that [N] is quasi-finite. But, this can be checked on geometric fibers. So, we may as well assume that S=\mathrm{Spec}(\overline{k}) for some field k, where finiteness is classical.

    To see that A[N]/S is flat, it suffices to show that [N] is flat. Suppose first that S is the spectrum of a field. Then A is regular (being smooth over a field) and, since the fibers of [N] are zero-dimensional (as just shown), one may conclude from miracle flatness that [N] is indeed flat. To reduce to this case one can use the fiberwise criterion for flatness, since A is flat over S.

    Finally, the order of A[N]/S is easily obtained to be N^{2 g} since this can be checked over geometric points of S where the result, once again, is classical.

  • Let S be any scheme, and consider the scheme \mu_n/S given by \underline{\mathrm{Spec}}(\mathcal{O}_S[T]/(T^n-1)) (where \underline{\mathrm{Spec}} denotes relative spectrum). This is easily verified to be finite flat of order n.
  • Consider the constant group scheme \underline{A} where A is any finite abelian group. Then, \underline{A}/S is easily verified to be a finite flat group scheme of order |A| (this is the reason for the name ‘order’).
  • Let S be any scheme such that p\mathcal{O}_S=0, where p is a prime. Then, one can define the group scheme \alpha_{p^n} by \underline{\mathrm{Spec}}(\mathcal{O}_S[T]/(T^{p^n})). One easily verifies that this is finite flat of order p^n.
  • Recall that, in general, the category of finite étale group schemes over S (a connected Noetherian scheme) is equivalent to the category of finite continuous \pi_1(S,\overline{s})-modules. The examples of this to keep in mind are the following.

    First, if S=\mathrm{Spec}(k), then this just says that the étale group schemes over \mathrm{Spec}(k) correspond canonically to the continuous finite Galois representations G_k\to\mathrm{Aut}(A) (where A is some finite abelian group given the discrete topology). Explicitly, an étale G/S corresponds to the G_k-module G(\overline{k}).

    Second, if S=\mathrm{Spec}(R) for some Henselian local ring R with residue field k, then the above just tells us that the étale group schemes G/S are in canonical correspondence with finite continuous \pi_1(S,\overline{s})-modules. But, there is a canonical isomorphism \pi_1(S,\overline{s})=G_k, and thus we see that the étale G/S correspond to the finite continuous G_k-modules. Explicitly, if G/S is an étale group scheme, then the associated G_k-module is G(R^\mathrm{ur}) (where R^\mathrm{ur} is the integral closure of R in the maximal unramified extension K^\mathrm{ur} of K:=\mathrm{Frac}(R)). This has an action of G_k since R^\mathrm{ur} does. So, for example, if R=\mathbb{Z}_p the above says that finite étale group schemes G/R correspond to continuous finite \widehat{\mathbb{Z}}-modules.
  • Let us call a finite flat group scheme G/S of multiplicative type if, étale locally on S, it is a finite product of group schemes of the form \mu_n.
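For concreteness, the group structures on the examples \mu_n and \alpha_{p^n} above can be written out on coordinate rings. This is a sketch in notation of our own (\Delta for the comultiplication, \varepsilon for the counit, \iota for the antipode).

```latex
% Assumed notation: \Delta = comultiplication, \varepsilon = counit, \iota = antipode.
% \mu_n, with coordinate ring \mathcal{O}_S[T]/(T^n-1):
\[
\Delta(T) = T\otimes T, \qquad \varepsilon(T) = 1, \qquad \iota(T) = T^{-1} = T^{n-1},
\]
% well defined since (T\otimes T)^n - 1 = T^n\otimes T^n - 1 = 0.
% \alpha_{p^n}, with coordinate ring \mathcal{O}_S[T]/(T^{p^n}) and p\mathcal{O}_S = 0:
\[
\Delta(T) = T\otimes 1 + 1\otimes T, \qquad \varepsilon(T) = 0, \qquad \iota(T) = -T,
\]
% well defined since (T\otimes 1 + 1\otimes T)^{p^n} = T^{p^n}\otimes 1 + 1\otimes T^{p^n} = 0 in characteristic p.
```

On points, these say that \mu_n(R)=\{r\in R: r^n=1\} under multiplication and \alpha_{p^n}(R)=\{r\in R: r^{p^n}=0\} under addition.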

Remark: For those unfamiliar with the fact that \pi_1(\mathrm{Spec}(R))=G_k, this identification comes from an equivalence of categories \mathsf{Fet}/R\to \mathsf{Fet}/k (the categories of finite étale covers) given by sending X/R to X_k/k. An instructive, well-known example to keep in mind is the case when R=\mathbb{Z}_p. Then, this says that the finite unramified extensions of \mathbb{Q}_p correspond to the finite extensions of \mathbb{F}_p.

One can write entire books/hold entire courses on the theory of finite flat group schemes—in fact, they have (cf. the relevant article in Cornell-Silverman’s Arithmetic Geometry)! So, we state here only the facts about finite flat group schemes that are truly salient to our goals.

We begin with the following celebrated result of Deligne:

Theorem 1(Deligne): Let G/S be a finite flat commutative group scheme of order n. Then, n kills G. Said differently, the map [n]:G\to G factors through the unit section e:S\to G.

This theorem is mostly ‘tricky’ algebraic manipulation. That said, it must be somewhat deep since the non-commutative analogue is still, as far as I know, open! See Theorem 6 of this set of notes for a proof.

This allows us to prove the theorem of principal philosophical interest to us:

Theorem 2: Let G/S be a finite flat group scheme of order n. If n is invertible in S, then G/S is étale.

Proof: Since G\to S is already assumed flat, we need only show that \Omega^1_{G/S}=0. Consider the factorization of the map [n] as

G\xrightarrow{f}S\xrightarrow{e} G

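Then, we know that the induced map

[n]^\ast\Omega^1_{G/S}\to\Omega^1_{G/S}

factors as

[n]^\ast\Omega^1_{G/S}=f^\ast e^\ast\Omega^1_{G/S}\to f^\ast\Omega^1_{S/S}\to\Omega^1_{G/S}

and thus is the zero map, since \Omega^1_{S/S}=0. That said, note that for each geometric point \overline{s} of S the map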

(\Omega^1_{G/S})_{\overline{s}}\to (\Omega^1_{G/S})_{\overline{s}}

is just multiplication by n which is an isomorphism. This implies that \Omega^1_{G/S}=0 as desired. \blacksquare
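It can be instructive to test Theorem 2 directly on \mu_n over a field k, where the differentials are easy to compute by hand. This is only a sanity check in one example, with A denoting the coordinate ring (our notation).

```latex
% Sketch: A = k[T]/(T^n - 1), so \Omega^1_{A/k} is generated by dT modulo d(T^n - 1).
\[
\Omega^1_{A/k} = A\,dT \,/\, (n\,T^{n-1}\,dT).
\]
% If n is invertible in k, then n T^{n-1} is a unit of A (T is a unit since T\cdot T^{n-1} = 1), so
\[
\Omega^1_{A/k} = 0 \quad (\mu_n \text{ is étale}).
\]
% If instead k has characteristic p and n = p, then n T^{n-1} = 0 in A, so
\[
\Omega^1_{A/k} = A\,dT \neq 0 \quad (\mu_p \text{ is not étale}).
\]
```

The second case shows that the hypothesis that n be invertible cannot simply be dropped.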

Cartier duality

In this section we discuss Cartier duality which is a duality theory for finite flat group schemes (i.e. an involutive endofunctor). Let us begin by giving the abstract definition of such an object, and then explain what it looks like in the cases of most interest to us.

So, let G/S be a finite flat group scheme. We can then create a sheaf of abelian groups on S_{\mathrm{fppf}} (the big fppf site of S), denoted G^\vee, as follows:

G^\vee(T):=\mathrm{Hom}_{\mathsf{GrpSch}/T}(G_T,\mathbf{G}_{m,T})

(i.e. G^\vee is the sheaf hom \mathcal{H}om(G,\mathbf{G}_m) in the category of abelian fppf sheaves) where \mathsf{GrpSch}/T is the category of group schemes over T. One can show that, in fact, this functor is representable.

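Indeed, since G\to S is finite, thus affine, we know that G=\underline{\mathrm{Spec}}(\mathcal{A}) for some quasi-coherent \mathcal{O}_S-algebra \mathcal{A}. Consider then the algebra

\mathcal{A}^\vee:=\mathcal{H}om_{\mathcal{O}_S}(\mathcal{A},\mathcal{O}_S)

Now, note that, in general, the dual of an algebra isn’t an algebra, but note that the multiplication operator on \mathcal{A} is a map

\mu:\mathcal{A}\otimes\mathcal{A}\to\mathcal{A}

and the co-multiplication, coming from the group structure of G, is a map

\Delta:\mathcal{A}\to\mathcal{A}\otimes\mathcal{A}

Dualizing these (and using that \mathcal{A} is locally free of finite rank to identify (\mathcal{A}\otimes\mathcal{A})^\vee with \mathcal{A}^\vee\otimes\mathcal{A}^\vee) gives

\mu^\vee:\mathcal{A}^\vee\to \mathcal{A}^\vee\otimes\mathcal{A}^\vee

and

\Delta^\vee:\mathcal{A}^\vee\otimes\mathcal{A}^\vee\to\mathcal{A}^\vee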

Moreover, one can check that these two operations give \mathcal{A}^\vee the structure of a Hopf algebra, and so \underline{\mathrm{Spec}}(\mathcal{A}^\vee) the structure of a group scheme. One can then show that G^\vee is represented by \underline{\mathrm{Spec}}(\mathcal{A}^\vee).

Now, note that since the push-forward of the structure sheaf of G^\vee is \mathcal{A}^\vee, the group scheme G^\vee is locally free of the same order as G. Indeed, if \mathcal{A}\mid_U\cong\mathcal{O}_U^n then, of course, \mathcal{A}^\vee\mid_U\cong\left(\mathcal{O}_U^n\right)^\vee\cong\mathcal{O}_U^n (as \mathcal{O}_U-modules).

We call this finite flat group scheme G^\vee the Cartier dual of G. It’s clear from the presentation of G^\vee as the dual of the Hopf algebra \mathcal{A} that (G^\vee)^\vee=G, and thus Cartier duality is, in fact, a duality theory.

Let us now note that in the case when S=\mathrm{Spec}(k), for k a field, then this definition takes on a slightly nicer form. Indeed, G=\mathrm{Spec}(A) for some k-algebra A, and then G^\vee is just \mathrm{Spec}(A^\vee) where A^\vee is the dual k-space with the operations endowed by the duals of the multiplication and co-multiplication operators.

Now, let us give some examples of Cartier duality.

  • Let A,B/S be abelian schemes over S a scheme. Then, one can show that for any isogeny

    f:A\to B

    with dual isogeny

    f^\vee: B^\vee\to A^\vee

    that \ker f is a finite flat group scheme (similar to/follows from the case of f=[N]) with Cartier dual \ker f^\vee. Or, nicely displayed

    (\ker f)^\vee=\ker f^\vee

    This is usually shown by the Weil pairing, but there is a nice approach due to Serre. Namely, Serre shows that A^\vee represents the sheaf \mathcal{E}xt^1(A,\mathbf{G}_m) on the fppf site of S given by sending T to \mathrm{Ext}^1_{\mathsf{GrpSch}/T}(A_T,\mathbf{G}_{m,T}). Thus, from the exact sequence

    0\to \ker f\to A\to B\to 0

    we obtain the long exact sequence of groups for each S-scheme T:

    0\to \mathrm{Hom}(B_T,\mathbf{G}_{m,T})\to \mathrm{Hom}(A_T,\mathbf{G}_{m,T})\to \mathrm{Hom}((\ker f)_T,\mathbf{G}_{m,T})\to \mathrm{Ext}^1(B_T,\mathbf{G}_{m,T})\to\mathrm{Ext}^1(A_T,\mathbf{G}_{m,T})

    But, note that

    \mathrm{Hom}(A_T,\mathbf{G}_{m,T})=\mathrm{Hom}(B_T,\mathbf{G}_{m,T})=0

    since A,B are proper with geometrically connected fibers and \mathbf{G}_m is affine (and the maps must respect the identity). Thus we see that

    \mathrm{Hom}((\ker f)_T,\mathbf{G}_{m,T})=\ker\left(\mathrm{Ext}^1(B_T,\mathbf{G}_{m,T})\to\mathrm{Ext}^1(A_T,\mathbf{G}_{m,T})\right)=\ker\left(f^\vee:B^\vee(T)\to A^\vee(T)\right)

    but this shows that

    (\ker f)^\vee(T)=\mathrm{Hom}((\ker f)_T, \mathbf{G}_{m,T})=\ker(f^\vee(T))

    which gives the desired result.

    In particular, since [N]^\vee=[N] we see that A[N]^\vee=A^\vee[N]. So, if A is self-dual, then A[N] is Cartier self-dual.

    This self-duality happens, for example, if A is a Jacobian and so, in particular, for an elliptic scheme.

  • One can show by explicit computation that \mu_n^\vee is \underline{\mathbb{Z}/n\mathbb{Z}} and so, by duality, \underline{\mathbb{Z}/n\mathbb{Z}}^\vee=\mu_n.
  • From this, one can see that the finite flat group schemes of multiplicative type are precisely the dual schemes to the ones which are étale.
  • One can show that \alpha_{p} is self-dual. Be careful—the group schemes \alpha_{p^n} for n>1 are not self-dual.
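The ‘explicit computation’ for \mu_n can be sketched as follows, over an affine base T=\mathrm{Spec}(R); the coefficients a_i are notation introduced here. A homomorphism \mu_{n,R}\to\mathbf{G}_{m,R} is the same as a unit u of R[X]/(X^n-1) with \Delta(u)=u\otimes u, where \Delta(X)=X\otimes X.

```latex
% Write u = \sum_{i=0}^{n-1} a_i X^i with a_i \in R (assumed notation).
\[
\Delta(u) = \sum_i a_i\, X^i\otimes X^i, \qquad u\otimes u = \sum_{i,j} a_i a_j\, X^i\otimes X^j ,
\]
% Comparing coefficients, and using that u is a unit (so \varepsilon(u) = \sum_i a_i = 1):
\[
a_i^2 = a_i, \qquad a_i a_j = 0 \ (i\neq j), \qquad \sum_i a_i = 1 .
\]
```

Such a system of orthogonal idempotents is exactly a decomposition of \mathrm{Spec}(R) into open and closed pieces labelled by i\in\mathbb{Z}/n\mathbb{Z}, i.e. a point of \underline{\mathbb{Z}/n\mathbb{Z}}(R). When \mathrm{Spec}(R) is connected this just says u=X^i for a unique i.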

Of course, the Cartier dual is a (contravariantly) functorial construction, and so to any map of finite flat group schemes f:G\to H we obtain a dual map f^\vee:H^\vee\to G^\vee which comes from dualizing the map \mathcal{A}_H\to\mathcal{A}_G or, alternatively, from the obvious map on their functors of points.

The connected-étale sequence

We have already defined what it means for a finite flat group scheme G/k to be étale (indeed, we defined it over a general base); here and below, k is a perfect field. We call G connected if its underlying scheme is connected. Because it’s so common, we’d like to explain an alternative pair of words that takes the place of étale/connected. These are the terms reduced/local.

To begin, let us note that G/k is étale if and only if it’s reduced. Why? Since G\to\mathrm{Spec}(k) is of relative dimension 0, it is étale if and only if it’s smooth. But, since it’s already flat, it’s smooth if and only if G_{\overline{k}} is regular. We claim this is the case if and only if G is reduced. Indeed, since k is perfect, we know that G is reduced if and only if G_{\overline{k}} is reduced. But, then if G_{\overline{k}} is reduced, then there exists some non-empty open U\subseteq G_{\overline{k}} which is regular. So, let p\in U(\overline{k}). Then, for any other closed point q we have an automorphism of G_{\overline{k}} taking p to q (translation!) and so q must also be regular. Thus, G_{\overline{k}} is regular at all closed points, and thus regular as desired.

Now, let’s see why the phrase local makes sense as a replacement for connected. More specifically, if G=\mathrm{Spec}(A), then we claim that G is connected if and only if A is local. Indeed, since A is a finite k-module, we know it’s Artinian. So, it’s a product of local Artinian rings, and the result follows.

Now, in some sense every finite flat group scheme G/k is built from an étale part and a connected part. This is codified in the following theorem:

Theorem 3: Let G/k be a finite flat group scheme. Then, if G^\circ denotes the connected component of the identity of G, the quotient G/G^\circ exists and is étale.

We denote this quotient G^\mathrm{\acute{e}t} and thus we have the following connected-étale sequence:

0\to G^\circ\to G\to G^{\mathrm{\acute{e}t}}\to 0

In fact, this sequence exists regardless of whether k is perfect or not. But, if k is perfect, then this sequence actually splits!
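A simple example of the connected-étale sequence, and of its splitting, is given by roots of unity; the decomposition n=p^a m below is notation for this illustration.

```latex
% Assumed setup: k a field of characteristic p, and n = p^a m with p \nmid m.
\[
\mu_n \cong \mu_{p^a}\times\mu_m, \qquad \mu_n^\circ = \mu_{p^a}, \qquad \mu_n^{\mathrm{\acute{e}t}} = \mu_m ,
\]
% \mu_{p^a} is connected because T^{p^a} - 1 = (T-1)^{p^a}, so its coordinate ring is local;
% \mu_m is étale because m is invertible in k (Theorem 2).
```

Here the sequence 0\to\mu_{p^a}\to\mu_n\to\mu_m\to 0 visibly splits.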

One gets quite a lot of intuitive mileage out of thinking of this sequence as the spiritual analogue of the sequence

0\to G^0\to G\to \pi_0(G)\to 0

where \pi_0(G) is the component group of a topological group G. In fact, this is not just intuitive, and can be made perfectly precise, but we don’t pursue this line of thinking here (see Proposition 5.48 b) of Milne’s notes on algebraic groups).

Remark: It is not very difficult to explicitly describe this étale quotient. Namely, one can show that if A is the coordinate Hopf algebra of G, then the compositum of any two étale subalgebras is an étale subalgebra. This allows one to take the maximal étale subalgebra, and then G^{\mathrm{\acute{e}t}} is just the corresponding quotient (in the category of finite flat group schemes).

Similarly, it’s not very difficult to describe the section of G\to G^\mathrm{\acute{e}t} when k is a perfect field. Namely, one can show that if G=\mathrm{Spec}(A), then A has a maximal reduced quotient A^\mathrm{red}. We then obtain a closed embedding G^\mathrm{red}\hookrightarrow G which one can show is the desired section.

This splitting allows us to perform some interesting calculations.

  • We claim that if E/\overline{\mathbb{F}_p} is an ordinary elliptic curve, then E[p^n]\cong \underline{\mathbb{Z}/p^n\mathbb{Z}}\times \mu_{p^n}. To see this, we first consider the case when n=1. Begin by noting that, by definition, E[p](\overline{\mathbb{F}_p}) is non-trivial. This implies that in the decomposition

    E[p]=E[p]^\circ\times E[p]^\mathrm{\acute{e}t}

    we have E[p]^\mathrm{\acute{e}t}\ne 0. Thus, since all finite étale group schemes over algebraically closed fields are constant, we have that

    E[p]=\underline{A}\times E[p]^\circ

    But, note that E[p] is also not entirely étale. Indeed, else this would imply that the multiplication by p map [p] is étale which is false: it induces the zero map on the tangent space. Thus, E[p]^\circ is also non-zero.

    But, since the order of E[p] is p^2 we may then conclude that each of E[p]^\circ and E[p]^\mathrm{\acute{e}t} has order p. In particular, A must be \mathbb{Z}/p\mathbb{Z}. And, since E[p] is Cartier self-dual (since E is self-dual) we see that

    \underline{\mathbb{Z}/p\mathbb{Z}}\times E[p]^\circ=\left(E[p]^\circ\right)^\vee\times \mu_p

    Then, comparing factors, we may conclude that E[p]^\circ=\mu_p.

    Now, for general n, we proceed by induction. We’ve proven the claim for n=1. Suppose the result is true for n, and consider n+1. By the exact same argument from before, we know that E[p^{n+1}] will decompose as \underline{A}\times (\underline{A})^\vee where A is a finite abelian group of p-power order. But, since E[p^{n+1}]^\mathrm{\acute{e}t}[p^i] is just E[p^i]^\mathrm{\acute{e}t}=\underline{\mathbb{Z}/p^i\mathbb{Z}} for 1\leqslant i\leqslant n, we may conclude that A=\mathbb{Z}/p^{n+1}\mathbb{Z} from where the conclusion follows.

  • Using similar techniques one can show that if E/\overline{\mathbb{F}_p} is a supersingular elliptic curve, then E[p] is an extension of \alpha_p by itself. Indeed, let E be supersingular. Note that if F:=\text{Fr}_{E/\overline{\mathbb{F}_p}} denotes the relative Frobenius map E\to E^{(p)} then \ker F is an order p subgroup scheme of E and so must be contained in E[p]. It is neither \underline{\mathbb{Z}/p\mathbb{Z}} nor \mu_p since the former would imply E[p] contains non-trivial geometric points and so would the latter (by self-duality). Since there are only 3 group schemes over \overline{\mathbb{F}_p} of order p (most easily seen by Dieudonné theory) we conclude that \ker F=\alpha_p.
    Now, consider Q:=E[p]/\ker F. Then, Q is also order p and by the same argument as above must be \alpha_p. Thus, this shows that E[p] is an extension of \alpha_p by itself. It turns out (again most easily seen using Dieudonné theory) that there are only 3 extensions of \alpha_p by itself: \alpha_p^2, \alpha_{p^2}, and a third which can be described in terms of Witt schemes. It’s evident that E[p] is not either of the first two. The first has a tangent space which is 2-dimensional, whereas T_e E[p]\subseteq T_e E and T_e E is 1-dimensional. The second is not self-dual. Thus, E[p] must be this last group scheme (denoted W^2_2 in Witt scheme notation).

In fact, and this will be important later, one can show that if R is any complete local Noetherian ring, then any finite flat group scheme G/R fits into a connected-étale sequence as above. The meaning of étale here is clear (it’s the usual meaning) and the same goes for connected, but it has slightly strange behavior here. Namely, for any G/R a finite flat group scheme we denote by G^\circ the connected component of G containing the image of the identity section e:\text{Spec}(R)\to G (note that \text{Spec}(R) is connected). Then, of course, G is connected if G=G^\circ. This is all standard, but the way that G^\circ interacts with base change is a little vexing. See, for example, this nice post of Brian Conrad.

Frobenius and Verschiebung

Now, for reasons that we shall see below, we will mostly be interested in finite flat group schemes over fields of positive characteristic. There we have the advantage of some canonically defined morphisms which we shall leverage in our quest to understand finite flat (or p-divisible) groups.

Let us recall that if S is a scheme such that p\mathcal{O}_S=0 (i.e. it’s a scheme over \mathbb{F}_p) then we have the canonically defined map F_S:S\to S which is the identity on the underlying space of S and which on sections sends a\mapsto a^p (i.e. the map on an affine open \mathrm{Spec}(A)\to\mathrm{Spec}(A) is induced by the usual Frobenius map A\to A). This is called the absolute Frobenius of S.

We can then form the so-called Frobenius twist X^{(p)} of an S-scheme X as described by the following Cartesian diagram:

\begin{matrix}X^{(p)} & \to & X\\ \downarrow & & \downarrow\\ S & \underset{F_S}{\to} & S\end{matrix}

Intuitively, if we imagine X as being cut out by equations f_i=0 over S then X^{(p)} is cut out by the equations f_i^{(p)}=0 where f_i^{(p)} is the result of raising the coefficients of f_i to the p^\text{th} power.

Finally, we define the relative Frobenius map F=F_{X/S}:X\to X^{(p)} to be the map induced, via the universal property of the fiber product X^{(p)}, by the pair of arrows X\to S (the structure map) and F_X:X\to X (the absolute Frobenius map on X). Intuitively, on the level of points, F sends a tuple (x_i) to (x_i^p). This makes sense since if (x_i) satisfies the equations \{f_i=0\} then the tuple (x_i^p) should satisfy \{f_i^{(p)}=0\}.
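
As a quick illustration of the last two paragraphs, consider the following sketch. It assumes S=\mathrm{Spec}(k) for a field k of characteristic p and takes X to be an affine plane curve cut out by a single equation; the coefficients a, b are introduced here purely for illustration:

```latex
\[
\begin{aligned}
X &: \quad y^2 = x^3 + ax + b, \qquad a,b\in k,\\
X^{(p)} &: \quad y^2 = x^3 + a^p x + b^p,\\
F_{X/S} &: \quad (x_0,y_0)\mapsto (x_0^p,\,y_0^p).
\end{aligned}
\]
% F really lands in X^{(p)}: raise y_0^2 = x_0^3 + a x_0 + b to the p-th power
% and use (u+v)^p = u^p + v^p in characteristic p.
\[
(y_0^p)^2 = (x_0^p)^3 + a^p\, x_0^p + b^p .
\]
```

In particular, if a and b lie in \mathbb{F}_p then X^{(p)}=X, which is the phenomenon appearing in several of the examples below.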

Now, we are interested in the case of group schemes, and so we’d hope that this theory plays nicely with them. Indeed, notice that if G/S is a group scheme, then G^{(p)}/S, being the pull-back of G/S, is also a group scheme. Moreover, if G/S is finite flat then the same is true of G^{(p)}. Moreover, note that the map F:G\to G^{(p)} is actually a map of group schemes. This can be checked explicitly using the points description of the relevant objects.

Let’s do some examples to see what this object looks like in some of the cases we’ve discussed:

  • Let’s first consider the case G=\underline{A}—a constant group scheme. Note then that as a scheme G is just \displaystyle \bigsqcup_{\alpha\in A}S and the structure map is just the map coming from the identity maps S\to S. Thus, one can pretty easily see that G^{(p)}=G and that F:G\to G^{(p)} is just the identity map.
  • Let’s consider a more interesting example. Namely, let’s take a look at G=\mu_{p^n}. Again, since the equations defining \mu_{p^n} are unaffected by being raised to the p^\text{th} power (the coefficients are just 1!) abstractly G^{(p)}=\mu_{p^n}. But, this time, one can just check by hand that F:\mu_{p^n}\to \mu_{p^n} is the multiplication by p map [p]. In particular, \ker F=\mu_p.
  • For G=\alpha_{p^n}, once again, the defining equations have coefficients in \mathbb{F}_p and so are unaffected by the Frobenius. Thus, abstractly, G^{(p)}=\alpha_{p^n} but, again, one can check that F:\alpha_{p^n}\to \alpha_{p^n} is given on points by x\mapsto x^p. In particular, F is the zero map when n=1 and, in general, \ker F=\alpha_p.
  • Let us now consider what happens with the relative Frobenius for an abelian variety A/\mathbb{F}_q, where q is a power of p. Note that A^{(p)}, still being proper and smooth over \mathbb{F}_q, is still an abelian variety. In general A and A^{(p)} will not be isomorphic over \mathbb{F}_q but they will always be isogenous. Why? Because F:A\to A^{(p)} is an isogeny. Indeed, it suffices to show that F has finite kernel. But, since A/\mathbb{F}_q is smooth of relative dimension n:=\dim(A) over \mathbb{F}_q one knows from general theory that F is finite of degree p^n. Thus, we see that A is ordinary/supersingular if and only if A^{(p)} is. In fact, they have the same p-rank (cf. Mumford’s book).
  • Finally, let’s consider the elliptic curve E over \overline{\mathbb{F}_{11}} given by y^2=x^3+x+\zeta where \zeta is a primitive (11^2-1)^\text{th} root of unity. Then, with p=11, E^{(p)} is given by y^2=x^3+x+\zeta^p. Now, one can check that these elliptic curves are not isomorphic, even geometrically (their j-invariants are 2\zeta+8 and 9\zeta+5 respectively). A quick check (either by hand or using Sage) shows that they are both, indeed, ordinary. Of course since F is not an isomorphism it must have some kernel. By the observation in the previous bullet we know that \ker F\subseteq E[p] and, since \ker F has no non-trivial geometric points, we may conclude that \ker F=E[p]^\circ. This is indicative of the general case.
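
The j-invariants quoted in the last bullet can be checked by hand for the curve y^2=x^3+x+\zeta. The following is only a sketch, under the assumption (not stated in the text) that \zeta is a root of x^2+7x+2 over \mathbb{F}_{11}, so that \zeta^2=4\zeta+9 and \zeta^{11}=4-\zeta. For a different choice of \zeta the explicit values would change, although the two j-invariants would remain Frobenius conjugates of one another:

```latex
% Short Weierstrass form y^2 = x^3 + a x + b with a = 1, b = \zeta, over \mathbb{F}_{121}.
\[
\begin{aligned}
j(E) &= 1728\cdot\frac{4a^3}{4a^3+27b^2} = \frac{4}{4+5\zeta^2}
  && (1728\equiv 1,\ 27\equiv 5 \bmod 11)\\
&= \frac{4}{9\zeta+5}
  && (5\zeta^2 = 20\zeta+45 \equiv 9\zeta+1)\\
&= 4(6\zeta+2) = 2\zeta+8
  && ((9\zeta+5)(6\zeta+2)\equiv 1)\\
j(E^{(p)}) &= j(E)^{11} = 2\zeta^{11}+8 = 2(4-\zeta)+8 = 9\zeta+5 .
\end{aligned}
\]
```

Since 2\zeta+8\ne 9\zeta+5, the curves E and E^{(p)} are not isomorphic over \overline{\mathbb{F}_{11}}.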

Now, one thing we saw in the above examples is that some of F’s finer properties seem to depend on whether G is étale or not. This is not a coincidence. Indeed, note that \ker F is connected since \ker F(\overline{\mathbb{F}_q}) is trivial. Thus, certainly if G is étale we expect F to have no kernel. We can be much more explicit though:

Theorem 4: Let S be a perfect scheme and X a flat S-scheme. Then, X/S is étale if and only if F is an isomorphism.

Proof: Suppose first that X/S is étale. Then, X/S is smooth of relative dimension 0. Then, it follows from Theorem 3 of this post that F is an isomorphism. Conversely, suppose that F is an isomorphism. Then, the induced map F^\ast\Omega^1_{X^{(p)}/S}\to \Omega^1_{X/S} is an isomorphism. But, this is the zero map (see the previous post), and thus \Omega^1_{X/S}=0. Hence X/S is unramified and, being flat, is étale. \blacksquare

Remark: With a little more work one can remove the flatness condition on X/S, but since we’ll be dealing primarily with flat schemes this is not important.

Now, the Frobenius map will not be the only important map to us. There is a map G^{(p)}\to G, for G/S a finite flat group scheme, of equal importance which is, in some sense, ‘dual’ to Frobenius. We can go about creating it as follows.

Consider first the Frobenius map F_{G^\vee/S}:G^\vee\to (G^\vee)^{(p)} on the Cartier dual of G. Consider then the Cartier dual of this map which will be a map

\left(\left(G^\vee\right)^{(p)}\right)^\vee\to G

Now, one can easily check that Cartier duals and Frobenius twists commute, and so this first term is just

\left(\left(G^\vee\right)^{(p)}\right)^\vee=\left(\left(G^\vee\right)^\vee\right)^{(p)}=G^{(p)}

and thus we have created a map G^{(p)}\to G which is called the Verschiebung map and is denoted V=V_{G/S}.

To verify some of the natural properties of V, especially with relation to F, it is useful to have a more down-to-earth description of V. To this end, let us note that if S=\mathrm{Spec}(A_0) and G=\mathrm{Spec}(A) then G^{(p)} is just the spectrum of the Hopf algebra A\otimes_{A_0}A_0 (where A_0 acts on itself by Frobenius). One can then verify that the Frobenius F:G\to G^{(p)} corresponds, on the level of coordinate rings, to the composition

A\otimes_{A_0}A_0\to \mathrm{Sym}^p(A)\to A

where \mathrm{Sym}^p(A) is the p^\text{th} symmetric power of A, the map A\otimes_{A_0}A_0\to\mathrm{Sym}^p(A) sends x\otimes a to the class of x(a\otimes \cdots\otimes a) (with p tensors), and the map \mathrm{Sym}^p(A)\to A is the one induced by the multiplication map A^{\otimes p}\to A.

One can check that the map A\otimes_{A_0}A_0\to \mathrm{Sym}^p(A) is A_0-linear, and thus we can take the above for the A_0-linear dual A^\vee of A, and then dualize it to obtain the Verschiebung for A. We then see that the Verschiebung comes from the composition

A\to (A^{\otimes p})^{S_p}\to A\otimes_{A_0}A_0

where the first map is the obvious one, and the second one is the unique algebra map taking x(a\otimes\cdots\otimes a) to x\otimes a.

Using this characterization it is not difficult to check the following:

Theorem 5: Let G/S be a finite flat group scheme. Then, V\circ F=[p]_G and F\circ V=[p]_{G^{(p)}}.

One can check this on affines and just check that the above composition (in the explicit form) gives the desired result.
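
It can be reassuring to see Theorem 5 in the examples from earlier. The table below is a sketch, assuming the base is a field of characteristic p, the standard identifications \mu_{p^n}^\vee=\underline{\mathbb{Z}/p^n\mathbb{Z}} and \alpha_p^\vee=\alpha_p, and the identification G^{(p)}=G valid for these three groups. Each V is computed as the Cartier dual of the Frobenius of the dual group:

```latex
\[
\begin{array}{c|c|c|c}
G & F_G & V_G=\left(F_{G^\vee}\right)^\vee & V_G\circ F_G\\ \hline
\underline{\mathbb{Z}/p^n\mathbb{Z}} & \mathrm{id} & \left([p]_{\mu_{p^n}}\right)^\vee=[p] & [p]\\
\mu_{p^n} & [p] & \left(\mathrm{id}_{\underline{\mathbb{Z}/p^n\mathbb{Z}}}\right)^\vee=\mathrm{id} & [p]\\
\alpha_p & 0 & \left(0_{\alpha_p}\right)^\vee=0 & 0=[p]
\end{array}
\]
```

The last row uses that \alpha_p is killed by p, so [p]=0 there and the identity V\circ F=[p] holds trivially.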

p-divisible Groups

Basic definitions and examples

There are, historically, two definitions of p-divisible groups. The first, given by Tate, goes something like the following. We say that an inductive system (G(n))_{n\in\mathbb{N}} of group schemes over S is a p-divisible group of height h if each G(n) is finite flat of order p^{nh}, and for all n\geqslant 1 the sequence

1\to G(n)\to G(n+1)\xrightarrow{p^n}G(n+1)

is exact. In other words, the map G(n)\to G(n+1) identifies G(n) with the p^n-torsion in G(n+1). We will often denote the p-divisible group (G(n)) as G, and denote the group scheme G(n) as G[p^n] for reasons to be made clear below.

The above is certainly very concrete, and is certainly what more naturally presents itself in practice, but it is slightly dissatisfying. Namely, the p-divisible group G is really not one object as the name suggests, but a collection of objects. For this reason, some might prefer the more modern definition of a p-divisible group due to Grothendieck. Namely, we call an abelian sheaf G on S_{\mathrm{fppf}} a p-divisible group of height h if

  1. G is p-divisible—here this means that the multiplication map [p]:G\to G is an epimorphism of abelian sheaves on the fppf site.
  2. G is p-power torsion—here this means that

    G=\varinjlim G[p^n]

    where G[p^n] is the subsheaf given on points as \ker([p^n]:G(T)\to G(T)).

  3. Finally, we require that G[p] is a finite flat group scheme of order p^h.

One can show that given a p-divisible group in the first sense that G(T)=\varinjlim G(n)(T) gives a p-divisible group in the second sense. Conversely, given a p-divisible group in the second sense one can show that (G[p^n]) is an inductive sequence of group schemes which gives a p-divisible group in the first sense.

One sometimes sees the phrase Barsotti-Tate group in lieu of p-divisible group, giving homage to the forefathers of the field. For this reason, we will often times denote the category of p-divisible groups (whose morphisms are just morphisms of inductive systems/fppf sheaves) by \mathrm{BT}_p(S).

Now, let us give some examples of p-divisible groups:

  • Let A/S be an abelian scheme of relative dimension g. Then, the obvious inductive system (A[p^n]) forms a p-divisible group of height 2g which is usually denoted A[p^\infty].
  • Consider the system G(n)=\underline{\mathbb{Z}/p^n\mathbb{Z}} with the obvious transition maps. This forms a p-divisible group of height 1. We will often times denote this p-divisible group as \mathbb{Q}_p/\mathbb{Z}_p since this is precisely what it is when thought about as an fppf sheaf.
  • Consider the system G(n)=\mu_{p^n} with the obvious transition maps. This gives a p-divisible group of height 1 which is usually denoted \mu_{p^\infty}.
  • What we discussed in the last section shows that for E/\overline{\mathbb{F}_p} an ordinary elliptic curve, we have that E[p^\infty]=\mu_{p^\infty}\times(\mathbb{Q}_p/\mathbb{Z}_p).
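
To unwind the phrase ‘obvious transition maps’ and the notation \mathbb{Q}_p/\mathbb{Z}_p in the second example, here is a short worked computation (a sketch on the level of the underlying abstract groups; the identification a\mapsto a/p^n is my choice of normalization), followed by the order count behind the height in the last example:

```latex
\[
\begin{aligned}
\mathbb{Z}/p^n\mathbb{Z} &\xrightarrow{\ \sim\ } \tfrac{1}{p^n}\mathbb{Z}/\mathbb{Z},
  \qquad a\mapsto \tfrac{a}{p^n},\\
\left(G(n)\to G(n+1),\ a\mapsto pa\right) &\longleftrightarrow
  \left(\tfrac{1}{p^n}\mathbb{Z}/\mathbb{Z}\subseteq \tfrac{1}{p^{n+1}}\mathbb{Z}/\mathbb{Z}\right),\\
\varinjlim_n \tfrac{1}{p^n}\mathbb{Z}/\mathbb{Z} &= \mathbb{Z}[1/p]/\mathbb{Z}\cong\mathbb{Q}_p/\mathbb{Z}_p .
\end{aligned}
\]
% Height bookkeeping for an ordinary elliptic curve E (so g = 1):
\[
\#E[p^n] = \#\mu_{p^n}\cdot\#\underline{\mathbb{Z}/p^n\mathbb{Z}} = p^{n}\cdot p^{n}=p^{2n},
\qquad\text{so the height is } 1+1=2=2g .
\]
```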

The usual way of making p-divisible groups (although not the constant one!) comes from the following theorem:

Theorem: Let G/S be a commutative algebraic group (group scheme of finite type) and suppose that [p]:G\to G is locally free of rank p^h. Then, G[p^\infty]:=(G[p^n]) is a p-divisible group of height h.

We shall call a p-divisible group G étale if G[p^n]/S is étale for all n. We define the notion of connectedness for G similarly. As an example of these concepts, we see that if (p,\mathrm{char}(k))=1 and A/k is an abelian variety, then A[p^\infty] is étale. Whereas \mu_{p^\infty}, thought of as a p-divisible group over \mathbb{F}_p, is connected.

The Tate module

Associated to any p-divisible group G there is a projective system of finite flat group schemes G[p^n] with transition maps [p]:G[p^{n+1}]\to G[p^n].

Now, let us assume that S is connected, and let’s choose a geometric point \overline{s} of S. Then, to the projective system of group schemes G[p^n] we can create the projective system of continuous \pi_1(S,\overline{s})-modules by taking the stalks of the étale sheaves G[p^n]. We can then take the inverse limit of this projective system of \pi_1(S,\overline{s})-modules to obtain a continuous \pi_1(S,\overline{s})-module denoted T(G). We call this the Tate module of G.

As an example of this, note that if A/k is an abelian variety, then T(A[p^\infty])=T_p(A), the usual Tate module. This also shows us that T(G) is not a particularly useful invariant in some situations. For example, as we mentioned before, if E/\mathbb{F}_p is a supersingular elliptic curve, then T(E[p^\infty])=0! That said, there is one situation in which the Tate module tells all:

Theorem 6: The association G\leadsto T(G) defines an equivalence

\left\{\begin{matrix}\acute{\mathrm{e}}\text{tale }p\text{-divisible}\\\text{groups }G\end{matrix}\right\}\longleftrightarrow\mathsf{Rep}_{\mathbb{Z}_p}(\pi_1(S,\overline{s}))

where here the right hand side denotes the category of continuous \pi_1(S,\overline{s}) representations on finite free \mathbb{Z}_p-modules—the height of G corresponds to the rank of T(G).

This follows almost immediately from the equivalence of categories between finite étale G/S and finite \pi_1(S,\overline{s})-modules (you can see a discussion of this in this post).
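
Two basic instances of this dictionary are worth writing out. The following is a sketch; the second line assumes that p is invertible on S, so that \mu_{p^\infty} is étale, and writes k(\overline{s}) for the (separably closed) field underlying the geometric point:

```latex
\[
\begin{aligned}
T(\mathbb{Q}_p/\mathbb{Z}_p) &= \varprojlim\left(\cdots\to\mathbb{Z}/p^{n+1}\mathbb{Z}\to\mathbb{Z}/p^n\mathbb{Z}\to\cdots\right)
  =\mathbb{Z}_p
  && \text{(trivial } \pi_1(S,\overline{s})\text{-action)}\\
T(\mu_{p^\infty}) &= \varprojlim_n\, \mu_{p^n}(k(\overline{s})) = \mathbb{Z}_p(1)
  && \text{(action through the cyclotomic character)}
\end{aligned}
\]
% In the first line the transition map [p]: G[p^{n+1}] -> G[p^n] becomes
% reduction mod p^n under the identification G[p^n] = Z/p^n Z.
```

Both are free of rank 1 over \mathbb{Z}_p, matching the fact that both p-divisible groups have height 1.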

Remark: For those that like to think in this language, we can nicely phrase the above in the context of lisse sheaves. Namely, what we’ve done is associated to any étale p-divisible group (G(n)) a lisse \mathbb{Z}_p-sheaf (G(n)) (with transition maps multiplication by p), and then T(G) is just the standard \pi_1(S,\overline{s})-representation associated to this lisse \mathbb{Z}_p-sheaf. So, perhaps more naturally, the above shows that étale p-divisible groups are equivalent to lisse \mathbb{Z}_p-sheaves, and then the equivalence of this latter category with \mathsf{Rep}_{\mathbb{Z}_p}(\pi_1(S,\overline{s})) is well-known.

From this, we deduce the following corollary:

Theorem 7: Let p be invertible on S. Then, G\leadsto T(G) defines an equivalence

\mathrm{BT}_p(S)\longleftrightarrow\mathsf{Rep}_{\mathbb{Z}_p}(\pi_1(S,\overline{s}))

This follows from the previous theorem and Theorem 2.

In particular, we see that if p differs from the characteristic of the field k, then p-divisible groups over k are essentially just Galois modules. Thus, the philosophical takeaway is that we created p-divisible groups to access something Galois representations were not rich enough to encapsulate. But, we are really only gaining more when p=\mathrm{char}(k) since, in any other case, p-divisible groups contain precisely the information of the natural Galois representation associated to them.

Adaptations from finite flat group schemes

Much of what we discussed above for finite flat group schemes carries through, essentially verbatim, for p-divisible groups. So, just for completeness, we list these adaptations here.

Cartier duality

Let’s suppose that we have a p-divisible group G. Then, just as in the section on Tate modules, we have the projective system [p]:G[p^{n+1}]\to G[p^n]. By taking the Cartier duals of these maps, we obtain an inductive system [p]^\vee:G[p^n]^\vee\to G[p^{n+1}]^\vee. One can check that this is, in fact, a p-divisible group. We denote it by G^\vee. Not shockingly, as per usual, (G^\vee)^\vee=G.

Let us list some examples:

  • As one might expect \mu_{p^\infty}^\vee=\mathbb{Q}_p/\mathbb{Z}_p and so by duality (\mathbb{Q}_p/\mathbb{Z}_p)^\vee=\mu_{p^\infty}.
  • If one considers the p-divisible group A[p^\infty] then (A[p^\infty])^\vee=A^\vee[p^\infty].
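
The first example can be checked directly on transition maps. Here is a sketch, assuming the standard description of \mu_{p^n}^\vee as the group of characters \chi_a:\zeta\mapsto\zeta^a of \mu_{p^n}, indexed by a\in\mathbb{Z}/p^n\mathbb{Z}:

```latex
% Dualize [p] : \mu_{p^{n+1}} \to \mu_{p^n}, \zeta \mapsto \zeta^p, by pulling back characters.
\[
\begin{aligned}
\left([p]^\vee\chi_a\right)(\zeta) &= \chi_a(\zeta^p)=\zeta^{pa}=\chi_{pa}(\zeta),
  \qquad \zeta\in\mu_{p^{n+1}},\\
[p]^\vee &: \underline{\mathbb{Z}/p^n\mathbb{Z}}\to\underline{\mathbb{Z}/p^{n+1}\mathbb{Z}},
  \qquad a\mapsto pa .
\end{aligned}
\]
```

These are exactly the transition maps of \mathbb{Q}_p/\mathbb{Z}_p, which is why \mu_{p^\infty}^\vee=\mathbb{Q}_p/\mathbb{Z}_p as p-divisible groups and not just term by term.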

Connected-étale sequence

Let’s suppose that G is a p-divisible group over the Henselian local ring R. Then, we can define the connected component of G, denoted G^\circ, to be the p-divisible group G^\circ[p^n]=G[p^n]^\circ (which one can check really does define a p-divisible group). Then, just as in the case of finite flat group schemes, we have a short exact sequence

0\to G^\circ\to G\to G^\mathrm{\acute{e}t}\to 0

where G^{\mathrm{\acute{e}t}} is étale.

And, just as before, if R=k is a perfect field, then this sequence actually splits.

Frobenius and Verschiebung

Here, again, we can adapt the notions from finite flat group schemes to p-divisible groups by just phrasing things in terms of constituents. Namely, given G=(G(n)) a p-divisible group over S (where p\mathcal{O}_S=0) we can form its Frobenius twist G^{(p)} to just be the p-divisible group with terms G(n)^{(p)}. Functoriality shows that we still actually have an inductive system on the G(n)^{(p)}, and a little thought shows that it is still a p-divisible group.

We then have the relative Frobenius map F:G\to G^{(p)}, which is a map of p-divisible groups, by defining it term-wise. Similarly, we can construct the Verschiebung V:G^{(p)}\to G. And, as expected, V\circ F=[p]_G and F\circ V=[p]_{G^{(p)}}.

The rigidity of p-divisible groups

Now, before we move on to formal groups, I feel like I’d be remiss to not justify my claim that p-divisible groups are more rigid (as alluded to in the motivation), and enjoy properties which finite flat group schemes don’t.

One of the main results of Tate’s original paper on p-divisible groups is the following:

Theorem (Tate): Let R be a complete Noetherian local domain with residue characteristic p, and K:=\mathrm{Frac}(R) of characteristic 0. Then, the functor

\mathrm{BT}_p(R)\to\mathrm{BT}_p(K):G\mapsto G_K

is fully faithful.

In particular, one can understand all of G (in particular its special fiber) by working in the much more comfortable category of p-divisible groups over the field K which, by Theorem 7, are just objects of \mathsf{Rep}_{\mathbb{Z}_p}(G_K).

There is a theorem which, while in the same vein, can’t hold a candle to the power of Tate’s theorem (although it’s extremely, extremely powerful in a different sense):

Theorem(Raynaud): Let K/\mathbb{Q}_p be a finite extension with e(K/\mathbb{Q}_p)<p-1. Then, the functor G\mapsto G_K from finite flat group schemes over \mathcal{O}_K to finite flat group schemes over K is fully faithful.

Here e(K/\mathbb{Q}_p) denotes the ramification index. Note that this condition is necessary as the following example shows. The finite flat group schemes \mu_p and \underline{\mathbb{Z}/p\mathbb{Z}} are isomorphic over K=\mathbb{Q}_p(\mu_p) (which has e=p-1!) but not as finite flat group schemes over \mathcal{O}_K (one is étale and the other is not, as can be seen by reducing to the special fiber).
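
The example can be made completely explicit. The following is a sketch, where \zeta_p denotes a fixed primitive p^\text{th} root of unity in K and \pi:=1-\zeta_p (notation introduced here for illustration):

```latex
\[
\begin{aligned}
\varphi_K &: \underline{\mathbb{Z}/p\mathbb{Z}}_K \xrightarrow{\ \sim\ } \mu_{p,K},
  \qquad a\mapsto \zeta_p^a
  && \text{(an isomorphism: } x^p-1=\textstyle\prod_{a}(x-\zeta_p^a) \text{ is separable over } K\text{)}\\
(\pi)^{p-1} &= (p) \ \text{ in } \mathcal{O}_K
  && \text{(so } e(K/\mathbb{Q}_p)=p-1\text{)}\\
\zeta_p^a &\equiv 1 \pmod{\pi}
  && \text{(so } a\mapsto\zeta_p^a \text{ reduces to the trivial map on the special fiber)}
\end{aligned}
\]
```

So the formula a\mapsto\zeta_p^a does define a map over \mathcal{O}_K, but it is not an isomorphism there, and the inverse of \varphi_K does not extend to \mathcal{O}_K. This is precisely the failure of full faithfulness.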

There is no such restriction on the ramification index in Tate’s theorem. For example, \mu_{p^\infty} and \mathbb{Q}_p/\mathbb{Z}_p are still not isomorphic over \mathbb{Q}_p(\mu_p). That said, one can easily construct a counterexample if one removes the Noetherian assumption in Tate’s theorem, as \mathbb{Z}_p[\mu_{p^\infty}] shows.

Formal groups

A lament

Before we begin with our discussion of formal groups, I’d like to make a complaint. It’s an unfortunate fact that formal geometry, and especially the theory of formal groups, is a very annoying subject with regards to literature. Namely, if one hands you ten different articles called “Introduction to Formal Groups”, you have likely been handed ten different definitions of the titular objects. Do you work with pseudocompact rings? Do you work with nilpotent rings? Do you work with complete adic rings? The list goes on.

For this reason, sometimes cross-checking a fact, or simultaneous reading, becomes a difficult endeavor. I have no solution to this issue. On one hand it’s understandable why someone would not want to introduce the entire theory of pseudocompact/pseudofinite rings before one talks about formal groups. That said, making ad hoc definitions using things such as nilpotent algebras or completions of schemes seems equally unsatisfactory.

Perhaps the ‘right’ answer is to deal with everything in the language of what seems to be the modern winner of ‘popular rigid geometry’ which is adic spaces. This makes anyone familiar with this theory very happy but everyone else woefully, woefully unsatisfied.

I have opted to take the perspective of complete adic rings. This is close enough to the adic geometry to be easily translatable, but simple enough to not need a tome unto itself.

Formal schemes

We first need to recall the general setup of formal geometry: the theory of formal schemes.


One aspect of the theory of schemes which can be consternating is their lack of stereotypically nice categorical properties. In particular, the category of schemes is not ‘cocomplete’—it does not have all filtered colimits. So, one is sometimes forced to work in the category of ‘ind-schemes’ (formal direct limits of schemes).

One particular case where this issue comes up is when one wants to define the (analytic) completion of an integral closed subscheme Y of a scheme X. Namely, one wants to define, as a subscheme of X, the ‘analytic closed subscheme’ \widehat{Y}\subseteq X which should be the closed subscheme equipped with all differential data—tangent vectors in all directions, and of all ‘degrees’ (cf. jets).

Now, if X=\text{Spec}(A) is affine and Y is a point x, say corresponding to the maximal ideal \mathfrak{m}, then we already sort of have a good idea of what this might technically mean. Namely, we define the cotangent space T_x X to be the k(x)-vector space \mathfrak{m}/\mathfrak{m}^2 which we can think about as being the ideal \mathfrak{m} in the ring A/\mathfrak{m}^2. Thus, we might imagine that the ‘thickening’

V(\mathfrak{m}^2)=\text{Spec}(A/\mathfrak{m}^2)

is a space capturing ‘linear order’ differential data. Thus, perhaps, V(\mathfrak{m}^3) (which contains V(\mathfrak{m}^2) as a closed subscheme) is capturing ‘quadratic order’ differential data. Thus, perhaps, the closed subscheme capturing all differential data is, somehow, \varinjlim V(\mathfrak{m}^n).

Remark: For some more discussion on this topic see, as mentioned above, this post.

More generally, we might imagine that the closed subscheme of X which is the ‘closed analytic neighborhood’ of Y in X might be \varinjlim V(\mathcal{I}^n) if \mathcal{I} is the ideal sheaf of \mathcal{O}_X cutting out the closed subscheme Y in X. Unfortunately, \varinjlim V(\mathcal{I}^n) might not exist as a scheme. Moreover, even when it does, it might not be what we’re really after.

Let us give an example. Consider the point (p)\in\text{Spec}(\mathbb{Z}). Then, we have that

V((p)^n)=\text{Spec}(\mathbb{Z}/p^n\mathbb{Z})

with transition maps being the natural quotients. Then, as it turns out, the colimit

\varinjlim V((p)^n)=\varinjlim\text{Spec}(\mathbb{Z}/p^n\mathbb{Z})

exists in the category of schemes. Why? Well, the result ‘should’ be affine, and since the global sections of such a direct limit ‘should’ be the colimit of the underlying rings, we are tempted to guess that the colimit is \text{Spec}(\mathbb{Z}_p). Indeed, this is obvious in the category of affine schemes since \text{Spec} is an anti-equivalence, and it holds true in the category of all schemes by the ‘coincidence’ that since \text{Spec}(\mathbb{Z}_p) is local (has only one closed point) we have that

\displaystyle \text{Hom}(\text{Spec}(\mathbb{Z}_p),X)=\bigcup_{x\in X}\text{Hom}(\text{Spec}(\mathbb{Z}_p),\text{Spec}(\mathcal{O}_{X,x}))

sending f in the left hand side to the obvious element in the right hand side indexed by x=f((p)).

But, this ‘closed analytic neighborhood’ of (p) should not be \text{Spec}(\mathbb{Z}_p). Why? Well, how should we picture this ‘closed analytic neighborhood’ of (p)? Well, it should capture no more points than (p) only enlarging the space V((p))=\text{Spec}(\mathbb{Z}/p\mathbb{Z}) by adding more functions—tangent spaces, and more generally jet spaces, cover no more points than the single point they’re based at. But, \text{Spec}(\mathbb{Z}_p) does not look like this ‘infinitely fuzzy point’ (the fuzz connoting the existence of extra functions), it has two points. In short, \mathbb{Z}_p is the ring of functions on this infinitely fuzzy point, but this fuzzy point is not the spectrum of this ring. Said differently, the ring \mathbb{Z}_p doesn’t ‘remember’ that  it came from completing \mathbb{Z} at (p), and so doesn’t capture the space of the infinitely fuzzy point.
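
The point count in the last paragraph is easy to make explicit (a small sketch using only the standard description of the prime ideals of these rings):

```latex
\[
\begin{aligned}
\mathrm{Spec}(\mathbb{Z}/p^n\mathbb{Z}) &= \{(p)\} \quad\text{for every } n\geqslant 1
  && \text{(one point, with more and more nilpotent functions)}\\
\mathrm{Spec}(\mathbb{Z}_p) &= \{(0),\,(p)\}
  && \text{(the extra generic point } (0) \text{ lies in no } V((p)^n)\text{)}
\end{aligned}
\]
```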

OK, so how can we make the ring \mathbb{Z}_p remember that it came from the completion of \mathbb{Z} at (p)? Well, completing \mathbb{Z} at (p) gives more than just a ring, it gives a topological ring. Thus, one is tempted to think that perhaps it’s not just \mathbb{Z}_p, but \mathbb{Z}_p with its topology, that accurately reflects this infinitely fuzzy point. More generally, if we have the ambient affine space X=\text{Spec}(A) and closed subscheme Y=\text{Spec}(A/I), one might imagine that the completion can be faithfully described by the topological ring

\widehat{A}:=\varprojlim A/I^n

given the I-adic topology.

One litmus test for this is the following. We should be able to canonically recover the closed subscheme \text{Spec}(A/I) from whatever data this ‘closed analytic neighborhood’ contains. For \mathbb{Z}_p this is easy: it’s a local ring and its residue field is the ring of functions on the original closed subscheme V((p)). But, what about \widehat{A}? How can we canonically recover A/I from this? Well, there is no canonically associated ideal to this ring (it may not be local!) so just the ring theoretic data does not seem enough to canonically recover A/I. But, the addition of the I-adic topology is enough. Namely, it can be checked that since we assumed that Y was integral (so it had no fuzz to begin with) A/I is \widehat{A}/\widehat{A}^{\circ\circ} where, borrowing from the language of adic spaces, \widehat{A}^{\circ\circ} is the ideal of topologically nilpotent elements of \widehat{A}. Thus, it seems the topology is enough.
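
Here is a non-local example of this recovery procedure. It is a sketch with A=\mathbb{Z}[x] and I=(p), so that Y=\text{Spec}(\mathbb{F}_p[x]) is integral; the notation \mathbb{Z}_p\langle x\rangle for restricted power series is borrowed from rigid geometry and is not used elsewhere in this post:

```latex
\[
\begin{aligned}
\widehat{A} &= \varprojlim_n \mathbb{Z}[x]/(p^n) = \mathbb{Z}_p\langle x\rangle
  = \Big\{\textstyle\sum_{i\geqslant 0} a_i x^i \ :\ a_i\in\mathbb{Z}_p,\ a_i\to 0\Big\},\\
\widehat{A}^{\circ\circ} &= p\,\widehat{A}
  \qquad\text{(} f^n\to 0 \iff f \bmod p \text{ is nilpotent in } \mathbb{F}_p[x] \iff f\in p\widehat{A}\text{)},\\
\widehat{A}/\widehat{A}^{\circ\circ} &= \mathbb{F}_p[x] = A/I .
\end{aligned}
\]
```

Note that \widehat{A} is not local here (for example (p,x) and (p,x-1) are distinct maximal ideals), so it really is the topology, and not the ring structure alone, that singles out the ideal p\widehat{A}.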

Anyways, back to the original issue. We wanted to define a ‘closed analytic neighborhood’ of Y\subseteq X an integral closed subscheme. We guessed that the answer should be

\widehat{Y}:=\varinjlim V(\mathcal{I}^n)

where \mathcal{I} is the ideal sheaf of \mathcal{O}_X cutting out Y. Now, we said that this definition isn’t going to work purely in the category of schemes because colimits don’t generally exist. So a natural thing to then do is try to embed schemes into a larger cocomplete category and take the direct limit there.

Two natural choices present themselves to us: the presheaf category

\widehat{\mathsf{Sch}} := \mathsf{Func}(\mathsf{Sch}^{\text{op}},\mathsf{Set})

and the category \mathsf{LRS} of locally ringed spaces. Both are cocomplete, and have their individual pros and cons. The former is the most natural categorical choice since \widehat{\mathsf{Sch}} is the free cocompletion of \mathsf{Sch} (see Qiaochu Yuan’s post here). The latter is nice because, well, I like spaces and sheaves.

Now, there is a right answer and when one chooses it we will see that the above choice between \widehat{\mathsf{Sch}} and \mathsf{LRS} is actually not that important. But, we do need to make a slight modification. Namely, \mathsf{LRS} does give the correct underlying locally ringed space of the ‘correct choice’ but it does not give the right morphisms between these objects.

The issue, as we intuited above, was that we need to be keeping track of the topologies that these completions naturally inherit. Thus, we don’t take the colimit \varinjlim V(\mathcal{I}^n) when we think of \mathsf{Sch} as a full subcategory of \mathsf{LRS} but when we think of it as a full subcategory of the category \mathsf{LTRS} of locally topologically ringed spaces. This has as objects pairs (X,\mathcal{O}_X) where X is a topological space and \mathcal{O}_X is a sheaf of topological rings. The morphisms are then the obvious ones (where we require the morphisms between topological rings of sections to be continuous). We then have the inclusions

\mathsf{Sch}\hookrightarrow\mathsf{LRS}\hookrightarrow\mathsf{LTRS}

where we consider the rings in \mathsf{LRS} as discrete rings.

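Thus, to get the ‘correct answer’ we take the colimit \varinjlim V(\mathcal{I}^n) in the category \mathsf{LTRS}. This has the added benefit of not being different than the answer one gets when one takes the colimit in \widehat{\mathsf{Sch}}. Namely, if (\mathcal{X},\mathcal{O}_{\mathcal{X}}) denotes the colimit in \mathsf{LTRS} then

\text{Hom}_{\mathsf{LTRS}}((Z,\mathcal{O}_Z),(\mathcal{X},\mathcal{O}_{\mathcal{X}}))=\varinjlim \text{Hom}_{\mathsf{Sch}}((Z,\mathcal{O}_Z),(V(\mathcal{I}^n),\mathcal{O}_{V(\mathcal{I}^n)}))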

for any scheme (Z,\mathcal{O}_Z). This says that we have the equality

\text{Hom}_{\mathsf{LTRS}}(-,(\mathcal{X},\mathcal{O}_{\mathcal{X}}))=\varinjlim \text{Hom}_{\mathsf{Sch}}(-,(V(\mathcal{I}^n),\mathcal{O}_{V(\mathcal{I}^n)}))

in the category \widehat{\mathsf{Sch}}.

We then define a formal scheme to be, essentially, an object of \mathsf{LTRS} which locally looks like \widehat{Y} for some setup as above.

Technical definitions

So, let us begin our formalization of the above ideas with, as they must, some generalities on topological rings. This makes sense, after all, since our ultimate ambient category is the category \mathsf{LTRS} of locally topologically ringed spaces.

So, let us call a topological ring R preadic if there exists an open ideal I of R such that \{I^n\} forms a neighborhood basis of the origin. We call such an ideal I an ideal of definition. We call R an adic ring if it is, in addition, complete. This corresponds to the statement that R\cong \varprojlim R/I^n where each R/I^n is given the discrete topology.

To an adic ring R we associate a locally topologically ringed space \mathrm{Spf}(R), called the formal spectrum of R, as follows. As a set we declare that:

\mathrm{Spf}(R):=\left\{\mathfrak{p}\in\text{Spec}(R):\mathfrak{p}\text{ is open}\right\}

We topologize \mathrm{Spf}(R) by having a basis of closed sets be those of the form V(J) where, for an ideal J\subseteq R, we let V(J) denote those open primes containing J. Finally, we put a structure sheaf on \mathrm{Spf}(R) as follows. For any element f\in R we define

\mathcal{O}_{\mathrm{Spf}(R)}(D(f)):=\varprojlim R_f/I^n R_f=\widehat{R_f}

where, of course, D(f)=\mathrm{Spf}(R)-V(f), I is an ideal of definition of R, and each R_f/I^nR_f is given the discrete topology.

One can then verify that (\mathrm{Spf}(R),\mathcal{O}_{\mathrm{Spf}(R)}) is a locally topologically ringed space. Of course, we shall denote this pair by just \mathrm{Spf}(R) with the structure sheaf understood.

We define an affine formal scheme to be a locally topologically ringed space isomorphic to \mathrm{Spf}(R) for some adic ring R. A formal scheme is then a locally topologically ringed space \mathfrak{X} which is locally isomorphic to an affine formal scheme. We then define the category \mathsf{FmlSch} to be the full subcategory of \mathsf{LTRS} consisting of formal schemes.

Now, the first thing that one might want to verify about formal schemes, and their relationship to affine formal schemes, is the following:

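Theorem 8: For any affine formal scheme \mathrm{Spf}(R) one has that \mathcal{O}_{\mathrm{Spf}(R)}(\mathrm{Spf}(R))=R as a topological ring. Moreover, for any locally topologically ringed space \mathfrak{X} (in particular for a formal scheme) the natural map

\text{Hom}_{\mathsf{LTRS}}(\mathfrak{X},\mathrm{Spf}(R))\to\text{Hom}_{\text{cont.}}(R,\mathcal{O}_{\mathfrak{X}}(\mathfrak{X}))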

is a bijection.

Thus, in particular, we see that for any adic rings A and B we have that

\text{Hom}_{\mathsf{FmlSch}}(\mathrm{Spf}(A),\mathrm{Spf}(B))=\text{Hom}_{\text{cont.}}(B,A)

functorially as we’d hope.

Let us note that any Noetherian discrete ring A is adic. We then claim that the functor

\mathrm{Spec}(A)\mapsto\mathrm{Spf}(A)

is fully faithful from the category of affine schemes to affine formal schemes. Indeed, this follows immediately from Theorem 8 since any ring map between discrete rings is continuous. This extends, in the obvious way, to give an embedding

\mathsf{Sch}\hookrightarrow\mathsf{FmlSch}

so that every scheme is naturally a formal scheme.

Before we proceed with giving some basic properties of the category \mathsf{FmlSch} let us give some natural examples of formal schemes to keep in mind:

  • Consider the topological ring \mathbb{Z}_p. This is obviously adic and so we can consider the formal scheme \mathrm{Spf}(\mathbb{Z}_p). Since \mathbb{Z}_p has only one open prime, the prime (p), we see that \mathrm{Spf}(\mathbb{Z}_p) consists of a single point. The value of the structure sheaf on this point is, of course, \mathbb{Z}_p.
  • Let k be a field of characteristic not 2. Consider the nodal cubic X=\mathrm{Spec}(k[x,y]/(y^2-x^3-x^2)) and the axes Y=\mathrm{Spec}(k[x,y]/(xy)). Now, intuitively, we picture that the singular point of the nodal cubic (the origin) ‘looks locally the same’ as the axes at the origin: they both locally look like ‘an X’. Of course, this does not happen in any Zariski neighborhood of the origin since the former is integral and the latter is not. That said, if we pass to the land of formal schemes, thereby zooming in at that point, we can make precise the fact that they are isomorphic. Namely, the completion of the local ring of the nodal cubic at the origin is k[[x,y]]/(y^2-x^3-x^2) and the completion of the local ring of the axes at the origin is k[[x,y]]/(xy). Then, one can show that these complete local rings are isomorphic (since 1+x has a square root in k[[x]], the polynomial y^2-x^2(1+x) factors as a product of two power series with independent linear terms). Thus, while X and Y are not isomorphic in any neighborhood of the origin, their ‘formal neighborhoods of the origin’ are: \mathrm{Spf}(\widehat{\mathcal{O}_{X,0}})\cong\mathrm{Spf}(\widehat{\mathcal{O}_{Y,0}}).
  • Consider the formal scheme \mathrm{Spf}(\mathbb{Z}[[T]]) where \mathbb{Z}[[T]] is given the (T)-adic topology. This formal scheme is thought of as being a formal geometry analogue of the open unit disk. Indeed, to formalize this, note that for any formal scheme \mathfrak{X} one has that

    \text{Hom}_{\mathsf{FmlSch}}(\mathfrak{X},\mathrm{Spf}(\mathbb{Z}[[T]]))=\mathcal{O}_{\mathfrak{X}}(\mathfrak{X})^{\circ\circ}

    where, for any topological ring A, we have that A^{\circ\circ} is the set of topologically nilpotent elements.

    One can justify why this should mean that \mathrm{Spf}(\mathbb{Z}[[T]]) should be the disk very nicely in the context of adic spaces. Suffice it to say that the connection is roughly the observation that if K is a non-archimedean valued field then K^{\circ\circ} is precisely the open unit disk of elements with norm less than 1.
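
To make the nodal cubic example above a bit more tangible, here is a small sketch (working over \mathbb{Q} as an illustrative choice of k, and only up to a fixed truncation order) that checks that u(x)=x\sqrt{1+x} is a power series with u(x)^2=x^2+x^3. This is what gives the factorization y^2-x^3-x^2=(y-u)(y+u), and the two factors play the role of the coordinates cutting out the two axes. The helper names below are my own and not from any library.

```python
from fractions import Fraction

def sqrt_one_plus_x(order):
    """Coefficients of sqrt(1+x) = sum_k binom(1/2, k) x^k, for k = 0..order."""
    coeffs = [Fraction(1)]
    for k in range(1, order + 1):
        # binom(1/2, k) = binom(1/2, k-1) * (1/2 - (k-1)) / k
        coeffs.append(coeffs[-1] * (Fraction(1, 2) - (k - 1)) / k)
    return coeffs

def mul_trunc(a, b, order):
    """Product of two power series (coefficient lists), dropping terms above x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= order:
                out[i + j] += ai * bj
    return out

ORDER = 12
s = sqrt_one_plus_x(ORDER)           # sqrt(1+x) mod x^(ORDER+1)
u = [Fraction(0)] + s[:ORDER]        # u(x) = x * sqrt(1+x) mod x^(ORDER+1)
u_squared = mul_trunc(u, u, ORDER)

# The expected answer x^2 + x^3, as a coefficient list of the same length.
target = [Fraction(0)] * (ORDER + 1)
target[2] = Fraction(1)
target[3] = Fraction(1)

print(u_squared == target)
```

The only place characteristic not 2 enters is in the denominators of the binomial coefficients \binom{1/2}{k}, which are powers of 2.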

Let us end this section by observing that the category of formal schemes has all fiber products. To prove this one works, as in the case of schemes, first locally. There one shows that if R, S, and T are adic rings then one has an equality

\mathrm{Spf}(S)\times_{\mathrm{Spf}(R)}\mathrm{Spf}(T)=\mathrm{Spf}(S\widehat{\otimes}_R T)

where here \widehat{\otimes} denotes the completed tensor product defined as follows:

S\widehat{\otimes}_R T:=\varprojlim((S/I^n)\otimes_R (T/J^m))

if I and J are ideals of definition of S and T.

The formal scheme associated to a closed subset

Let us now describe how to take the completion of a closed subscheme Y\subseteq X and formalize our claim earlier that formal schemes give a rigorous definition of the analytic neighborhood of a closed subscheme.

So, let X be a scheme and Y a closed subscheme of X with ideal sheaf \mathcal{I}. We then define the completion of X along Y to be the direct limit \varinjlim V(\mathcal{I}^n) in the category \mathsf{LTRS}. We denote it by \widehat{Y} or, when we want to emphasize that it comes from X, by \widehat{Y}_{/X}.

Our goal is to explain why \widehat{Y} is actually a formal scheme but, first, we need to prove some basic observations. In particular, we need to see that if A is an adic ring with ideal of definition I then, in fact, \mathrm{Spf}(A) is the direct limit \varinjlim V(I^n) in the category of locally topologically ringed spaces.

To begin we note that if \{(X_i,\mathcal{O}_{X_i})\} is any system of locally topologically ringed spaces then the colimit in \mathsf{LTRS} is (\varinjlim X_i,\varprojlim \mathcal{O}_{X_i}). Namely, the underlying space of the colimit is the colimit of the underlying spaces, and the structure sheaf is the projective limit of the sheaves.

So, let’s consider the system \{V(I^n)\}. Note that in this case each map V(I^n)\to V(I^{n+1}) is a homeomorphism and so, consequently, we can identify the underlying space of \varinjlim V(I^n) with just V(I). We then see that for any basic open D(f)\subseteq V(I) the value of the sheaf on \varinjlim V(I^n) is \varprojlim A_f/I^n A_f. Indeed, under the homeomorphisms V(I)\xrightarrow{\approx}V(I^n) we have that D(f) maps to D(f) and

\mathcal{O}_{V(I^n)}(D(f))=(A/I^n)_f=A_f/I^n A_f

and so the value on D(f) of the inverse limit sheaf is precisely \varprojlim A_f/I^n A_f.

Now, to see that the above locally topologically ringed space is isomorphic to \mathrm{Spf}(A) is fairly simple. Namely, note that a prime ideal \mathfrak{p}\subseteq A is open if and only if I\subseteq\mathfrak{p}. Indeed, since \{I^n\} is a system of neighborhoods of 0, we have that \mathfrak{p} is open if and only if I^n\subseteq\mathfrak{p} for some n, which holds if and only if I\subseteq\mathfrak{p} since \mathfrak{p} is prime. Thus, as a set \mathrm{Spf}(A)=V(I). But, moreover, it’s easy to see that the topologies also coincide. Finally, since the evaluations of the two sheaves on the basic opens D(f) of V(I) (one coming from the colimit, the other from \mathrm{Spf}(A)) give the same answer, we’re done.

So, let us go back to our claim that \widehat{Y} is a formal scheme. It’s enough to note that for an affine open U=\text{Spec}(A) of X such that \mathcal{I}\mid_U=\widetilde{I}, the subset U\cap Y\subseteq Y=\widehat{Y} (where the latter equality is only one of topological spaces) is open, and the restriction of the structure sheaf to it gives \varinjlim V(I^n), which is \mathrm{Spf}(\widehat{A}) where \widehat{A} is the I-adic completion of A. Thus, \widehat{Y} is locally an affine formal scheme and so, consequently, a formal scheme. In other words, the ‘analytic closed neighborhood’ \widehat{Y} of a closed subscheme Y\subseteq X is, in fact, a formal scheme as desired.

Let us note that this completion construction is actually functorial in the pair (X,Y) in the following sense. Let Y\subseteq X be a closed embedding and Y'\subseteq X' be another closed embedding. Suppose further that f:X\to X' is a morphism such that the closed embedding Y\hookrightarrow X factors through the closed embedding Y'\times_{X'}X\hookrightarrow X. Then, the morphism f induces a morphism \widehat{f}:\widehat{Y}\to \widehat{Y'}.

To see this, let us consider the affine case. Namely, let’s assume that X=\text{Spec}(A), Y=\text{Spec}(A/I) and that X'=\text{Spec}(A'), Y'=\text{Spec}(A'/I'). Let the morphism f:X\to X' correspond to the ring map f^\sharp:A'\to A. Now, Y'\times_{X'}X=\text{Spec}(A/I'A) and the assumption that Y\to X factors through this is the assumption that the ring map A\to A/I factors through A/I'A. This is equivalent to the assumption that I\supseteq I'A. Then, for each n we see that the map

f^\sharp:A'\to A/I^n

has kernel containing (I')^n and so gives rise to a ring map

A'/(I')^n\to A/I^n

and thus a map V(I^n)\to V((I')^n). Passing to the colimit gives a map \widehat{Y}\to\widehat{Y'} as claimed.

Reduced subscheme of a formal scheme

To any affine formal scheme \mathfrak{X}=\mathrm{Spf}(A) we have a naturally associated reduced scheme. Namely, the scheme \text{Spec}(A/A^{\circ\circ}) where, as usual, A^{\circ\circ} denotes the ideal of topologically nilpotent elements. One can compute A^{\circ\circ} as \sqrt{I} for any ideal of definition I of A.

We would like to generalize this association to any formal scheme \mathfrak{X}. Namely, we would like to associate in a functorial manner the ‘underlying’ reduced scheme \mathfrak{X}_\text{red} of a formal scheme \mathfrak{X}. The idea is much as above. Namely, let us define the ideal sheaf \mathcal{I}_\text{red} inside of \mathcal{O}_{\mathfrak{X}} by associating

\mathcal{I}_\text{red}(\mathrm{Spf}(A)):=A^{\circ\circ}

for any affine open formal subscheme \mathrm{Spf}(A)\subseteq\mathfrak{X}. Note that \mathcal{I}_\text{red} is actually a quasi-coherent ideal sheaf: in fact,


Let us then define \mathfrak{X}_\text{red} to be the locally ringed space (\mathfrak{X},\mathcal{O}_{\mathfrak{X}}/\mathcal{I}_{\text{red}}). Evidently then \mathfrak{X}_{\text{red}} is a scheme.

We claim that this construction is functorial. Namely, let’s suppose that f:\mathfrak{X}\to\mathfrak{Y} is a map of formal schemes. We want to claim that f gives rise to a map f_\text{red}:\mathfrak{X}_\text{red}\to\mathfrak{Y}_\text{red}. Indeed, it suffices to consider this on affine formal opens. Namely, suppose that \mathrm{Spf}(A)\subseteq\mathfrak{X} and \mathrm{Spf}(B)\subseteq\mathfrak{Y} and that f(\mathrm{Spf}(A))\subseteq\mathrm{Spf}(B). This then corresponds to a continuous ring map

B\to A

But, note that if x\in B^{\circ\circ} then \lim x^n=0 and so, by continuity of the ring map f,

0=f(0)=f(\lim x^n)=\lim f(x)^n

and thus f(B^{\circ\circ})\subseteq A^{\circ\circ}. We thus get an induced map

B/B^{\circ\circ}\to A/A^{\circ\circ}

and thus an induced map of schemes

\mathrm{Spec}(A/A^{\circ\circ})\to\mathrm{Spec}(B/B^{\circ\circ})

This construction obviously glues, giving us the desired map f_\text{red}:\mathfrak{X}_\text{red}\to\mathfrak{Y}_\text{red}.

As an example, if Y\subseteq X is an integral closed subscheme, then \widehat{Y}_\text{red}=Y. We then imagine that \mathfrak{X} is something like a ‘colimit of infinitesimal thickenings of \mathfrak{X}_\text{red}’.

To make this precise, let us define for all n\geqslant 0 the scheme

X_n:=(\mathfrak{X},\mathcal{O}_{\mathfrak{X}}/\mathcal{I}_\text{red}^{n+1})

It’s clear that X_0=\mathfrak{X}_\text{red} and that X_n\to X_m is a nilpotent thickening for m\geqslant n. Thus, we see that

\mathfrak{X}=\varinjlim X_n

in the category of locally topologically ringed spaces. Note also that for a map of formal schemes f:\mathfrak{X}\to\mathfrak{Y} the map

f^\sharp:\mathcal{O}_\mathfrak{Y}\to f_\ast\mathcal{O}_\mathfrak{X}

sends \mathcal{I}^\mathfrak{Y}_\text{red} into f_\ast\mathcal{I}^\mathfrak{X}_\text{red}, hence sends (\mathcal{I}^\mathfrak{Y}_\text{red})^n into f_\ast (\mathcal{I}^\mathfrak{X}_\text{red})^n, and thus we obtain maps X_n\to Y_n.

This allows one to think about the category of formal schemes as consisting of inductive sequences of nilpotent thickenings (X_n) (of the obvious type), with a morphism (X_n)\to (Y_n) being a morphism of inductive systems (f_n):(X_n)\to (Y_n).

Formal groups

Now that we have seen some basic theory of formal schemes we are well-equipped to discuss formal groups and their more specific counterparts of formal Lie groups and formal group laws.

So, let us dispose of the obvious. A formal group over \mathfrak{X}, where \mathfrak{X} is a formal scheme, is a group object in the category \mathsf{FmlSch}/\mathfrak{X} of formal schemes over \mathfrak{X} (since we’re only interested in this case, we take ‘formal group’ to mean ‘abelian formal group’). Thus, it’s a formal scheme \mathfrak{G} over \mathfrak{X} with multiplication map

m:\mathfrak{G}\times_{\mathfrak{X}}\mathfrak{G}\to\mathfrak{G}

inversion map

i:\mathfrak{G}\to\mathfrak{G}

and identity section

e:\mathfrak{X}\to\mathfrak{G}

which satisfies the ‘usual axioms’.

Now, just as in the case of affine group schemes, an affine formal group scheme has an alternative description in terms of a ‘formal Hopf algebra’. Namely, to give \mathrm{Spf}(A) the structure of an affine formal group over \mathrm{Spf}(R) is the same as giving, for example, the ‘comultiplication’ map

m^\ast:A\to A\widehat{\otimes}_R A

and similarly for the counit and antipodal map which take the place of the identity section and inverse map respectively.

Let us give a prime example of how to get a formal group. Namely, let’s assume that G/X is a separated group scheme (separatedness is automatic if X=\text{Spec}(k) and G/X is finite type). Then, we get a natural formal group \widehat{G}/X obtained by completing G along its identity section e:X\to G (which is a closed embedding since G/X is separated). The notation \widehat{G}, as opposed to \widehat{X}, is different from our above notation, but is more accurate in some sense: this is the formal local neighborhood of the identity of G. The verification that this is, in fact, a formal group is left to the reader.

Let us assume for a second that X=\text{Spec}(R) where R is a local ring. Then, we can give a slightly simpler description of \widehat{G}. Namely, let x_0\in X be the closed point and e_0\in G the image of x_0 under e. Then, we claim that \widehat{G}=\mathrm{Spf}(\widehat{\mathcal{O}_{G,e_0}}). But, this is fairly clear from the observation that the embedding e:X\to G must factor through \text{Spec}(\mathcal{O}_{G,e_0}).

Let us give some examples of this process of completion for common groups:

  • Consider the group G=\mathbf{G}_a/R. Then, as a formal scheme, \widehat{G}=\mathrm{Spf}(R[[T]]), and the comultiplication map

    m^\ast:R[[T]]\to R[[T]]\widehat{\otimes}_R R[[T]]=R[[X,Y]]

    is given by

    T\mapsto X+Y

    and the antipode is given by

    T\mapsto -T

  • Consider the group G=\mathbf{G}_m/R. Then, again, as an affine formal scheme \widehat{G}=\mathrm{Spf}(R[[T-1]]) with comultiplication

    T\mapsto XY

    and antipode given by

    \displaystyle T\mapsto \frac{1}{T}=\sum_{k\geqslant 0}(-1)^k(T-1)^k

  • If E is an elliptic curve over a field k, then \widehat{E} can be described explicitly using a given Weierstrass equation for E. For more detail on this see the relevant chapter of Silverman’s Arithmetic of Elliptic Curves.

Formal Lie groups and formal group laws

Now, while the general theory of formal groups is incredibly rich, there is a particular type which will be of interest to us. Namely, the so-called formal Lie groups. While these sound relatively scary, their definition is incredibly benign. Namely, a formal Lie group \mathfrak{G} over a formal scheme \mathfrak{X} is one such that there is a cover of \mathfrak{X} by affine formal schemes \{\mathrm{Spf}(A_i)\} such that

\mathfrak{G}\times_\mathfrak{X}\mathrm{Spf}(A_i)\cong \mathrm{Spf}(A_i[[T_1,\ldots,T_m]])

for some m. Thus, they are things which are locally (on the base) power series rings.

Remark: One can try to define the notion intrinsically. I think that assuming that \mathfrak{G} is formally smooth and finite type over \mathfrak{X} might be enough. That said, I am not sure and one might require that \omega_{\mathfrak{G}/\mathfrak{X}} is locally free.

The reason for the name formal Lie groups comes from thinking about the completion \widehat{G} of an algebraic group G over the field k. Namely, we expect \widehat{G} to be a formal Lie group when G itself is a Lie group or, equivalently, when G is smooth over k. But, by the last section we know that \widehat{G} is \mathrm{Spf}(\widehat{\mathcal{O}_{G,e}}) (where e denotes the identity section as well as its image). But, by standard algebraic geometry we know that G being smooth at e (or, equivalently, smooth everywhere) implies that

\widehat{\mathcal{O}_{G,e}}\cong k[[T_1,\ldots,T_n]]

as topological rings where n=\dim G^\circ.

In particular, notice that if \mathfrak{X} is a single point (e.g. \mathrm{Spf}(\mathbb{Z}_p) or \mathrm{Spec}(k)) then being a formal Lie group over \mathfrak{X} is equivalent to the statement that \mathfrak{G}\cong \mathrm{Spf}(A[[T_1,\ldots,T_n]]) for some n where \mathfrak{X}=\mathrm{Spf}(A) (since \mathfrak{X} is necessarily affine).

To simplify matters even more, we call a formal group \mathfrak{G}/\mathfrak{X}, where \mathfrak{X}=\mathrm{Spf}(A), a formal group law (of dimension n) if there is an isomorphism of pointed formal schemes

(\mathfrak{G},e)\cong (\mathrm{Spf}(A[[T_1,\ldots,T_n]]),0)

where, here, 0:\mathfrak{X}\to\mathrm{Spf}(A[[T_1,\ldots,T_n]]) corresponds to the continuous quotient map

A[[T_1,\ldots,T_n]]\to A[[T_1,\ldots,T_n]]/(T_1,\ldots,T_n)\xrightarrow{\approx}A

Thus, not only is \mathfrak{G} isomorphic as a formal scheme to \mathrm{Spf}(A[[T_1,\ldots,T_n]]) but it carries the identity of \mathfrak{G} to the natural zero section of \mathrm{Spf}(A[[T_1,\ldots,T_n]]).

Now, the relative simplicity of the underlying formal schemes of a formal group law allows for a more classical/low-brow description. In particular, let us define a classical n-dimensional formal group law over A to be a set of n power series F_i(\underline{X},\underline{Y})\in A[[X_1,\ldots,X_n,Y_1,\ldots,Y_n]] such that the following axioms hold true, where X=(X_1,\ldots,X_n), Y=(Y_1,\ldots,Y_n), Z=(Z_1,\ldots,Z_n), and F(X,Y)=(F_1(X,Y),\ldots,F_n(X,Y)):

  1. F(X,F(Y,Z))=F(F(X,Y),Z)
  2. F_i(X,Y)=X_i+Y_i+\text{higher order terms}

One can then show that there must exist n power series i_1(X),\ldots,i_n(X) in the variables X such that if i(X)=(i_1(X),\ldots,i_n(X)) then F(i(X),X)=F(X,i(X))=0.
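
In the one-dimensional case the existence of the inverse series can be seen by solving F(X,i(X))=0 one coefficient at a time. Here is a minimal sketch of that procedure, assuming a one-dimensional law given as a finite dictionary of coefficients (the representation and the function names are my own, purely for illustration). As a test case we take F(X,Y)=X+Y+XY, i.e. the law (1+X)(1+Y)-1 coming from \mathbf{G}_m.

```python
from fractions import Fraction

def series_mul(a, b, order):
    """Multiply two power series in X (coefficient lists), dropping terms above X^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def substitute(F, y, order):
    """Coefficients of F(X, y(X)) mod X^(order+1), where F = {(i, j): coeff of X^i Y^j}."""
    total = [Fraction(0)] * (order + 1)
    for (i, j), coeff in F.items():
        if i > order:
            continue
        term = [Fraction(0)] * (order + 1)
        term[i] = Fraction(coeff)           # start from coeff * X^i
        for _ in range(j):                  # multiply by y(X) a total of j times
            term = series_mul(term, y, order)
        total = [t + s for t, s in zip(total, term)]
    return total

def inverse_series(F, order):
    """Solve F(X, i(X)) = 0 for i(X) = c_1 X + c_2 X^2 + ... one coefficient at a time."""
    inv = [Fraction(0)] * (order + 1)
    for k in range(1, order + 1):
        # Since F = X + Y + (higher order terms), the X^k coefficient of F(X, i(X))
        # is c_k plus a quantity that depends only on c_1, ..., c_{k-1}.
        residual = substitute(F, inv, order)[k]
        inv[k] = -residual
    return inv

ORDER = 8
F_mult = {(1, 0): 1, (0, 1): 1, (1, 1): 1}    # F(X, Y) = X + Y + XY
inv = inverse_series(F_mult, ORDER)
check = substitute(F_mult, inv, ORDER)         # F(X, i(X)) mod X^(ORDER+1)

print(inv[1:])
```

For this particular law the computed coefficients alternate in sign, matching the closed form i(X)=-X/(1+X).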

The relationship between formal group laws and classical formal group laws is fairly clear. Namely, every classical formal group law naturally gives \mathrm{Spf}(A[[T_1,\ldots,T_n]]) the structure of a formal group law. Conversely, given a formal group law \mathfrak{G} and an isomorphism (\mathfrak{G},e)\cong (\mathrm{Spf}(A[[T_1,\ldots,T_n]]),0) we obtain a classical formal group law by transport of structure. Thus, the category of classical formal group laws (defined in the obvious way) over A is equivalent to the category of formal group laws with fixed isomorphism to (\mathrm{Spf}(A[[T_1,\ldots,T_n]]),0).

One can rephrase this in a nice way if one thinks about the functor of points of \mathrm{Spf}(A[[T_1,\ldots,T_n]]). Namely, how can we describe

\text{Hom}_{\mathsf{FmlSch}}(\mathrm{Spf}(R),\mathrm{Spf}(A[[T_1,\ldots,T_n]]))

where \mathrm{Spf}(R) is an affine formal scheme over \mathfrak{X}=\mathrm{Spf}(A)? Well, we know that this set is naturally identified with the set of continuous A-algebra maps A[[T_1,\ldots,T_n]]\to R. But what are these? Well, note that any such map is determined by where the T_i map. Since the T_i are topologically nilpotent in A[[T_1,\ldots,T_n]], we see that their images land in R^{\circ\circ} (the set of topologically nilpotent elements). Thus, we obtain a natural map

\text{Hom}_{\mathsf{FmlSch}}(\mathrm{Spf}(R),\mathrm{Spf}(A[[T_1,\ldots,T_n]]))\hookrightarrow (R^{\circ\circ})^n

which is easily seen to be a bijection since any such assignment of the T_i can be extended to a continuous map on all of A[[T_1,\ldots,T_n]] by the completeness of R. Thus, we see that a classical formal group law is, essentially, a functorial way of assigning a group structure to (R^{\circ\circ})^n (for adic rings R) such that the identity element is (0,\ldots,0) (since this is where the zero section of \mathrm{Spf}(A[[T_1,\ldots,T_n]]) goes).

As an example of this, let us show how to turn \widehat{\mathbf{G}_m} into a formal group law. Namely, we need to ‘reindex’ the operations so that the counit shifts from being centered at 1 to being centered at 0. But, to do this we need only do the change of variables T\leadsto T-1. Then, we see that the associated formal group law is the power series X+Y+XY=(X+1)(Y+1)-1.
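
To see the functor-of-points description in action, here is a small sanity check with the law F(X,Y)=X+Y+XY. The ring R=\mathbb{Z}/27 (with the discrete topology, so that R^{\circ\circ} is just the set of nilpotent elements) is my own illustrative choice, not something forced by the discussion above. The sketch verifies that F really does put a group structure on R^{\circ\circ} with identity 0, and that x\mapsto 1+x identifies it with a subgroup of the units of R.

```python
MOD = 27  # the discrete ring R = Z/27 (an illustrative choice)

# For a discrete ring, "topologically nilpotent" just means nilpotent.
# In Z/27 these are the multiples of 3, and they all satisfy x^3 = 0.
nilpotents = [x for x in range(MOD) if pow(x, 3, MOD) == 0]

def F(x, y):
    """The multiplicative formal group law X + Y + XY, evaluated in Z/27."""
    return (x + y + x * y) % MOD

closed = all(F(x, y) in nilpotents for x in nilpotents for y in nilpotents)
identity = all(F(x, 0) == x == F(0, x) for x in nilpotents)
associative = all(
    F(x, F(y, z)) == F(F(x, y), z)
    for x in nilpotents for y in nilpotents for z in nilpotents
)
has_inverses = all(any(F(x, y) == 0 for y in nilpotents) for x in nilpotents)

# x -> 1 + x turns F into honest multiplication of units congruent to 1 mod 3.
multiplicative = all(
    (1 + F(x, y)) % MOD == ((1 + x) * (1 + y)) % MOD
    for x in nilpotents for y in nilpotents
)

print(closed, identity, associative, has_inverses, multiplicative)
```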

Remark: Just as in the case of p-divisible groups, formal Lie groups over \mathbb{Q}-algebras can be handled by essentially linear algebra. In particular, as in the case of normal Lie groups, the category of formal Lie groups over a \mathbb{Q}-algebra is equivalent to the category of ‘formal Lie algebras’. See Serre’s book on Lie groups and Lie algebras for a proof of this. For one-dimensional formal group laws this manifests itself by saying that all one-dimensional formal group laws \mathfrak{G} over a \mathbb{Q}-algebra are isomorphic to \widehat{\mathbf{G}_a}. This is by a generalization of the ‘exponential map’ between a Lie algebra and its Lie group. In particular, for, say, \widehat{\mathbf{G}_m} this is given by the usual logarithm map T\mapsto \displaystyle \sum_{i\geqslant 1} \frac{(-1)^{i+1}}{i}T^i, which gives an isomorphism \widehat{\mathbf{G}_m}\to\widehat{\mathbf{G}_a}.
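
The claim about the logarithm can be checked by hand to low order, or by machine to slightly higher order. The following sketch (over \mathbb{Q}, with a truncation degree N that I picked arbitrarily, and with a home-made dictionary representation of two-variable polynomials) verifies that \ell(T)=\sum_{i\geqslant 1}\frac{(-1)^{i+1}}{i}T^i satisfies \ell(X+Y+XY)=\ell(X)+\ell(Y) modulo terms of total degree greater than N.

```python
from fractions import Fraction

N = 6  # work modulo monomials of total degree > N

def mul(a, b):
    """Product of polynomials {(i, j): coeff of X^i Y^j}, truncated at total degree N."""
    out = {}
    for (i, j), ca in a.items():
        for (k, l), cb in b.items():
            if i + j + k + l <= N:
                key = (i + k, j + l)
                out[key] = out.get(key, Fraction(0)) + ca * cb
    return out

def log_series(t):
    """ell(t) = sum_{i=1}^{N} (-1)^(i+1) t^i / i, for t with zero constant term.

    Since t has no constant term, t^i vanishes modulo degree > N once i > N,
    so stopping the sum at i = N loses nothing.
    """
    result = {}
    power = {(0, 0): Fraction(1)}
    for i in range(1, N + 1):
        power = mul(power, t)
        for key, c in power.items():
            result[key] = result.get(key, Fraction(0)) + Fraction((-1) ** (i + 1), i) * c
    return result

def nonzero(poly):
    """Drop monomials whose coefficient cancelled to zero."""
    return {key: c for key, c in poly.items() if c != 0}

X = {(1, 0): Fraction(1)}
Y = {(0, 1): Fraction(1)}
F = {(1, 0): Fraction(1), (0, 1): Fraction(1), (1, 1): Fraction(1)}  # X + Y + XY

lhs = nonzero(log_series(F))
# ell(X) only involves pure powers of X and ell(Y) pure powers of Y,
# so merging the two dictionaries is the same as adding the polynomials.
rhs = nonzero({**log_series(X), **log_series(Y)})

print(lhs == rhs)
```

Note that the denominators i are exactly why this only works over \mathbb{Q}-algebras.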

Formal groups in terms of sheaves

While it is certainly very nice to have the concrete picture of formal groups (really formal group laws) as being certain power series in A[[X_1,\ldots,X_n,Y_1,\ldots,Y_n]] it’s convenient to be able to think about these things in terms of sheaves, especially in the clarification of the construction in the next section.

To begin, let us clarify our claim that our definition of formal schemes allows us to, in a natural way, view our completions \widehat{Y} as being direct limits in the category of sheaves/presheaves on schemes.

Let us begin by making the following observation. For any affine formal scheme \mathrm{Spf}(A) and affine scheme \mathrm{Spec}(R) (thought of as a discrete affine formal scheme) we have the following natural identification

\mathrm{Hom}_{\mathsf{FmlSch}}(\text{Spec}(R),\mathrm{Spf}(A))\cong \varinjlim \text{Hom}_{\mathsf{Sch}}(\text{Spec}(R),\text{Spec}(A/I^n))

where I is any ideal of definition of A. Indeed, this follows quite easily from the fact that

\mathrm{Hom}_{\mathsf{FmlSch}}(\text{Spec}(R),\mathrm{Spf}(A))=\text{Hom}_{\text{cont.}}(A,R)

and since R is discrete we know that any continuous ring map f:A\to R has open kernel. Thus, I^n\subseteq\ker(f) for some n which allows one to prove the claim. Since this identification is natural in R we deduce that

\mathrm{Hom}_{\mathsf{FmlSch}}(-,\mathrm{Spf}(A))=\varinjlim \text{Hom}_{\mathsf{Sch}}(-,\mathrm{Spec}(A/I^n))

This identification is independent of I as is clearly seen—in fact, to make things natural, one might as well take I=A^{\circ\circ}.

More generally, we claim that for any formal scheme \mathfrak{X} we can make the following identification:

\mathrm{Hom}_{\mathsf{FmlSch}}(-,\mathfrak{X})=\varinjlim \mathrm{Hom}_{\mathsf{Sch}}(-,X_n)

which follows from the exact same idea. Thus, again, we can think of \mathrm{Hom}_\mathsf{FmlSch}(-,\mathfrak{X}) as being \varinjlim \mathrm{Hom}_\mathsf{Sch}(-,X_n).

To make this identification useful, we want that the functor

\mathsf{FmlSch}\to \widehat{\mathsf{AffSch}}

(where \mathsf{AffSch} is the category of affine schemes) is actually fully faithful. But, this is clear from our earlier discussion that we can identify \mathsf{FmlSch} with the category of inductive systems of thickenings (X_n).

Note, moreover, that since \mathsf{AffSch} with the fppf topology is a quasi-compact site the colimit of presheaves \varinjlim \text{Hom}_{\mathsf{Sch}}(-,X_n) is still a sheaf (see this) and thus we’ve obtained an embedding

\mathsf{FmlSch}\hookrightarrow \mathrm{Sh}\left(\mathsf{AffSch}_{\text{fppf}}\right)

which allows us to think about formal schemes as being special types of fppf sheaves.

Now, there is no reason that we need to restrict our attention to affine schemes. Namely, we can really upgrade this to an embedding

\mathsf{FmlSch}\hookrightarrow \mathrm{Sh}\left(\mathsf{Sch}_\text{fppf}\right)

And, of course, we can deduce that for any scheme S this generalizes to give a fully faithful embedding

\mathsf{FmlSch}/S\hookrightarrow \mathrm{Sh}\left((\mathsf{Sch}/S)_{\text{fppf}}\right)

In particular, if \mathfrak{G}/S is a formal group scheme, then its image is an abelian fppf sheaf on S.

It would be particularly nice if we could identify when an abelian fppf sheaf \mathcal{F} is actually a formal Lie group by intrinsic properties. Somewhat surprisingly this is, in fact, the case. To state the criteria clearly we first need to set up some notation.

So, let’s assume that \mathcal{F} is an abelian fppf sheaf on S. Then, for all k\geqslant 1 we define a sheaf \mathrm{Inf}^k\mathcal{F}, called the k^\text{th} infinitesimal neighborhood of \mathcal{F}, as follows:

\mathrm{Inf}^k\mathcal{F}(T)=\left\{\alpha\in\mathcal{F}(T):\,\, \begin{matrix}\text{There exists an fppf cover }T'\to T\text{ , and a}\\ \text{nilpotent thickening } T''\hookrightarrow T'\text{ of degree }k+1\\ \text{ such that }\alpha\mid_{T''}=0\end{matrix}\right\}

One can gain intuition for this from the observation that if G is a separated group scheme over X, then \mathrm{Inf}^k G is essentially the k^\text{th}-order formal neighborhood of the identity section (i.e. it’s representable by V(\mathcal{I}^{k+1}) if \mathcal{I} is the ideal cutting out the identity section of G).

We shall also say that an abelian fppf sheaf \mathcal{F} over S is formally smooth if it satisfies Grothendieck’s infinitesimal lifting criterion. Namely, for all affine S-schemes \text{Spec}(A) and ideals I\subseteq A such that I^2=0 we require that the map

\mathcal{F}(A)\to \mathcal{F}(A/I)

is surjective.

We then have the following very nice characterization of when an abelian fppf sheaf is a formal Lie group (i.e. the image of a formal Lie group under our above embedding):

Theorem 9: Let \mathcal{F} be an abelian fppf sheaf. Then, \mathcal{F} is a formal Lie group if and only if:

  1. The sheaf \mathcal{F} is infinitesimal: \mathcal{F}=\varinjlim\mathrm{Inf}^k\mathcal{F}.
  2. The sheaf is formally smooth.
  3. Each \mathrm{Inf}^k\mathcal{F} is representable.

This is extremely convenient because it allows the creation of formal Lie groups in a much more palatable way. Namely, we first construct an abelian fppf sheaf and then verify that it satisfies the above properties. This is generally much simpler since the category of fppf sheaves is much larger and so, consequently, much easier to construct objects in. One should think about this like most results of a similar flavor. For example, the category of algebraic spaces/algebraic stacks is nice because it gives us a huge category to construct objects in and then later, if desired, try to use the properties of our constructed algebraic space/stack to show that it actually lives in the category of schemes.

Tate’s theorem

Now that we have some basic language down, we are in the position to ask how formal groups and p-divisible groups relate. The answer is a famed theorem of Tate. But, to state it rigorously we first need to whittle down the set of formal groups/p-divisible groups we are going to consider.

Let us say that an n-dimensional formal group law \mathfrak{G}/\text{Spf}(A) is p-divisible if the continuous map [p]^\ast:A[[T_1,\ldots,T_n]]\to A[[T_1,\ldots,T_n]], associated to the multiplication by p map [p], makes A[[T_1,\ldots,T_n]] into a free module over itself.

Let us give two examples/non-examples:

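  • Consider the formal group law \widehat{\mathbf{G}_m} over \mathbb{Z}_p. Then, this is p-divisible. Indeed, in the coordinate centered at 0 the map [p]^\ast sends T to (1+T)^p-1 (that is, 1+T goes to (1+T)^p). Since (1+T)^p-1 is a monic polynomial of degree p all of whose lower coefficients are divisible by p, a Weierstrass preparation argument shows that \mathbb{Z}_p[[T]] is free over itself via [p]^\ast, with basis 1,T,\ldots,T^{p-1}.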
  • Consider the formal group law \widehat{\mathbf{G}_a} over \mathbb{Z}_p. We claim that this is not p-divisible. Indeed, [p]^\ast is given by T\mapsto pT. But, if this made \mathbb{Z}_p[[T]] a free module over itself then, by base changing to \mathbb{F}_p, we’d have that the map \mathbb{F}_p[[T]]\to \mathbb{F}_p[[T]] sending T to 0 makes \mathbb{F}_p[[T]] a free module over itself, which is ridiculous.
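
The contrast between these two examples is visible already at the level of coefficients. In the coordinate centered at 0, multiplication by p on \widehat{\mathbf{G}_m} is T\mapsto (1+T)^p-1 (this is T\mapsto T^p in the coordinate centered at 1), while on \widehat{\mathbf{G}_a} it is T\mapsto pT. The sketch below, for the arbitrarily chosen prime p=5 and with function names of my own invention, checks that the former is a monic polynomial of degree p whose lower coefficients are all divisible by p (so it reduces to T^p mod p), whereas the latter reduces to 0 mod p.

```python
from math import comb

def mult_by_p_Gm(p):
    """Coefficients (degrees 0..p) of [p](T) = (1+T)^p - 1 for the multiplicative law."""
    return [0] + [comb(p, k) for k in range(1, p + 1)]

def is_distinguished(coeffs, p):
    """True if the polynomial is monic and all lower coefficients are divisible by p."""
    return coeffs[-1] == 1 and all(c % p == 0 for c in coeffs[:-1])

p = 5
gm = mult_by_p_Gm(p)             # (1+T)^5 - 1
ga = [0, p]                      # [p](T) = pT for the additive law

gm_mod_p = [c % p for c in gm]   # reduction mod p of the multiplicative [p]
ga_mod_p = [c % p for c in ga]   # reduction mod p of the additive [p]

print(gm, is_distinguished(gm, p))
print(gm_mod_p, ga_mod_p)
```

So mod p the multiplicative [p] becomes the Frobenius-like map T\mapsto T^p, which is finite free of rank p, while the additive [p] collapses entirely.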

The intuition is that \mathfrak{G} should be p-divisible if \mathfrak{G}(R) is a p-divisible group (in the usual sense from group theory) for all A-algebras R. Essentially, this is because the fiber of x\in \mathfrak{G}(R) under the map [p]:\mathfrak{G}(R)\to\mathfrak{G}(R) should be the fibered product of the diagram

\begin{matrix} &  & \mathfrak{G}\\ & & \downarrow^{[p]}\\ \mathrm{Spf}(R) &\xrightarrow{x} & \mathfrak{G}\end{matrix}

which is going to be non-empty. In particular, if \mathfrak{G}=\text{Spf}(T) then this fibered product is the formal spectrum of T\widehat{\otimes}_T R, which is non-zero since T is free over T via [p]^\ast.

Now, one can show the following fairly easily (although we omit the details):

Theorem 10: Let A be a complete Noetherian local ring with residue characteristic p, and let \mathfrak{G} be an n-dimensional p-divisible formal group law over A. Then, for every m\geqslant 1, the scheme

\displaystyle \mathfrak{G}[p^m]:=\mathrm{Spec}\left(\frac{A[[T_1,\ldots,T_n]]}{([p^m]^\ast(T_1),\ldots,[p^m]^\ast(T_n))}\right)

is a finite flat group scheme of p-power order. Moreover, |\mathfrak{G}[p^m]|=|\mathfrak{G}[p]|^m.

This follows, essentially, from the above remarks. Namely, \mathfrak{G}[p^m] is represented by T\widehat{\otimes}_T A, where T=A[[T_1,\ldots,T_n]] is viewed as a module over itself via [p^m]^\ast and A as a T-algebra via the zero section. This is free over A of rank equal to the rank of [p^m]^\ast=([p]^\ast)^m, which is the m^\text{th} power of the rank of [p]^\ast.

Thus, we see that the inductive system (\mathfrak{G}[p^n]) forms a p-divisible group over A which we denote \mathfrak{G}[p^\infty]. The astounding observation of Tate is then the following:

Theorem 11 (Tate’s theorem): Let A be a complete Noetherian local ring with residue characteristic p. Then, the association \mathfrak{G}\leadsto \mathfrak{G}[p^\infty] is an equivalence of categories between p-divisible formal Lie groups over A and connected p-divisible groups over A.

One observation to make is the following. One can show that any formal Lie group \mathfrak{G} over \mathrm{Spf}(A) is necessarily a formal group law (i.e. \mathfrak{G}\cong\mathrm{Spf}(A[[T_1,\ldots,T_n]])) and thus in this setup all we need to consider are formal group laws.

Now, we won’t attempt to prove Tate’s theorem (a nice description of the results can be found here), but we can at least describe the inverse map. Namely, let G/A be a connected p-divisible group. Note that each G[p^n]\hookrightarrow G[p^{n+1}] must be an infinitesimal thickening (by Artinian considerations) and thus \{G[p^n]\} actually forms a family of infinitesimal thickenings. Thus, let \mathfrak{G}=\varinjlim G[p^n] be the formal group one gets by taking the direct limit in \mathsf{LTRS}. This is the inverse to \mathfrak{G}\mapsto \mathfrak{G}[p^\infty].

Using our discussion of formal Lie groups as abelian fppf sheaves this can be phrased much more nicely. Namely, it becomes tautological in some sense. The claim is that given a connected p-divisible group G/A, the abelian fppf sheaf that is G is a formal Lie group (i.e. it satisfies the conditions of Theorem 9). Conversely, if \mathfrak{G}/A is a p-divisible formal group law then \mathfrak{G}, thought of as an fppf sheaf, is p-divisible. Thus, this becomes less of an association and, instead, becomes a verification check.

Remark: Depending on your bent, this last rephrasing of Tate’s theorem is terrible. We’ve taken something really concrete like formal group laws and turned them into something as incomprehensible as fppf sheaves. In some sense, I agree with this sentiment. I chose to phrase it in this way, or at least mention the phrasing, for two reasons. First, it is this perspective which more easily generalizes (e.g. as in Messing’s dreaded text). Second, it makes the association seem more, well, natural.

When I first learned about this result I was extremely confused. Why should formal groups and p-divisible groups have anything to do with one another? Thinking about things in terms of fppf sheaves makes this entirely clear. Both p-divisible groups and formal Lie groups are abelian fppf sheaves and, as mentioned above, the connection then becomes a study of properties of some fppf sheaf as opposed to a (to me) magical association of ostensibly disparate objects.

With this theorem in hand, for a p-divisible group G/A we define the dimension of G to be the dimension of the formal Lie group associated to G^\circ.

Let us look at some examples:

  • The formal Lie group associated to the p-divisible group \mu_{p^\infty}/\mathbb{Z}_p is \widehat{\mathbf{G}_m}. Thus, we see that the dimension of \mu_{p^\infty} is 1. Note that implicit in the statement that \widehat{\mathbf{G}_m} is p-divisible is the somewhat nebulous claim that \widehat{\mathbf{G}_m}=\varinjlim \mu_{p^n} or, written in the language of rings

    \mathbb{Z}_p[[T]]=\varprojlim \mathbb{Z}_p[T]/(T^{p^n}-1)

    which, to me, is non-obvious (even though it’s true!). Of course, this is much more obvious over the residue field where the right hand side reduces to \varprojlim \mathbb{F}_p[T]/(T-1)^{p^n}.

  • The formal Lie group associated to E[p^\infty], where E is an elliptic curve, is \widehat{E} (which is a formal Lie group since E is smooth). Thus, we see that the dimension of E[p^\infty] is also 1.
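
To make the ring identity in the first example a little less nebulous, here is a sketch of why it holds. It assumes the change of variable X=T-1 (so that the identity section corresponds to X=0), and the symbols \omega_n and \mathfrak{m} are notation introduced only for this sketch, not taken from the discussion above.

```latex
% Notation (ours): X = T - 1, \mathfrak{m} = (p, X) the maximal ideal of \mathbb{Z}_p[[X]],
% and \omega_n = (1+X)^{p^n} - 1, so that \mathbb{Z}_p[T]/(T^{p^n}-1) = \mathbb{Z}_p[X]/(\omega_n).
\begin{align*}
\omega_{n+1} &= (1+\omega_n)^p-1
  = \omega_n\Bigl(\omega_n^{p-1}+\sum_{k=1}^{p-1}\binom{p}{k}\omega_n^{k-1}\Bigr)
  \in \mathfrak{m}\cdot(\omega_n), \\
\omega_0 &= X\in\mathfrak{m}
  \quad\Longrightarrow\quad
  \omega_n\in\mathfrak{m}^{n+1}\ \text{for all } n\geqslant 0, \\
\omega_n &\equiv X^{p^n} \pmod{p}.
\end{align*}
```

The first line uses that p divides \binom{p}{k} for 0<k<p and that \omega_n\in\mathfrak{m}. Since \omega_n is a distinguished polynomial (monic, and congruent to X^{p^n} modulo p), Weierstrass division identifies \mathbb{Z}_p[[X]]/(\omega_n) with \mathbb{Z}_p[X]/(\omega_n). Since the \omega_n tend to 0 in the \mathfrak{m}-adic topology, one expects (by a standard completeness and compactness argument, which we do not spell out) that \mathbb{Z}_p[[X]]\to\varprojlim \mathbb{Z}_p[X]/(\omega_n) is an isomorphism. The last line is the residue field statement: modulo p the right hand side is just \varprojlim \mathbb{F}_p[X]/(X^{p^n})=\mathbb{F}_p[[X]].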

Again, while we won’t discuss the proof of Tate’s theorem, it’s worth mentioning that the hardest part of the theorem is showing that if G is a connected p-divisible group then G is formally smooth. This is especially surprising considering that none of the finite pieces G[p^n] are formally smooth! This, again, highlights our earlier remark that p-divisible groups as a system/sheaf enjoy properties their finite constituents could only dream of having.

A nice result in this direction is the following powerful theorem of Messing which, at least spiritually, is what is going on above (although at a much more sophisticated level):

Theorem(Messing): Let S be a scheme on which p is locally nilpotent. Then, every p-divisible group on S is formally smooth.

The sort of amazing thing is that combining Tate’s theorem with the connected étale sequence, assuming R is a complete local Noetherian ring (so that both apply), we then see that any p-divisible group G/R sits in a short exact sequence 0\to G^\circ\to G\to G^\mathrm{\acute{e}t}\to 0. So, it’s made up of something like a component group \pi_0(G)=G^\mathrm{\acute{e}t} and G^\circ, the torsion in an infinitesimal neighborhood of a Lie group.

This suggests that it’s really G^\circ which is the ‘substantive’ part of G, and thus the dimension of G (as defined above) really is measuring something like the actual size of G.

The Serre-Tate theorem


As stated in the motivation, the Serre-Tate theorem relates the deformation theory of abelian schemes to the deformation theory of their p-divisible groups. So, let us begin by making this more precise.

Let S_0 be a scheme, and S_0\hookrightarrow S an infinitesimal thickening. Assume moreover that p is locally nilpotent on S. Let us define two categories associated to these objects. First, we have the category \mathsf{AbVar}(S) of abelian schemes A/S. Second, we have the category \mathcal{C}(S_0) which consists of triples (A_0,G,\iota) where A_0/S_0 is an abelian scheme, G/S is a p-divisible group, and \iota is an isomorphism G_{S_0}\xrightarrow{\approx}A_0[p^\infty]—the morphisms in \mathcal{C}(S_0) are the obvious ones.

We then have a natural functor \Phi:\mathsf{AbVar}(S)\to\mathcal{C}(S_0) given by taking A/S to (A_{S_0},A[p^\infty],\mathrm{id}). The Serre-Tate theorem then says the following:

Theorem 12(Serre-Tate): The functor \Phi is an equivalence of categories.

In particular, we see that if we fix A_0, then the above says that there is an equivalence of categories between deformations(=lifts) of A_0 to S, and deformations of A_0[p^\infty] to S.

We will not give a proof of this theorem here. The modern proof is due to Drinfeld and is found at the beginning of Katz’s article Serre-Tate Local Moduli. Surprisingly, it is relatively simple. That said, it’s only simple if one takes for granted the above-stated result of Messing concerning the formal smoothness of p-divisible groups over our considered S.

It should be said, in general, that proving that abelian schemes deform is actually not that difficult. Once one understands that the deformation space of deforming a smooth scheme X_0/S_0 to X/S is H^2(X_0,T_{X_0/S_0}\otimes\mathcal{I}) (where \mathcal{I} is the ideal sheaf of S_0 in S), then one can just leverage the multiplication map m:A_0\times A_0\to A_0 to show that the obstruction class o(A_0) is trivial. One can play the same game to lift the multiplication map.

Remark: For those who aren’t familiar with deformation theory, let me clarify some of the above. A deformation space associated to a deformation problem, say deforming ‘blah’s over S_0 to S, is a vector space V such that for any blah B_0/S_0 there is a class(=element) o(B_0)\in V which is zero if and only if B_0 has a deformation. For the deformation problem of deforming smooth schemes, the deformation space for deforming from S_0 to S is H^2(X_0,T_{X_0/S_0}\otimes\mathcal{I}), where T_{X_0/S_0} is the tangent bundle, and we view \mathcal{I} (which is an ideal sheaf on S, or equivalently a sheaf on S_0, since they have the same underlying topological space) as a sheaf on X_0 via pullback along X_0\to S_0.

No, the deep thing is not that we can deform, but that we can understand the deformations (and, in particular, the maps between them) purely in terms of the associated p-divisible groups. This, as stated in the motivation, comes down to linear algebra as made precise by Grothendieck-Messing theory which says, in a glib sense, that deforming a p-divisible group is the same thing as deforming its Hodge filtration (a subbundle of its Dieudonné crystal) which, in theory, is purely linear algebra.

Cohomological consequence

The Serre-Tate theorem also has a fascinating consequence in explaining how one might adapt our ‘linear algebra’ understanding of elliptic curves over \mathbb{C} to an understanding of elliptic curves over, say, \mathbb{Q}_p.

Recollection of the complex picture

Remark: This is purely review in some sense. This material is probably well-known to many readers. So, unless you don’t understand the slogans at the end of this section (those in block quotes) you are probably safe to skip this section.

Let us begin by explaining what is meant by our ‘linear algebra’ understanding of elliptic curves (or abelian varieties!) over \mathbb{C}. Namely, it’s a well-known fact that every abelian variety A/\mathbb{C} has the property that its analytification is a torus. Less cryptically

A^\text{an}\cong X_\Lambda

as a compact complex Lie group where \Lambda\subseteq\mathbb{C}^n is a (full) lattice and X_\Lambda:=\mathbb{C}^n/\Lambda. Then, by GAGA, this gives us a fully faithful embedding

\mathsf{AbVar}(\mathbb{C})\hookrightarrow \left\{\begin{matrix}\text{Complex tori with}\\\text{holomorphic group maps}\end{matrix}\right\}

This is particularly nice because complex tori can, in a rigorous way, be understood in terms of linear algebra.

More specifically, any map of complex Lie groups f:X_\Lambda\to X_{\Lambda'} is of the form

f(z+\Lambda)=T(z)+\Lambda'

where T is a \mathbb{C}-linear map \mathbb{C}^n\to \mathbb{C}^m (if 2m is the rank of \Lambda') such that T(\Lambda)\subseteq \Lambda'. Thus, in fact, we get an injection

\text{Hom}(X_\Lambda,X_{\Lambda'})\hookrightarrow \text{Hom}(\Lambda,\Lambda')

which gives us the sense that, perhaps, we might be able to understand complex tori, and thus abelian varieties, entirely in terms of linear algebra objects (i.e. lattices).

Of course, there are a few obstacles to giving such a purely linear algebra description. They are as follows:

  • How do we canonically describe the lattice \Lambda for a torus X_\Lambda?
  • Amongst tori, how do we pick out the abelian varieties?
  • What maps of lattices \Lambda\to\Lambda' actually come from maps of tori X_{\Lambda}\to X_{\Lambda'}?

Two of these questions are ‘easy’ (with the correct perspective) and one is non-trivial. Specifically, we shall be able to answer the first and last question by sitting below a waterfall and attempting to divine the face of (linear algebra) God. The second is a deep theorem that we shall only state.

So, let us begin with the easiest of these questions, the first one. Namely, given just the complex Lie group X_\Lambda how can we functorially recover the lattice \Lambda? The key observation is that the quotient map

q:\mathbb{C}^n\to X_\Lambda

is evidently a covering map, and thus \pi_1(X_\Lambda,0) is naturally identified with the group of deck transformations of q. That said, it’s obvious that these are just \Lambda itself. Thus, an internal-to-the-torus way of describing \Lambda is as the fundamental group of the torus or, to make things more ‘linear algebra-esque’, the homology. Indeed, since \pi_1(X_\Lambda,0) is abelian, we can guilt-free identify it with H_1(X_\Lambda,\mathbb{Z}).

Ok, so, now the question is what the image of the map

\text{Hom}(X_\Lambda,X_{\Lambda'})\hookrightarrow \text{Hom}(\Lambda,\Lambda')

is. Namely, which maps of lattices \Lambda\to\Lambda' come from maps of tori. Let us begin by noticing that realizing \Lambda as H_1(X_\Lambda,\mathbb{Z}) provides \Lambda with more structure than that of just an abelian group. In particular, we have the following chain of natural isomorphisms:

H_1(X_\Lambda,\mathbb{Z})\otimes_\mathbb{Z}\mathbb{R}\cong \Lambda\otimes_\mathbb{Z}\mathbb{R}\cong \mathbb{C}^n

Indeed, the definition of \Lambda\subseteq\mathbb{C}^n being a full lattice is that it contains an \mathbb{R}-basis for \mathbb{C}^n. That said, note that this last isomorphism actually carries a bit more information. The first isomorphism was purely internal to the lattice, in some sense, but the last really depended on the embedding of \Lambda into \mathbb{C}^n. In particular, the last isomorphism imbues H_1(X_\Lambda,\mathbb{Z})\otimes_\mathbb{Z}\mathbb{R} with a structure it wouldn’t otherwise have: a complex structure.

It’s also clear that this complex structure is the key to answering the third bulleted question. Namely, a map of lattices \Lambda\to\Lambda' comes from a map of tori X_\Lambda\to X_{\Lambda'} precisely when the induced \mathbb{R}-linear map

\Lambda\otimes_\mathbb{Z}\mathbb{R}\to \Lambda'\otimes_\mathbb{Z}\mathbb{R}

is, in fact, complex linear (with the above mentioned complex structures). But, there is an even more pleasing way to phrase this result which opens it up to vast generalization.
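
As a small sanity check of this criterion, consider the following example (ours, and not needed for what follows). Take n=1 and \Lambda=\Lambda'=\mathbb{Z}[i]\subseteq\mathbb{C} with \mathbb{Z}-basis (1,i), so that the complex structure on \Lambda\otimes_\mathbb{Z}\mathbb{R} is given by the matrix J below. The names M_i and M_c are hypothetical labels for the lattice maps ‘multiplication by i’ and ‘complex conjugation’.

```latex
% Matrices with respect to the basis (1, i) of \Lambda = \mathbb{Z}[i].
% J: the complex structure, M_i: multiplication by i, M_c: complex conjugation.
\[
J=\begin{pmatrix}0&-1\\1&0\end{pmatrix},\qquad
M_i=\begin{pmatrix}0&-1\\1&0\end{pmatrix},\qquad
M_c=\begin{pmatrix}1&0\\0&-1\end{pmatrix}
\]
\[
M_iJ=JM_i,\qquad
M_cJ=\begin{pmatrix}0&-1\\-1&0\end{pmatrix}
\neq\begin{pmatrix}0&1\\1&0\end{pmatrix}=JM_c
\]
```

Both M_i and M_c are perfectly good maps of lattices \Lambda\to\Lambda, but only M_i commutes with J, i.e. is complex linear. So M_i comes from a holomorphic endomorphism of X_\Lambda, while M_c (which is only antiholomorphic) does not.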

Let us recall that an R-Hodge structure, for R\subseteq\mathbb{C} some ring, is a free R-module M together with a fixed decomposition

\displaystyle M\otimes_R\mathbb{C}=\bigoplus_{p,q}V^{p,q}

with V^{p,q} a \mathbb{C}-subspace, such that, under the conjugation operation

\overline{m\otimes z}:= m\otimes\overline{z}

we have that \overline{V^{p,q}}=V^{q,p}. We call the set of pairs (p,q) such that V^{p,q}\ne 0 the type of M.

Remark: For the rest of this post we shall write, for an R-module M and R-algebra S, the S-module M\otimes_R S as M_S.

The most classical example, and why one might imagine that Hodge structures are coming into play here, is that provided by complex Hodge theory. Namely, if X is a compact Kähler manifold (e.g. a projective algebraic variety) then Hodge theory provides a canonical decomposition

\displaystyle H^i_\text{sing}(X,\mathbb{Z})_\mathbb{C}\cong\bigoplus_{p+q=i}H^q(X,\Omega^p_{\text{hol.}})

(for details and notation see this post) thus making H^1_{\text{sing}}(X,\mathbb{Z}) into a \mathbb{Z}-Hodge structure.

So, now, the most apropos aspect of Hodge structures is the following statement: giving an \mathbb{R}-space a complex structure is equivalent to making it into an \mathbb{R}-Hodge structure of type \{(-1,0),(0,-1)\}. Explicitly, for a complex structure on V (an \mathbb{R}-space) one considers the decomposition

V_\mathbb{C}=V^{-1,0}\oplus V^{0,-1}

where

V^{-1,0}=\left\{w\in V\otimes_\mathbb{R}\mathbb{C}:J(w)=iw\right\}

and V^{0,-1} is the subspace on which J acts by -i. Here J:V\to V is the map defining the complex structure (i.e. the ‘multiplication by i‘ map), extended to V\otimes_\mathbb{R}\mathbb{C} by J(v\otimes z)=J(v)\otimes z.
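
For concreteness, here is the smallest instance of this decomposition, written out as a sketch. The choice V=\mathbb{R}^2 with basis e_1,e_2 and the vectors w,\overline{w} are ours, chosen purely for illustration.

```latex
% V = \mathbb{R}^2 with basis e_1, e_2 and complex structure J(e_1) = e_2, J(e_2) = -e_1.
% J is extended to V \otimes_\mathbb{R} \mathbb{C} by J(v \otimes z) = J(v) \otimes z.
\begin{align*}
w &:= e_1\otimes 1-e_2\otimes i, & J(w) &= e_2\otimes 1+e_1\otimes i = i\,w, \\
\overline{w} &= e_1\otimes 1+e_2\otimes i, & J(\overline{w}) &= e_2\otimes 1-e_1\otimes i = -i\,\overline{w}.
\end{align*}
```

So, in this example, V^{-1,0}=\mathbb{C}w and V^{0,-1}=\mathbb{C}\overline{w}, and conjugation visibly swaps the two lines, as the definition of a Hodge structure requires.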

Extended remark: Since this is not supposed to be an exposition on Hodge structures, I don’t want to go too in-depth into the explanation of the previous paragraph. That said, I think it’s worth mentioning indentically (the indent version of parenthetically) roughly why this is true.

Recall that to give a grading of a vector space V is the same thing as giving a map of algebraic groups \mathbf{G}_{m,k}\to\text{GL}(V). So, not shockingly, giving a bigrading (i.e. a \mathbb{Z}^2-grading) amounts to giving a map of algebraic groups \mathbf{G}_{m,k}^2\to\text{GL}(V).

So, note that since an \mathbb{R}-Hodge structure comes with a bigrading of V_\mathbb{C} we get a map of algebraic groups \mathbf{G}_{m,\mathbb{C}}^2\to\text{GL}(V_\mathbb{C}). That said, it’s not just any bigrading. We need the fact that conjugation switches the indices. Thus, we’re not only specifying a representation of \mathbf{G}_{m,\mathbb{C}}^2 on V_\mathbb{C} but how conjugation acts. This is, not shockingly, some sort of descent data down to a representation over \mathbb{R}. But a representation of what?

It should be a group G/\mathbb{R} such that G_{\mathbb{C}}=\mathbf{G}_{m,\mathbb{C}}^2 and such that to give a representation G\to\text{GL}(V) amounts to giving a representation \mathbf{G}_{m,\mathbb{C}}^2\to\text{GL}(V_\mathbb{C}) such that conjugation amounts to ‘switching the factors’ in \mathbf{G}_{m,\mathbb{C}}^2 (which amounts to switching the indices in the bigrading). Thus, G/\mathbb{R} should be an \mathbb{R}-torus whose character lattice X^\ast(G) is \mathbb{Z}^2 and such that \text{Gal}(\mathbb{C}/\mathbb{R}) acts by having conjugation send (a,b) to (b,a). An explicit description of this group is as the Weil restriction \text{Res}_{\mathbb{C}/\mathbb{R}}\mathbf{G}_{m,\mathbb{C}} which is oftentimes called the Deligne torus and denoted \mathbb{S}.

Thus, summing this all up, we see that to give an \mathbb{R}-space V the data of an \mathbb{R}-Hodge structure amounts to giving a homomorphism \mathbb{S}\to\text{GL}(V). In the language of Tannakian categories this amounts to the statement that the neutral Tannakian category \mathsf{Hdg}_\mathbb{R} of \mathbb{R}-Hodge structures (with the obvious fiber functor) has fundamental group \mathbb{S}.

We often times normalize the above so that V^{p,q} is the space where \mathbb{S}_\mathbb{C} acts via the character z^{-p}\overline{z}^{-q}. This somewhat strange indexing was standardized by Deligne, and is useful when one is thinking cohomologically as opposed to homologically.

What does this have to do with complex structures? Well, the \mathbb{R}-points of the Deligne torus are, as a real Lie group, \mathbb{C}^\times. Thus, one can conflate Hodge structures, which are algebraic homomorphisms \mathbb{S}\to\text{GL}(V), with Lie group homomorphisms \mathbb{C}^\times\to\text{GL}(V) (where, now, the right hand side is thought of as a real Lie group). But, note that to give a complex structure on V is to give a map of \mathbb{R}-algebras \mathbb{C}\to \text{End}(V) (which is automatically smooth because it’s linear). Thus, from a complex structure \mathbb{C}\to\text{End}(V) we obtain a Hodge structure \mathbb{C}^\times\to\text{GL}(V) by restricting to the map on units. One can check that its essential image in \mathbb{R}-Hodge structures is precisely those of type \{(-1,0),(0,-1)\}.

Thus, we can exactly describe the image of \text{Hom}(X_\Lambda,X_{\Lambda'}) in \text{Hom}(\Lambda,\Lambda') in terms of these \mathbb{Z}-Hodge structures. Namely, the image is precisely those maps \Lambda\to\Lambda' which are also maps of \mathbb{Z}-Hodge structures of type \{(-1,0),(0,-1)\} (i.e. those maps such that the induced map \Lambda_\mathbb{C} to \Lambda'_\mathbb{C} sends the (-1,0) part of \Lambda_\mathbb{C} to the (-1,0) part of \Lambda'_\mathbb{C} and the same for (0,-1)). Thus, we obtain a fully faithful embedding

\left\{\begin{matrix}\text{Complex tori with}\\ \text{holomorphic group maps}\end{matrix}\right\}\hookrightarrow \left\{\begin{matrix}\mathbb{Z}\text{-Hodge structures}\\ \text{of type }\{(-1,0),(0,-1)\}\end{matrix}\right\}

but, in fact, this map is an equivalence.

It remains to see why this functor is essentially surjective. So, suppose that \Lambda is a \mathbb{Z}-Hodge structure of type \{(-1,0),(0,-1)\}. Then, \Lambda_\mathbb{R} is (because of the Hodge structure) a complex vector space and we have a canonical embedding \Lambda\hookrightarrow \Lambda_\mathbb{R} realizing \Lambda as a (full) lattice in the complex vector space \Lambda_\mathbb{R}. Thus, the quotient \Lambda_\mathbb{R}/\Lambda has the structure of a complex torus whose associated \mathbb{Z}-Hodge structure of type \{(-1,0),(0,-1)\} is \Lambda.

Thus, we have, in essence, totally understood complex tori in terms of pure linear algebra data—they’re \mathbb{Z}-Hodge structures of type \{(-1,0),(0,-1)\}. So, in essence, we’ve answered the first and third bulleted questions. Again, as predicted, these were answered purely by mental gymnastics of a technical sort—nothing deep happened above. But, we have one question left to be able to discuss. Namely, composing the two embeddings we’ve discussed in this section so far we obtain an embedding

\mathsf{AbVar}(\mathbb{C})\hookrightarrow\left\{\begin{matrix}\mathbb{Z}\text{-Hodge structures}\\ \text{of type }\{(-1,0),(0,-1)\}\end{matrix}\right\}

but, to complete our linear algebra saga, we’d like to describe the essential image in a purely linear algebraic way. This is the deep part.

The idea is that a torus X_\Lambda is algebraic if and only if \Lambda possesses a so-called Riemann form. The idea, roughly, is that if one treats such a Riemann form as a line bundle, through the Appell-Humbert theorem, then it’s ample, and thus X_\Lambda is projective, so algebraic. This never comes up for elliptic curves since all one-dimensional complex tori are algebraic (in fact, as is well-known, all compact Riemann surfaces are algebraic).

The upshot of this all is that a Riemann form is a purely linear algebraic object on \Lambda. In fact, in terms of Hodge structures it’s a so-called polarization. Thus, using this we can make the following equivalence of categories

\mathsf{AbVar}(\mathbb{C})\xrightarrow{\approx}\left\{\begin{matrix}\text{Polarizable }\mathbb{Z}\text{-Hodge structures}\\ \text{of type }\{(-1,0),(0,-1)\}\end{matrix}\right\}

which is, sometimes, attributed to Riemann and called Riemann’s theorem (although that phrase is quite overburdened as is).

Let us hammer in the final point as follows. Even though we introduced Hodge structures in terms of bigradings there is another, perhaps more natural, way of thinking about them. Namely, let us say that an R-Hodge structure M is pure of weight n if for all (p,q) in the type of M we have that p+q=n. Then, associated to such a pure Hodge structure is a filtration \mathrm{Fil}^\bullet M_\mathbb{C} called, imaginatively, the Hodge filtration. It’s defined as follows:

\displaystyle \text{Fil}^i M_\mathbb{C}:= \bigoplus_{p\geqslant i}V^{p,q}

where V^{p,q} are the constituents of the decomposition of M_\mathbb{C}. One can show that keeping track of the bigrading is the same as keeping track of this filtration. So, at least for pure Hodge structures, their study amounts to studying a filtration (see, again, this for more details). The reason why the filtration is more natural than the decomposition is that the filtration is really the thing given to us algebraically (ibid.) and, in cases more general than varieties over \mathbb{C}, it’s all we have (i.e. the filtration doesn’t have a natural splitting).
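
To see what this definition gives in the case relevant to complex tori, here is the filtration written out for a Hodge structure M of type \{(-1,0),(0,-1)\}, which is pure of weight -1. This is only a sketch unwinding the definition. The recovery formula in the last display is the standard one for pure Hodge structures and is stated here without proof.

```latex
% Hodge filtration of M of type {(-1,0),(0,-1)}, so M_\mathbb{C} = V^{-1,0} \oplus V^{0,-1}.
\begin{align*}
\mathrm{Fil}^{i}M_\mathbb{C} &= V^{-1,0}\oplus V^{0,-1}=M_\mathbb{C} && (i\leqslant -1) \\
\mathrm{Fil}^{0}M_\mathbb{C} &= V^{0,-1} \\
\mathrm{Fil}^{i}M_\mathbb{C} &= 0 && (i\geqslant 1)
\end{align*}
% Recovering the bigrading from the filtration for M pure of weight n:
\[
V^{p,q}=\mathrm{Fil}^{p}M_\mathbb{C}\cap\overline{\mathrm{Fil}^{q}M_\mathbb{C}}
\qquad (p+q=n)
\]
```

For instance, V^{-1,0}=\mathrm{Fil}^{-1}M_\mathbb{C}\cap\overline{\mathrm{Fil}^{0}M_\mathbb{C}}=\overline{V^{0,-1}}. So, in this type, the entire Hodge structure is encoded by the single subspace \mathrm{Fil}^0 M_\mathbb{C}\subseteq M_\mathbb{C}.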

All of this, all of the above material, has been to make the following slogan semi-precise:

“Abelian varieties over \mathbb{C}=topology+linear algebra of filtrations.”

Of course, in all of this, abelian varieties have been somewhat of a red herring. Namely, as obnoxious/scary as this might sound (depending on your predilection), the above could be generalized to something like

“Motives over \mathbb{C}=topology+linear algebra of filtrations.”

and if one is so bold as to assume that a category of motives exists, and is sufficiently rich, then one might even strengthen the above to the grandiose statement:

“Algebraic geometry over \mathbb{C}=topology+linear algebra of filtrations.”

In our humble case of abelian varieties this comes down to the statement that to understand an abelian variety A/\mathbb{C} is to understand the pair (H_1(A^\text{an},\mathbb{Z}),\text{Fil}^\bullet H_1(A^\text{an},\mathbb{Z})_\mathbb{C}).

The situation over p-adic fields

Now, while all of the above may have been a slog (depending whether or not one has seen it all before) it really is a powerful tool. This is exhibited by the fact that the study of abelian varieties over \mathbb{C} is considerably, considerably simpler than that over, say, \mathbb{Q}—it’s just topology and linear algebra. This is what makes the study of elliptic curves (and more generally abelian varieties) accessible to undergraduates with no real knowledge of algebraic geometry. It’s also what makes ‘reduction to the case over \mathbb{C}‘ such a common phrase in the study of elliptic curves (or abelian varieties).

Unfortunately, its application to abelian varieties over other fields, while powerful (especially in characteristic 0), has serious limitations. Indeed, even though one can reduce many questions about elliptic curves (or abelian varieties) to the case over \mathbb{C} many, many cannot. In particular, the fact that the homomorphisms can be described in terms of linear algebra data is not adaptable to working over, say, \mathbb{Q}. That said, one can make the appropriate generalization to working over p-adic fields like \mathbb{Q}_p. Not shockingly though, this is at the cost of much, much heavier machinery.

In fact, whereas the above discussion about abelian varieties over \mathbb{C} was subsumed (implicitly) in the study of Hodge theory, the analogue over p-adic fields is subsumed in the notions of p-adic Hodge theory (see this post here for a very bad motivation for what this might be, from a geometric perspective). We will not expound upon this point at length (beyond the comments at the end of the post) but remark that the geometric version of p-adic Hodge theory can be understood broadly as a desire to understand the geometry of a variety X/\mathbb{Q}_p in terms of its p-adic cohomology, very similar in spirit to the previous section.

So, with this in mind, what we’d like to do is to give some justification to the claim that

“Algebraic geometry over K=algebraic geometry over k+linear algebra of filtrations”

where K is a p-adic local field and k its (finite) residue field. The reason that we need to work with finite fields as opposed to working with topology is somewhat complicated. One explanation is that it’s really finite fields for which things work ‘topologically’: explicitly, in our case, the Tate conjecture (an analogue of the Hodge conjecture, see this post) holds for abelian varieties over finite fields. This is also not too much of an issue considering the fact that algebraic geometry over a finite field is considerably easier than algebraic geometry over a p-adic field.

We should also not expect to get anything close to the exact results we had over \mathbb{C}. Roughly the issue is that for a variety X/K, for K some arbitrary characteristic 0 field, we have no good notion of an integral cohomology theory—we can not functorially define an analogue of the lattice \Lambda. The issue is that algebra (in particular algebraic geometry) cannot access infinite topological covers, but can access finite ones. The classic example is that if we consider the algebraic curve \mathbf{G}_m over \mathbb{C} then the topological covers of its analytification are either the finite covers

\mathbb{C}^\times\xrightarrow{z\mapsto z^n}\mathbb{C}^\times

or the universal cover

\mathbb{C}\xrightarrow{z\mapsto e^z}\mathbb{C}^\times

and only the former are algebraic.

But, we are able to recover not just finite covers but, in some sense, limits of finite covers. In other words, we may not be able to recover the lattice \Lambda, but we can recover its profinite completion or its various \ell-adic constituents. Explicitly, while

H_1(A^\text{an},\mathbb{Z})=\Lambda

cannot be algebraically defined one can define

H_1(A^\text{an},\mathbb{Z}/\ell^n\mathbb{Z})=\Lambda/\ell^n\Lambda

and, in particular the limit

\varprojlim H_1(A^\text{an},\mathbb{Z}/\ell^n\mathbb{Z})=\Lambda\otimes_\mathbb{Z}\mathbb{Z}_\ell

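Indeed, this follows from the comparison theorem of Artin that says that

H^1_\text{sing}(A^\text{an},\mathbb{Z}/\ell^n\mathbb{Z})\cong H^1_{\mathrm{\acute{e}t}}(A,\mathbb{Z}/\ell^n\mathbb{Z})

and thus

\varprojlim H_1(A^\text{an},\mathbb{Z}/\ell^n\mathbb{Z})\cong H^1_{\mathrm{\acute{e}t}}(A,\mathbb{Z}_\ell)^\vee

and this righthand object is defined for A/K where K is any field (replacing A by A_{\overline{K}}). In fact, it’s just T_\ell A, the \ell-adic Tate module of A.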

Thus, to summarize, we can’t hope to get the exact analogue of the results in the previous section since we can’t access \Lambda algebraically. But, we can access \Lambda\otimes_\mathbb{Z}\mathbb{Z}_\ell algebraically in the form of the \ell-adic Tate module T_\ell A. Thus, we’ll expect to be able to not say something super precise about the \mathbb{Z}-module \text{Hom}(A,A') but, instead, something about the \mathbb{Z}_\ell-module \text{Hom}(A,A')\otimes_\mathbb{Z}\mathbb{Z}_\ell.

But, this is still not right. As we explained at the beginning of this post, the \ell-adic Tate module T_\ell A of an abelian variety A/K where K is a p-adic field (where, of course, \ell\ne p) doesn’t contain as much information as it should. Thus, we don’t expect this to actually be our key player here, instead we expect T_p A to be the object of optimal importance. This is, essentially, right. But, instead of T_p A directly, we’re instead going to be interested in a sort of manifestation of it, something like a de Rham cohomology. This is not shocking either considering the important role that the Hodge filtration played in the previous section and the fact that the Hodge decomposition/Hodge filtration is really a property of de Rham cohomology (the singular cohomology being brought in only because of its integral properties).

This is where things start to get fairly technical. As we said above, we’re going to try and recover morphisms between abelian varieties over K (a p-adic field) using algebraic geometry over k and some sort of filtration. Not shockingly, this is going to be a ‘Hodge like filtration’ (and exactly a Hodge filtration when viewed correctly). Unfortunately, de Rham cohomology doesn’t act well in characteristic p. This is for the exact same reason that Lie algebras don’t work in characteristic p: d(x^p)=0. Thus, what is really going to show up in our analogue of the previous section for abelian varieties over K is the ‘correct technical analogue’ of de Rham cohomology in characteristic p: crystalline cohomology.

We shall explain, very briefly, what crystalline cohomology is but, first, let us state (to keep us motivated) the result which will be our analogue of the previous section:

Theorem 13: Let A and A' be abelian varieties over K with good reduction, say with models \mathscr{A} and \mathscr{A}' over \mathcal{O}_K. Then, there is an injection

\text{Hom}(A,A')\hookrightarrow \text{Hom}_{F,V}(H^1_\text{crys}(\mathscr{A}'_k/W(k)),H^1_\text{crys}(\mathscr{A}_k/W(k)))

whose image consists of those morphisms on cohomology which come from a morphism \mathscr{A}_k\to\mathscr{A}'_k and which preserve the Hodge filtration.

This then gives an injection

\text{Hom}(A,A')\otimes_\mathbb{Z}\mathbb{Z}_p\hookrightarrow \text{Hom}_{F,V,\text{Fil}}(H^1_\text{crys}(\mathscr{A}'_k/W(k)),H^1_\text{crys}(\mathscr{A}_k/W(k)))

where the subscript ‘\text{Fil}‘ means preserves the Hodge filtration.

The proof of this result requires a non-trivial amount of machinery, but sort of its main two components are the extended Dieudonné theory developed by Grothendieck and Messing and, apropos for our post, Serre-Tate theory.

Before we define, in a cheating way, crystalline cohomology, let us just remark what this Dieudonné theory idea is that came up in the last paragraph. As this post began, one of the main ideas we were trying to remedy by discussing p-divisible groups was the non-faithfulness of taking the Tate module of a p-divisible group in characteristic p (unless that p-divisible group was étale). So, we’d hope that even if our p-divisible group is not étale that there is some sort of linear algebra object we can attach to it which serves as a better substitute. This is, in essence, what Dieudonné theory does: it provides a linear algebra object which controls the p-divisible group.

So, without further ado, let us give a ‘bad’ definition of crystalline cohomology which works in our case. It’s ‘bad’ because, while it works, it is missing a huge amount of the story and doesn’t generalize well to other situations. Namely, let’s suppose that A/K is an abelian variety with good reduction as above, and that it has model \mathscr{A}/\mathcal{O}_K. Then, let us define the crystalline cohomology of the special fiber \mathscr{A}_k as follows:

H^i_\text{crys}(\mathscr{A}_k/W(k)):= H^i_\text{dR}(\mathscr{A}/W(k))

where, here, H^i_\text{dR}(\mathscr{A}/W(k)) is the algebraic de Rham cohomology of \mathscr{A} defined as the hypercohomology of the de Rham complex \Omega^\bullet_{\mathscr{A}/W(k)} (for more information in the case when the base is a field see this post).

Then, from this, it’s obvious that the crystalline cohomology (in this case) comes with a filtration since all de Rham cohomology does (ibid.). What’s less obvious is that H^i_\text{crys}(\mathscr{A}_k/W(k)) comes equipped with two operators F and V called Frobenius and Verschiebung respectively. They are, respectively, \sigma-linear and \sigma^{-1}-linear operators on the free W(k)-module H^i_\text{crys}(\mathscr{A}_k/W(k)) where \sigma:W(k)\to W(k) is the lift of the Frobenius. This is an example where our definition of crystalline cohomology is poorly chosen: it’s not easy to define these F and V in this way.

The relationship between crystalline cohomology and Dieudonné theory is a famous result of Mazur-Messing-Oda (originally due to Oda, but later improved upon by Mazur-Messing) that

H^1_\text{crys}(\mathscr{A}_k/W(k))\cong D(\mathscr{A}_k[p^\infty])

which, in words, says that the first crystalline cohomology group of the special fiber agrees with the Dieudonné module of the p-divisible group of the special fiber. Then, it’s believable that there are such F and V operators since they’re induced, by functoriality, from the Frobenius and Verschiebung operations of the p-divisible group \mathscr{A}_k[p^\infty]. What’s less obvious from this description (i.e. if one thinks about crystalline cohomology as a Dieudonné module) is the fact that the first crystalline cohomology has a filtration. This, in the Dieudonné sense, comes from the work of Messing.

Let us then proceed with the proof of Theorem 13 which will make full use of all of the theory described above, as well as Serre-Tate theory:

Remark: In the following proof all maps of Dieudonné modules/crystalline cohomology will be assumed to preserve Frobenius and Verschiebung.

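Proof(Theorem 13): Let us begin by noticing that since \mathscr{A} and \mathscr{A}' are the Néron models of A and A' the equality

\text{Hom}(A,A')=\text{Hom}_{\mathcal{O}_K}(\mathscr{A},\mathscr{A}')

holds. But, by Formal GAGA we have the equality

\text{Hom}_{\mathcal{O}_K}(\mathscr{A},\mathscr{A}')=\text{Hom}_{\mathsf{Spf}(\mathcal{O}_K)}(\widehat{\mathscr{A}},\widehat{\mathscr{A}'})

where, here, the hat denotes the completion along the special fiber. But, then we know that

\text{Hom}_{\mathsf{Spf}(\mathcal{O}_K)}(\widehat{\mathscr{A}},\widehat{\mathscr{A}'})=\varprojlim \text{Hom}(\mathscr{A}_n,\mathscr{A}'_n)

where

\mathscr{A}_n:=\mathscr{A}\times_{\mathrm{Spec}(\mathcal{O}_K)}\mathrm{Spec}(\mathcal{O}_K/\mathfrak{p}_K^n)

(and similarly for \mathscr{A}'_n) where, of course, \mathfrak{p}_K is the maximal ideal of \mathcal{O}_K.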

Now is when Serre-Tate theory comes into play. Namely, it tells us that

\text{Hom}(\mathscr{A}_n,\mathscr{A}'_n)=\left\{(f,g):\begin{aligned}&1)\quad f:\mathscr{A}_n[p^\infty]\to\mathscr{A}'_n[p^\infty]\\ &2)\quad g:\mathscr{A}_{k}\to\mathscr{A}'_{k}\end{aligned}\text{  such that }g[p^\infty]=f\mod \mathfrak{p}_K\right\}

But, by Messing’s Dieudonné theory, this data is equivalent to a map g:\mathscr{A}_{k}\to \mathscr{A}'_{k} such that the induced map D(\mathscr{A}'_{k}[p^\infty])\to D(\mathscr{A}_{k}[p^\infty]) respects the filtrations on these W(k)-modules, tensored with W(k)/p^n W(k), which come from the deformations \mathscr{A}_n and \mathscr{A}'_n of \mathscr{A}_{k} and \mathscr{A}'_{k} respectively. Upon passing to the limit, and applying Mazur-Messing-Oda, we arrive at the fact that \text{Hom}(A,A') is

\left\{g\in\text{Hom}(\mathscr{A}_{k},\mathscr{A}'_{k}):g_\text{crys}:H^1_\text{crys}(\mathscr{A}'_{k}/W(k))\to H^1_\text{crys}(\mathscr{A}_k/W(k))\text{ preserves filtration}\right\}

which is precisely the first statement.
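
Schematically, the argument so far is the following chain of identifications. This is only a summary sketch, and it reads \mathscr{A}_n as the reduction of \mathscr{A} modulo \mathfrak{p}_K^n, which is my interpretation of the notation:

```latex
\[
\begin{aligned}
\text{Hom}(A,A')
&= \text{Hom}(\mathscr{A},\mathscr{A}') && \text{(N\'eron mapping property)}\\
&= \text{Hom}_{\mathsf{Spf}(\mathcal{O}_K)}(\widehat{\mathscr{A}},\widehat{\mathscr{A}'}) && \text{(formal GAGA)}\\
&= \varprojlim \text{Hom}(\mathscr{A}_n,\mathscr{A}'_n) && \text{(maps of formal schemes)}\\
&= \left\{g\in\text{Hom}(\mathscr{A}_k,\mathscr{A}'_k) : g_\text{crys}\text{ preserves filtration}\right\} && \text{(Serre-Tate, Messing, Mazur-Messing-Oda)}
\end{aligned}
\]
```

Only the last step uses anything specific to p-divisible groups; the first three are formal properties of Néron models and formal schemes.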

So, from this we obtain an injection

\text{Hom}(A,A')\otimes_\mathbb{Z}\mathbb{Z}_p \hookrightarrow \text{Hom}(\mathscr{A}_k,\mathscr{A}'_k)\otimes_\mathbb{Z}\mathbb{Z}_p

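and from Tate’s theorem for \ell=p we know that

\text{Hom}(\mathscr{A}_k,\mathscr{A}'_k)\otimes_\mathbb{Z}\mathbb{Z}_p\cong \text{Hom}(H^1_\text{crys}(\mathscr{A}'_k/W(k)),H^1_\text{crys}(\mathscr{A}_k/W(k)))

and thus we get an injection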

\text{Hom}(A,A')\otimes_\mathbb{Z}\mathbb{Z}_p\hookrightarrow \text{Hom}(H^1_\text{crys}(\mathscr{A}'_k/W(k)),H^1_\text{crys}(\mathscr{A}_k/W(k)))

and the image is certainly contained within the maps that preserve the filtration. \blacksquare

This should be viewed as a p-adic analogue of the previous section. Namely, we see that we obtain a fully faithful embedding of abelian varieties over K with good reduction into filtered Dieudonné modules, given by sending A to the crystalline cohomology of its special fiber with the filtration endowed by the deformations \mathscr{A}_n.

Let us note that, unfortunately, the above theorem is as strong as possible. Namely, motivated by the complex theory one might hope that the map

\text{Hom}(A,A')\otimes_\mathbb{Z}\mathbb{Z}_p\to \text{Hom}_{F,V,\text{Fil}}(H^1_\text{crys}(\mathscr{A}'_k/W(k)),H^1_\text{crys}(\mathscr{A}_k/W(k)))

(where, again, the righthand side means preserving Frobenius, Verschiebung, and the Hodge filtration) is an isomorphism—that every filtration preserving map is a \mathbb{Z}_p-linear combination of actual maps A\to A'. Again, unfortunately, this is not the case.

To make this seem more believable we can apply the powerful machine that is p-adic Hodge theory. Namely, we use the following two non-trivial theorems from the subject, the meaning of which I will not explain:

  1. For A/\mathbb{Q}_p an abelian variety with good reduction, there is an isomorphism


    of \mathbb{Z}_p-modules with Frobenius, Verschiebung, and filtration.

  2. The functor D_\text{crys} is fully faithful.

Using these two results one can see that our claimed isomorphism would, in particular, imply that for all abelian varieties A,A'/\mathbb{Q}_p with good reduction the map

\text{Hom}(A,A')\otimes_{\mathbb{Z}}\mathbb{Q}_p\to \text{Hom}_{\mathbb{Q}_p[G_{\mathbb{Q}_p}]}(V_p A,V_p A')

is an isomorphism. This is the p-adic version of Tate’s isogeny theorem and, unfortunately, is false. See this Mathoverflow post for some nice discussion.
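
To spell out the deduction, here is a sketch of the chain of identifications. I am assuming a particular normalization of the first result, namely D_\text{crys}(V_pA^\vee)\cong H^1_\text{crys}(\mathscr{A}_k/\mathbb{Z}_p)\otimes_{\mathbb{Z}_p}\mathbb{Q}_p, where V_pA^\vee is the dual of V_pA (i.e. the first p-adic étale cohomology of A). This normalization is my choice and may differ from the one intended above:

```latex
\[
\begin{aligned}
\text{Hom}(A,A')\otimes_{\mathbb{Z}}\mathbb{Q}_p
&\cong \text{Hom}_{F,V,\text{Fil}}(H^1_\text{crys}(\mathscr{A}'_k/\mathbb{Z}_p),H^1_\text{crys}(\mathscr{A}_k/\mathbb{Z}_p))\otimes_{\mathbb{Z}_p}\mathbb{Q}_p && \text{(claimed isomorphism)}\\
&\cong \text{Hom}(D_\text{crys}(V_pA'^\vee),D_\text{crys}(V_pA^\vee)) && \text{(result 1)}\\
&\cong \text{Hom}_{\mathbb{Q}_p[G_{\mathbb{Q}_p}]}(V_pA'^\vee,V_pA^\vee) && \text{(result 2)}\\
&\cong \text{Hom}_{\mathbb{Q}_p[G_{\mathbb{Q}_p}]}(V_pA,V_pA') && \text{(duality)}
\end{aligned}
\]
```

Here the Hom on the second line is taken in the category of filtered modules with Frobenius, which is the target of D_\text{crys}.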

Now, while this is slightly disappointing, it’s somewhat to be expected. We don’t expect what happens over \mathbb{C} to be perfectly mimicked over something like \mathbb{Q}_p. Instead, we content ourselves with an approximate statement that at least captures our morphisms of abelian varieties (in the good reduction case). Another interesting thing to notice is that, unlike in the \ell\ne p case, it is actually possible that Tate’s isogeny theorem, and thus our desired equality in Theorem 13, do in fact hold. This shows, again, the relative power that can be held in the case of \ell=p versus \ell\ne p, as alluded to at the beginning of this post.

That said, if one is willing to put the most naive (well, as naive as the above was) guess aside, there is an almost perfect analogue of what happens over \mathbb{C} in the p-adic setting. Namely, it makes sense that if we’re going to be thinking p-adically, perhaps what is captured by linear algebra data is not morphisms of abelian varieties but, instead, morphisms of their associated p-divisible groups since, after all, this is somehow the ‘interesting part’ of an abelian variety over a p-adic field.

Then, in this light, there is an almost perfect analogue of Riemann’s classification of abelian varieties over \mathbb{C} due to (independently I believe) Kisin and Scholze-Weinstein:

Theorem: Let C be an algebraically closed, complete non-archimedean field containing \mathbb{Q}_p. Then, there is an equivalence of categories

\mathsf{BT}_p(\mathcal{O}_C)\xrightarrow{\approx}\left\{(\Lambda,W):\begin{matrix}(1)&\qquad \Lambda\text{ a free f.g. }\mathbb{Z}_p\text{-module}\\(2)&\qquad W\subseteq \Lambda\otimes_{\mathbb{Z}_p}C\text{ a }C\text{-subspace}\end{matrix}\right\}

sending G to (T(G),\text{Lie}(G)\otimes C).

This provides an almost uncannily exact analogue of Riemann’s result.
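
As a sanity check, here is how I’d expect the functor to behave on the two simplest p-divisible groups. This is a sketch using only the standard facts that T(G) has rank equal to the height of G and that \text{Lie}(G) has dimension equal to the dimension of G, and it ignores Tate twists just as the statement above does:

```latex
\[
\begin{aligned}
G=\mathbb{Q}_p/\mathbb{Z}_p &\longmapsto (\Lambda,W)=(\mathbb{Z}_p,\;0) && \text{(height 1, dimension 0)}\\
G=\mu_{p^\infty} &\longmapsto (\Lambda,W)=(T(\mu_{p^\infty}),\;\Lambda\otimes_{\mathbb{Z}_p}C) && \text{(height 1, dimension 1)}
\end{aligned}
\]
```

In general W is a subspace of dimension \dim G inside a C-vector space of dimension equal to the height of G, so for height 1 the only options are the zero subspace and the whole line.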


