Problem. Let F be a finite field, and let F[x] be the ring of polynomials over F. Let f(x) \in F[x]. Show that there exist integers n, m \ge 0 such that f(x) \mid x^m(x^n-1).

Solution. If f is the zero polynomial, then we choose n=m=0. So we may assume that f \ne 0 and write f(x)=x^mg(x), where m \ge 0 is an integer and g(x) \in F[x] with g(0) \ne 0. So we just need to show that g(x) \mid x^n-1 for some positive integer n. There is nothing to prove if g(x) is a constant because then we can choose n whatever we like. So we may assume that \deg g(x) \ge 1. Let \text{char}(F)=p, the characteristic of F, and let \mathbb{F}_{p^r} be the finite field with p^r elements. Then the algebraic closure of F is easily seen to be \overline{F}=\bigcup_{k=1}^{\infty}\mathbb{F}_{p^{k!}} (see here!). Now, we can write

g(x)=c(x-a_1) \cdots (x-a_d),

where 0 \ne c \in F and a_1, \cdots ,a_d \in \overline{F} \setminus \{0\}. Since every non-zero element of a finite field is a root of unity, every non-zero element of \overline{F} is also a root of unity. Thus a_i is a root of unity for all i, and hence there exists a positive integer N such that a_i^N=1 for all i. So each a_i is a root of the polynomial x^N-1 and hence x-a_i \mid x^N-1 for all i. Thus g(x) \mid (x^N-1)^d, and so choosing an integer \ell large enough so that p^{\ell} \ge d, we get

g(x) \mid (x^N-1)^{p^{\ell}}=x^{Np^{\ell}}-1. \ \Box

See the first part of this post here. We are now ready to formally define a semidirect product of groups. As always, \text{Aut}(G) is the group of automorphisms of a group G.

Proposition 1. Let G_1,G_2 be groups, and let \theta: G_2 \to \text{Aut}(G_1) be a group homomorphism. For simplicity, let \theta_{g_2}(g_1):=\theta(g_2)(g_1) for all g_1 \in G_1, \ g_2 \in G_2. Define a multiplication on the set G_1 \times G_2 by

(g_1,g_2)(g_1',g_2')=(g_1\theta_{g_2}(g_1'), g_2g_2'), \ \ \ \ \forall g_1,g_1' \in G_1, \ g_2,g_2' \in G_2.

Then G_1 \times G_2 with the above multiplication is a group called the semidirect product of G_1 by G_2 with respect to \theta, and denoted by G_1 \rtimes_{\theta}G_2.

Proof. Clearly G_1 \times G_2 is closed under the multiplication. We now prove the multiplication is associative. Let g_1,g_1',g_1'' \in G_1, \ g_2,g_2',g_2'' \in G_2. Then

(g_1,g_2)((g_1',g_2')(g_1'',g_2''))=(g_1,g_2)(g_1'\theta_{g_2'}(g_1''),g_2'g_2'')=(g_1\theta_{g_2}(g_1'\theta_{g_2'}(g_1'')),g_2g_2'g_2'')

=(g_1\theta_{g_2}(g_1')\theta_{g_2}\theta_{g_2'}(g_1''),g_2g_2'g_2'')=(g_1\theta_{g_2}(g_1')\theta_{g_2g_2'}(g_1''),g_2g_2'g_2'')

=(g_1\theta_2(g_1'),g_2g_2')(g_1'',g_2'')=((g_1,g_2)(g_1',g_2'))(g_1'',g_2''),

which proves associativity. Next, is to show that (1,1) is the identity element. Let g_1 \in G_1,g_2 \in G_2. Since \theta_{g_2}=\theta(g_2) \in \text{Aut}(G_1), we have \theta_{g_2}(1)=1 and \theta_1=\theta(1)=1_{\text{Aut}(G_1)}=\text{id}_{G_1}, hence \theta_1(g_1)=g_1. Thus (g_1,g_2)(1,1)=(g_1\theta_{g_2}(1),g_2)=(g_1,g_2) and (1,1)(g_1,g_2)=(\theta_1(g_1),g_2)=(g_1,g_2), which proves that (1,1) is the identity element.
Finally, let g_1 \in G_1,g_2 \in G_2. We now find the inverse of (g_1,g_2). So we want to find (x,y) \in G_1 \times G_2 such that

(1,1)=(x,y)(g_1,g_2)=(x\theta_y(g_1),yg_2), \ \ \ \ (1,1)=(g_1,g_2)(x,y)=(g_1\theta_{g_2}(x),g_2y).

Now, the first identity gives y=g_2^{-1} and x=\theta_y(g_1^{-1}), which also satisfy the second identity because

g_1\theta_{g_2}(x)=g_1\theta_{g_2}\theta_y(g_1^{-1})= g_1\theta_{g_2}\theta_{g_2^{-1}}(g_1^{-1})=g_1\theta_1(g_1^{-1})=g_1g_1^{-1}=1.

So we have shown that (g_1,g_2)^{-1}=(\theta_{g_2^{-1}}(g_1^{-1}),g_2^{-1}), and that completes the proof. \Box

Remark. If, in Proposition 1, we choose \theta to be the trivial homomorphism, i.e. if \theta_{g_2}: G_1 \to G_1 is the identity map for all g_2 \in G_2, then G_1 \rtimes_\theta G_2 =G_1 \times G_2, the direct product of G_1,G_2, because, in this case, for all g_1,g_1' \in G_1 and g_2,g_2' \in G_2 we have (g_1,g_2)(g_1',g_2')=(g_1\theta_{g_2}(g_1'),g_2g_2')=(g_1g_1',g_2g_2').

Proposition 2. Let G be a group, and let H,K be subgroups of G. Suppose that H is normal in G and H \cap K=(1). Then HK \cong H \rtimes_{\theta} K, where \theta: K \to \text{Aut}(H) is defined by \theta(k)(h)=khk^{-1} for all h \in H,k \in K.

Proof. We pretty much proved that in the first part of this post. First note that since H is normal, \theta is well-defined. Let \theta_k:=\theta(k), \ k \in K. We show that \theta is a group homomorphism. For k_1,k_2 \in K, \ h \in H, we have

\theta(k_1k_2)(h)=\theta_{k_1k_2}(h)=k_1k_2hk_2^{-1}k_1^{-1}=\theta_{k_1}(k_2hk_2^{-1})=\theta_{k_1}\theta_{k_2}(h)

and hence \theta(k_1k_2)=\theta_{k_1}\theta_{k_2}=\theta(k_1)\theta(k_2), which proves that \theta is a group homomorphism. So, by Proposition 1, we have the group H \rtimes_{\theta}K. Now, define

\phi: HK \to H \times K, \ \ \ \phi(hk)=(h,k), \ \ \ \ \forall h \in H,k \in K,

Since H \cap K=(1), the map \phi is well-defined and bijective. So we only need to show that \phi is a group homomorphism. Let g_1:=h_1k_1, \ g_2=h_2k_2, where h_1,h_2 \in H and k_1,k_2 \in K. Then, remembering that k_1h_2k_1^{-1}=\theta_{k_1}(h_2) \in H, we have

\phi(g_1g_2)=\phi(h_1k_1h_2k_2)=\phi(h_1k_1h_2k_1^{-1}k_1k_2)=(h_1k_1h_2k_1^{-1},k_1k_2)= (h_1\theta_{k_1}(h_2),k_1k_2)

=(h_1,k_1)(h_2,k_2) =\phi(g_1)\phi(g_2). \ \Box

Exercise. Let G_1,G_2 be groups, and let G:=G_1 \rtimes_{\theta}G_2. Let

H:=\{(g_1,1): \ g_1 \in G_1\}, \ \ \ \ \ K:=\{(1,g_2): \ g_2 \in G_2\}.

Show that

i) H \cap K=(1), and G=HK,

ii) H,K are subgroups of G,

iii) H is normal in G.
Hint. For iii), let g_1,g_1' \in G_1, g_2 \in G_2. Using the formula we found, in Proposition 1, for the inverse of an element of G, we have

(g_1,g_2)(g_1',1)(g_1,g_2)^{-1}=(g_1,g_2)(g_1',1)(\theta_{g_2^{-1}}(g_1^{-1}),g_2^{-1})=(g_1\theta_{g_2}(g_1'),g_2)(\theta_{g_2^{-1}}(g_1^{-1}),g_2^{-1})

=(g_1\theta_{g_2}(g_1')\theta_{g_2}\theta_{g_2^{-1}}(g_1^{-1}),1)=(g_1\theta_{g_2}(g_1')\theta_1(g_1^{-1}),1)=(g_1\theta_{g_2}(g_1')g_1^{-1},1) \in H.

A semidirect product of groups is a natural generalization of the direct product of groups. But before defining a semidirect product, which we will formally do in the second part of this post, we need to see where it comes from and how it is a generalization of the direct product. The following is the beginning of the story.

Proposition. Let G be a group, and let H,K be two subgroups of G such that H \cap K=(1).

i) For every g \in HK, there exist unique h \in H,k \in K such that g=hk.

ii) If both H,K are normal in G, and h \in H,k \in K, then hk=kh, and hence HK \cong H \times K.

Proof. i) For g \in HK, there exist h \in H, k \in K such that g=hk. To prove the uniqueness, suppose that h_1k_1=h_2k_2, for some h_1,h_2 \in H, k_1,k_2 \in K. Then x:=h_2^{-1}h_1=k_2k_1^{-1} and so x \in H \cap K=(1), which gives h_1=h_2, \ k_1=k_2.

ii) Since both H,K are normal, we have hkh^{-1}k^{-1}=(hkh^{-1})k^{-1}=h(kh^{-1}k^{-1}) \in H \cap K=(1) and so hk=kh. Now, consider the following map

\phi: HK \to H \times K, \ \ \ \phi(hk)=(h,k), \ \ \ \forall h \in H, k \in K.

We claim that \phi is a group isomorphism. First, \phi is well-defined, by i). It is also clear that \ker \phi=(1) and \phi is onto. So we only need to show that \phi is a group homomorphism. Let g_1,g_2 \in G. So we have g_1=h_1k_1 and g_2:=h_2k_2 for some h_1,h_2 \in H, \ k_1,k_2 \in K, and thus

\phi(g_1g_2)=\phi(h_1k_1h_2k_2)=\phi(h_1h_2k_1k_2)=(h_1h_2,k_1k_2)=(h_1,k_1)(h_2,k_2)

=\phi(h_1k_1)\phi(k_1k_2)=\phi(g_1)\phi(g_2). \ \Box

Remark. In the above Proposition, we had two subgroups H,K inside a group G with the properties that H,K are normal in G and H \cap K=(1). Then we showed that HK \cong H \times K. We can also do that externally in the following sense. Given any two groups G_1,G_2, let G:=G_1 \times G_1. It’s easy to see that H:=G_1 \times (1), \ K:=(1) \times G_2 are normal subgroups of G and G=HK, \ H \cap K=(1).

Motivation for Semidirect Product. Let’s go back to the Proposition, but this time, instead of assuming both H,K are normal, we assume that only one of them, say H, is normal in G. Then HK is still a subgroup of G, and we show that we’ll still have a group isomorphism HK \cong H \times K if we define multiplication on the set H \times K differently. So again we have the map

\phi: HK \to H \times K, \ \ \ \phi(hk)=(h,k), \ \ \ \ \forall h \in H,k \in K,

where for now we look at H \times K as a set not the group direct product H,K. Clearly \phi is still well-defined and bijective, because H \cap K=(1). So we need to see how we should define multiplication on H \times K such that \phi becomes multiplicative, and of course H \times K becomes a group under the multiplication. To see that, let g_1,g_2 \in HK. Then g_1:=h_1k_1 and g_2:=h_2k_2, for some h_1,h_2 \in H, \ k_1,k_2 \in K, and so

g_1g_2=h_1k_1h_2k_2=(h_1k_1h_2k_1^{-1})k_1k_2.

Clearly h_1k_1h_2k_1^{-1} \in H, because H is normal. Thus \phi(g_1g_2)=(h_1k_1h_2k_1^{-1},k_1k_2), and so if we want \phi to be multiplicative, i.e.

(h_1k_1h_2k_1^{-1},k_1k_2)=\phi(g_1g_2)=\phi(g_1)\phi(g_2)=(h_1,k_1)(h_2,k_2),

we must define multiplication in H \times K by

(h_1,k_1)(h_2,k_2)=(h_1k_1h_2k_1^{-1},k_1k_2), \ \ \ \ \ \forall h_1,h_2 \in H, k_1,k_2 \in K, \ \ \ \ \ \ \ (1)

which is quite different from the usual multiplication (h_1,k_1)(h_2,k_2)=(h_1h_2,k_1k_2) in the group direct product H \times K. To see (1) better, given k \in K, let \theta_k : H \to H be the (inner) automorphism of H defined by \theta_k(h)=khk^{-1} (don’t forget that H is normal). So we can re-write (1) as follows

(h_1,k_1)(h_2,k_2)=(h_1 \theta_{k_1}(h_2),k_1k_2), \ \ \ \ \ \forall h_1,h_2 \in H, k_1,k_2 \in K. \ \ \ \ \ \ \ \ \ \ (2)

Note that the map \theta: K \to \text{Aut}(H) defined by \theta(k)=\theta_k is a group homomorphism because for all k_1,k_2 \in K, \ h \in H, we have

\theta(k_1k_2)(h)=\theta_{k_1k_2}(h)=k_1k_2hk_2^{-1}k_1^{-1}=\theta_{k_1}(k_2hk_2^{-1})=\theta_{k_1}\theta_{k_2}(h)

and hence \theta(k_1k_2)=\theta_{k_1}\theta_{k_2}=\theta(k_1)\theta(k_2).
So, assuming the set H \times K with the multiplication defined in (2) is a group, a fact that we will prove in the second part of this post, we will call the group a semidirect product of H by K with respect to \theta, and we will denote the group by H \rtimes_{\theta}K. So we have shown that \phi: HK \to H \rtimes_{\theta}K is a group isomorphism and hence HK \cong H \rtimes_{\theta}K.

Notation. Given a finite abelian group G, we denote by p(G) the product of all the elements of G.

Problem 1. Let G_1, \cdots ,G_n be finite abelian groups, and G:=G_1 \times \cdots \times G_n. Show that

p(G)=(p(G_1)^{|G|/|G_1|}, \cdots , p(G_n)^{|G|/|G_n|}).

Solution. That follows immediately from this simple fact that for any fixed i and x_i \in G_i, there exist exactly |G_1| \cdots |G_{i-1}| |G_{i+1}| \cdots |G_n|=|G|/|G_i| elements of G whose i-th coordinate is x_i. \ \Box

Problem 2. Let G=\langle x \rangle be a cyclic group of order 2^n, \ n \ge 1. Show that p(G)=x^{2^{n-1}}, the unique element of order two in G.

Solution. We have

\displaystyle \begin{aligned}p(G)=\prod_{k=1}^{2^n}x^k=x^{\sum_{k=1}^{2^n}k}=x^{2^n(2^n+1)/2}=x^{2^{n-1}(2^n+1)} =x^{2^{2n-1}}x^{2^{n-1}}=x^{2^{n-1}}.\end{aligned} \ \Box

Problem 3. Let G be a finite abelian group. Show that

i) if |G| is odd, then p(G)=1,

ii) if |G| is even and the Sylow 2-subgroup of G is cyclic, then G has a unique element of order 2, say c, and p(G)=c,

iii) if |G| is even and the Sylow 2-subgroup of G is not cyclic, then p(G)=1.

Solution. i) In this case, G has no element of order two and thus g \ne g^{-1} for all 1 \ne g \in G. Therefore we can write \displaystyle G=\{1\} \cup \{g_1, g_1^{-1}\} \cdots \cup \{g_m,g_m^{-1}\} and so we have \displaystyle p(G)=\prod_{i=1}^mg_ig_i^{-1}=1.

Now suppose that |G| is even. So |G|=2^nm, where n \ge 1 and m is odd. By the fundamental theorem for finite abelian groups, there exist a group H of order m, positive integers k,n_1, \cdots , n_k, and cyclic groups G_i=\langle x_i\rangle, \ |G_i|=2^{n_i}, \ 1 \le i \le k, such that

G=G_1 \times \cdots G_k \times H.

Thus, by Problem 1 and i),

p(G)=(p(G_1)^{|G|/|G_1|}, \cdots , p(G_k)^{|G|/|G_k|},1).

Hence, by Problem 2,

p(G)=(x_1^{|G|/2}, \cdots , x_k^{|G|/2}, 1)=(x_1^{2^{n-1}m}, \cdots , x_k^{2^{n-1}m},1),

and so writing m=2\ell+1, we get

p(G)=(x_1^{2^n\ell}x_1^{2^{n-1}}, \cdots , x_k^{2^n\ell}x_k^{2^{n-1}},1)=(x_1^{2^{n-1}}, \cdots ,x_k^{2^{n-1}},1), \ \ \ \ \ \ \ \ \ (*)

because x_i^{2^{n_i}}=1 and n \ge n_i, which gives x_i^{2^n}=1, for all i. We are now ready to solve the second and the third part of the problem.

ii) In this case, k=1, \ n_1=n and G=G_1 \times H. Thus (*) gives p(G)=(x_1^{2^{n-1}},1)=c, the unique element of order two in G.

iii) In this case, k \ge 2 and thus, since \sum_{i=1}^kn_i=n, we have n_i \le n-1 for all i. Thus x_i^{2^{n-1}}=1 for all i, because o(x_i)=2^{n_i} and n_i \le n-1. Hence, by (*), we have p(G)=(1, \cdots ,1)=1_G. \ \Box

See the first part of this post here.

In Definition 2, ii), in the first part of this post, we defined the distance between two elements v_1,v_2 of an inner product space V as the non-negative real number ||v_1-v_2||. So if W is a subset of V and v \in V, then we say that w \in W is nearest to v (or a best approximation of v in W) if ||v-w|| is minimal, i.e. ||v-w|| \le ||v-x|| for all x \in W. The fourth part of the following theorem uses this concept.

Definition. Let (V, \langle \ , \ \rangle) be an inner product space, and let W be a subspace of V. The orthogonal complement of V is W^{\perp}:=\{v \in V: \ \langle v,w\rangle=0, \ \forall w \in W\}. See that W^{\perp} is a subspace of V.

Theorem. Let (V,\langle \ , \ \rangle) be an inner direct space, and let S=\{e_1, \cdots, e_n\} be an orthonormal subset of V. Let W be the space spanned by S.

i) S is linearly independent.

ii) \displaystyle w=\sum_{k=1}^n\langle w,e_k\rangle e_k for all w \in W.

iii) For every v \in V there exist w \in W, w' \in W^{\perp} such that v=w+w'.

iv) Let v \in V. The element of W nearest to v is \displaystyle z:=\sum_{k=1}^n\langle v,e_k\rangle e_k.

Proof. i) Suppose that \displaystyle \sum_{k=1}^nr_ke_k=0 for some r_k \in \mathbb{R}. Then since S is orthonormal, we have for any 1 \le j \le n,

\displaystyle 0=\left \langle \sum_{k=1}^nr_ke_k,e_j \right \rangle =\sum_{k=1}^nr_k \langle e_k,e_j\rangle=r_j\langle e_j,e_j\rangle=r_j||e_j||^2=r_j.

ii) Since S is a basis for W, we can write \displaystyle w=\sum_{k=1}^nr_ke_k for some r_k \in \mathbb{R}. Thus, since S is orthonormal, we have for any 1 \le j \le n,

\displaystyle \langle w,e_j\rangle=\left \langle \sum_{k=1}^nr_ke_k,e_j \right \rangle =\sum_{k=1}^nr_k \langle e_k,e_j\rangle=r_j\langle e_j,e_j\rangle=r_j||e_j||^2=r_j.

iii) Let v \in V and put \displaystyle w:=\sum_{k=1}^n\langle v,e_k\rangle e_k \in W. Then for any 1 \le j \le n, we have

\displaystyle \langle v-w, e_j\rangle=\langle v,e_j\rangle -\langle w,e_j\rangle =\langle v,e_j\rangle-\sum_{k=1}^n\langle v,e_k\rangle \langle e_k,e_j\rangle

=\langle v,e_j\rangle-\langle v,e_j\rangle \langle e_j,e_j\rangle=\langle v,e_j\rangle-\langle v,e_j\rangle=0.

Thus \langle v-w, x\rangle=0 for all x \in W and so, by definition, w':=v-w \in W^{\perp}. Hence v=w+w'.

iv) We need to show that ||v-z|| \le ||v-x|| for all x \in W. By iii), there exist w \in W and w' \in W^{\perp} such that v=w+w'. By ii), w=\sum_{k=1}^n\langle w,e_k\rangle e_k and, on the other hand, for any 1 \le k \le n,

\langle v,e_k\rangle =\langle w+w',e_k\rangle=\langle w,e_k\rangle + \langle w',e_k\rangle=\langle w,e_k\rangle.

Thus w=\sum_{k=1}^n\langle w,e_k\rangle e_k=\sum_{k=1}^n\langle v,e_k\rangle e_k=z. So v=z+w', and hence if x \in W, then

||v-x||^2=||z-x+w'||^2=\langle z-x+w',z-x+w'\rangle =\langle z-x,z-x\rangle+\langle w',w'\rangle

=||z-x||^2+||w'||^2 \ge ||w'||^2=||v-z||^2,

and so ||v-z|| \le ||v-x||. \ \Box

Example (Fourier series). Let (V, \langle \ , \ \rangle) be the inner product space of all the continuous functions f: [0,2\pi] \to \mathbb{R} with \langle \ , \rangle defined by

\displaystyle \langle f,g \rangle =\int_0^{2\pi}f(x)g(x) \ dx, \ \ \ \ \forall f,g \in V.

Now, given integer n \ge 1, consider the set

\displaystyle S_n:=\{f_0(x), f_1(x), g_1(x), f_2(x), g_2(x), \cdots, f_n(x),g_n(x)\},

where

\displaystyle f_0(x)=\frac{1}{\sqrt{2\pi}}, \ \ f_k(x)=\frac{1}{\sqrt{\pi}}\cos(kx), \ \ g_k(x)=\frac{1}{\sqrt{\pi}}\sin(kx), \ \ \ 1 \le k \le n.

By Example 3 in the first part of this post, S is orthonormal. Let W_n be the subspace of V spanned by S_n, and let f \in V. By the fourth part of the above theorem, the element of W_n nearest to f (in other words, the best approximation of f in W_n), is

\displaystyle h_n:=\langle f,f_0\rangle f_0+\sum_{k=1}^n \langle f,f_k\rangle f_k+\sum_{k=1}^n\langle f,g_k\rangle g_k

and so

\displaystyle h_n(x)=\frac{1}{2\pi}\int_0^{2\pi}f(x) \ dx+\sum_{k=1}^n\left(\frac{1}{\pi}\int_0^{2\pi}f(x)\cos(kx) \ dx\right)\cos(kx) + \sum_{k=1}^n\left(\frac{1}{\pi}\int_0^{2\pi}f(x)\sin(kx) \ dx\right)\sin(kx).

Writing that in the standard form, we have

\displaystyle h_n(x)=\frac{1}{2}a_0+\sum_{k=1}^n(a_k\cos(kx)+b_k\sin(kx)),

where \displaystyle a_0=\frac{1}{\pi}\int_0^{2\pi}f(x) \ dx, and

\displaystyle \begin{aligned}a_k=\frac{1}{\pi}\int_0^{2\pi}f(x)\cos(kx) \ dx, \ \ \ \ b_k=\frac{1}{\pi}\int_0^{2\pi}f(x)\sin(kx) \ dx, \ \ \ 1 \le k \le n.\end{aligned}

It is known that if f is infinitely differentiable, then the sequence \{h_n\} converges to f and so we get

\displaystyle f(x)=\frac{1}{2}a_0+\sum_{k=1}^{\infty}(a_k\cos(kx)+b_k\sin(kx)),

which is the Fourier series representation of f.

Throughout this post, V is a vector space over \mathbb{R}.

In this post, we are going to see the linear algebra leading to the Fourier series representations for continuous functions [0,2\pi] \to \mathbb{R}. The reference for this post is the first 2 chapters of Further Linear Algebra by Blyth & Robertson.

Definition 1. An inner product on V is a map \langle \ , \ \rangle: V \times V \to \mathbb{R} that satisfies the following properties for all v, v_1,v_2 \in V, \ r \in \mathbb{R}.

i) \langle v_1,v_2 \rangle =\langle v_2,v_1 \rangle,

ii) \langle rv_1+v_2,v \rangle =r\langle v_1,v\rangle + \langle v_2,v\rangle,

iii) \langle v,v \rangle \ge 0, and \langle v,v\rangle=0 if and only if v=0.

We then say that (V,\langle \ , \ \rangle) is an inner product space.

Remark 1. i) The second condition in Definition 1 is equivalent to the two conditions

\langle v_1+v_2,v\rangle=\langle v_1,v \rangle+\langle v_2,v\rangle, \ \ \ \langle rv_1,v\rangle=r\langle v_1,v\rangle.

ii) We have \langle 0,v\rangle=0 for all v \in V because \langle 0,v\rangle=\langle 0+0,v\rangle=\langle 0,v\rangle+\langle 0,v\rangle.

iii) We have \langle r_1v_1,r_2v_2 \rangle=r_1r_2 \langle v_1,v_2\rangle, for all r_1,r_2 \in \mathbb{R}, \ v_1,v_2 \in V because

\langle r_1v_1,r_2v_2\rangle=r_1\langle v_1,r_2v_2\rangle=r_1\langle r_2v_2,v_1\rangle=r_1r_2\langle v_2,v_1\rangle=r_1r_2\langle v_1,v_2\rangle.

Example 1. Given an integer n, let V:=\mathbb{R}^n, the vector space of n-tuples of real numbers. Define the map \langle \ , \ \rangle: V \times V \to \mathbb{R} as follows: if v_1=(a_1, \cdots, a_n) \in V, \ v_2=(b_1, \cdots,b_n) \in V, then

\displaystyle \langle v_1,v_2\rangle =\sum_{i=1}^na_ib_i.

That is easily seen to be an inner product on V. In vector calculus, for n=2,3, this inner product is better known as dot product.

Example 2. Given real numbers a < b, let V be the vector space of all continuous functions f: [a,b] \to \mathbb{R}. Define the map \langle \ , \ \rangle: V \times V \to \mathbb{R} by

\displaystyle \langle f,g \rangle =\int_a^bf(x)g(x) \ dx,

for all f,g \in V. Again, it’s easy to see that the map is an inner product on V.

Definition 2. Let (V,\langle \ , \ \rangle) be an inner product space.

i) The norm of v \in V is defined to be ||v||=\sqrt{\langle v,v \rangle}. Note that this definition makes sense by the third property in Definition 1.

ii) The distance between two elements v_1,v_2 \in V is defined to be ||v_1-v_2||.

Remark 2. Let (V,\langle \ , \ \rangle) be an inner product space. By Definitions 1, 2, ||v||=0 if and only if v=0. Also, if r \in \mathbb{R}, \ v \in V, then

||rv||=\sqrt{\langle rv,rv\rangle}=\sqrt{r^2\langle v,v \rangle}=|r|\sqrt{\langle v,v \rangle}=|r|||v||.

In particular, if v \ne 0, and \displaystyle w:=\frac{v}{||v||}, then ||w||=1.

Definition 3. Let (V,\langle \ , \ \rangle) be an inner product space.

i) We say that v_1,v_2 \in V are orthogonal if \langle v_1,v_2 \rangle=0. A subset S of V is said to be orthogonal if every two distinct elements of S are orthogonal.

ii) A subset S of V is said to be orthonormal if it is orthogonal and ||v||=1 for all v \in S.

Remark 3. Let (V,\langle \ , \ \rangle) be an inner product space. By Remark 1, if S is an orthogonal subset of V and 0 \notin S, then the set \displaystyle \left \{\frac{v}{||v||}: \ \ v \in S\right\} is an orthonormal subset of V.

Example 3. Let V be the vector space of all continuous functions f: [0,2\pi] \to \mathbb{R}, and let \langle \ , \rangle be the inner product on V introduced in Example 2. Now, consider the following subset of V

\displaystyle S:=\{f_0(x), f_1(x), g_1(x), f_2(x), g_2(x), \cdots \},

where

\displaystyle f_0(x)=\frac{1}{\sqrt{2\pi}}, \ \ f_k(x)=\frac{1}{\sqrt{\pi}}\cos(kx), \ \ g_k(x)=\frac{1}{\sqrt{\pi}}\sin(kx), \ \ \ k \ge 1.

Then S is an orthonormal subset of V.

Proof. By Definition 2, i), we have

\displaystyle ||f_0||=\sqrt{\langle f_0,f_0 \rangle}=\sqrt{\int_0^{2\pi}(f_0(x))^2dx}=\sqrt{\int_0^{2\pi}\frac{1}{2\pi} \ dx}=1,

\displaystyle ||f_k||=\sqrt{\langle f_k,f_k \rangle}=\sqrt{\int_0^{2\pi}(f_k(x))^2dx}=\sqrt{\int_0^{2\pi}\frac{1}{\pi}\cos^2(kx) \ dx}=1,

and similarly ||g_k||=1, for all k \ge 1. So, in order to show that S is orthonormal, we only need to prove that it’s orthogonal. For any j,k \ge 1 with j \ne k, we have

\displaystyle \langle f_0,f_k \rangle=\int_0^{2\pi}f_0(x)f_k(x) \ dx=\int_0^{2\pi}\frac{1}{\sqrt{2}\pi}\cos(kx) \ dx=0,

\displaystyle \langle f_0,g_k \rangle=\int_0^{2\pi}f_0(x)g_k(x) \ dx=\int_0^{2\pi}\frac{1}{\sqrt{2}\pi}\sin(kx) \ dx=0,

\displaystyle \langle f_j,f_k \rangle=\int_0^{2\pi}f_j(x)f_k(x) \ dx=\int_0^{2\pi}\frac{1}{\pi}\cos(jx)\cos(kx) \ dx=0,

\displaystyle \langle g_j,g_k \rangle=\int_0^{2\pi}g_j(x)g_k(x) \ dx=\int_0^{2\pi}\frac{1}{\pi}\sin(jx)\sin(kx) \ dx=0.

Finally, see that for any j,k \ge 1 we have \langle f_j,g_k \rangle=0. \ \Box

Exercise 1. Let (V,\langle \ , \ \rangle) be an inner product space. Show that for any v_1,v_2 \in V,

i) |\langle v_1,v_2\rangle | \le ||v_1|| ||v_2||, with equality if and only if v_1,v_2 are linearly dependent,

ii) ||v_1+v_2|| \le ||v_1||+||v_2||.

Hint. For i), which is known as the Cauchy-Schwarz inequality, there’s nothing to prove if v_1 =0. For v_1 \ne 0, let \displaystyle v:=v_2-\frac{\langle v_1,v_2\rangle}{||v_1||^2}v_1 and use the third property in Definition 1.

Exercise 2. Apply the inequalities i), ii), in Exercise 1 to the inner product spaces in Examples 1,2.

See the second part of this post here.

Here we defined group actions, and we proved the orbit-stabilizer theorem here, which is the theorem we are going to use in this post to solve two problems.

In the following problem, \text{GL}(n,\mathbb{Z}_p) is, as usual, the multiplicative group of n \times n invertible matrices with entries from the field \mathbb{Z}_p.

Problem 1. Let p,q be two prime numbers, and Suppose that G is a q-subgroup of \text{GL}(n,\mathbb{Z}_p), \ n \ge 2. Show that if q \nmid p^n-1, then there exists a non-zero vector \bold{x} \in \mathbb{Z}_p^n such that A\bold{x}=\bold{x} for all A \in G.

Solution. So |G|=q^m for some integer m \ge 1. Clearly G acts on the set X:=\mathbb{Z}_p^n by the usual matrix multiplication, i.e.

A \cdot \bold{x}:=A\bold{x}, \ \ \ A \in G, \ \ \bold{x} \in X.

By the orbit-stabilizer theorem, we have

|\text{orb}(\bold{x})||\text{stab}(\bold{x})|=|G|=q^m, \ \ \ \ \ \ \ \ \ (1)

for all \bold{x} \in X. Let \text{orb}(x_0), \cdots , \text{orb}(x_k) be all the distinct orbits of X, where we assume that \bold{x}_0=\bold{0} (and so \bold{x}_i \ne 0 for all 1 \le i \le k). Then

\displaystyle p^n=|X|=\sum_{i=0}^k|\text{orb}(\bold{x}_i)|=1+\sum_{i=1}^k|\text{orb}(\bold{x}_i)|,

which gives

\displaystyle \sum_{i=1}^k|\text{orb}(\bold{x}_i)|=p^n-1. \ \ \ \ \ \ \ \ \ \ (2)

From (1) we get that if \bold{x} \in X, then either |\text{orb}(\bold{x})|=1 or q \mid  |\text{orb}(\bold{x})|. From (2), we see that we can’t have q \mid |\text{orb}(\bold{x}_i)| for all 1 \le i \le n, because then q \mid p^n-1, contradicting q \nmid p^n-1. Thus there exists 1 \le j \le k such that |\text{orb}(\bold{x}_j)|=1, and so |\text{stab}(\bold{x}_j)|=|G|, by (1). Hence \text{stab}(\bold{x}_j)=G, which means that A\bold{x}_j=\bold{x}_j for all A \in G. \ \Box

We proved here (Fact 5) that every nilpotent group G \ne (1) satisfies the so-called normalizer condition, i.e. if H is a proper subgroup of G, then H is a proper subgroup of N_G(H), the normalizer of H in G. We showed in this post (Lemma 1) that every finite p-group is nilpotent, and hence every finite p-group satisfies the normalizer condition. We’re now going to use group actions to prove this result.

Problem 2. Let G be a finite p-group, and let H be a proper subgroup of G. Show that H \neq N_G(H).

Solution. Let |G|=p^n. If H=(1), then N_G(H)=G \ne H. Suppose now that H \ne (1). So |H|=p^m for some integer 1 \le m < n. Let X be the set of all distinct left cosets of H in G. Clearly H acts on X by left multiplication, i.e. h \cdot gH=hgH, for all h \in H, \ gH \in X. Let \text{orb}(g_1H), \cdots, \text{orb}(g_kH) be all the distinct orbits of X, where g_1=1 (and so g_i \notin H for all 2 \le i \le k). Clearly |\text{orb}(g_1H)|=1, because \text{orb}(g_1H)=\text{orb}(H)=\{hH: \ h \in H\}=\{H\}, and so

\displaystyle p^{n-m}=[G:H]=|X|=\sum_{i=1}^k|\text{orb}(g_iH)|=1+\sum_{i=2}^k |\text{orb}(g_iH)|. \ \ \ \ \ \ \ \ (*)

Now, |\text{orb}(g_iH)| \mid |H|=p^m, for all i, and so either |\text{orb}(g_iH)|=1 or p \mid |\text{orb}(g_iH)|. Since p \mid p^{n-m}, we get from (*) that we can’t have p \mid |\text{orb}(g_iH)| for all 2 \le i \le k, and hence |\text{orb}(g_jH)|=1 for some 2 \le j \le k. Since |\text{orb}(g_jH)|=1, we have |\text{stab}(g_jH)|=|H|, by the orbit-stabilizer theorem, and hence H=\text{stab}(g_jH)=\{h \in H: \ hg_jH=g_jH\}. Therefore g_j^{-1}hg_j \in H for all h \in H, and thus g_j \in N_G(H). So we have shown that g_j \in N_G(H) \setminus H, which completes the solution. \Box

Throughout this post, N_G(H) is the normalizer of a subgroup H of a group G.

I already have four posts on nilpotent groups in this blog; you can see them here: (1), (2), (3), (4). In this post, the focus is on finite nilpotent groups. As you are going to see, asking how much we know about finite nilpotent groups is the same as asking how much we know about finite p-groups because finite nilpotent groups are nothing but direct products of finite p-groups.

Lemma 1. i) Every finite p-group is nilpotent.

ii) Every finite direct product of finite p-groups is a nilpotent group.

Proof. i) Let G be a finite p-group. The proof is by induction over |G|. If |G|=p, then G is abelian hence nilpotent. Now, for the general case, since G is a finite p-group, Z(G) is non-trivial, and hence G/Z(G) is a finite p-group with |G/Z(G)| < |G|. Thus, by induction, G/Z(G) is nilpotent. Therefore, by Fact 3, i), in this post, G is nilpotent.

ii) It follows from i) and Fact 4 in this post. \Box

Lemma 2 (Frattini Argument). Let G be a finite group, and let P be a Sylow p-subgroup of G. If H is a normal subgroup of G and P \subseteq H, then G=HN_G(P).

Proof. Note that P is clearly a Sylow p-subgroup of H. Now, let g \in G. Then gPg^{-1} \subseteq gHg^{-1}=H, because H is normal. So gPg^{-1} \subseteq H, and hence both P, gPg^{-1} are Sylow p-subgroups of H. Thus, by Sylow theorems, P,gPg^{-1} are conjugate in H, i.e. gPg^{-1}=hPh^{-1}, for some h \in H. So we get that h^{-1}gP(h^{-1}g)^{-1}=P, i.e. h^{-1}g \in N_G(P), which gives g \in hN_G(P) \subseteq HN_G(P). So G \subseteq HN_G(P) and the result follows. \Box

Theorem. Let G be a finite group. Let G', \Phi(G) be, respectively, the commutator subgroup and the Frattini subgroup of G. The following statements are equivalent.

i) G is nilpotent.

ii) G satisfies the normalizer condition.

iii) Every maximal subgroup of G is normal.

iv) G' \subseteq \Phi(G), where \Phi(G) is the Frattini subgroup of G.

v) Every Sylow subgroup of G is normal.

vi) G is isomorphic to a direct product of its Sylow subgroups.

vii) Every two elements of G of coprime order commute.

Proof. i) \implies ii). This is true for any nilpotent group, not just finite ones. See Fact 5 in this post.

ii) \implies iii). Let M be a maximal subgroup of G. Since G satisfies the normalizer condition, M \subset N_G(M) and hence, since M is maximal, N_G(M)=G, i.e. M is normal in G.

iii) \implies iv). Let M be a maximal subgroup of G. Then G/M is a group with no non-trivial proper subgroup and hence it is a cyclic group (of prime order). In particular, G/M is abelian and so G' \subseteq M. The result now follows since, by definition, \Phi(G) is the intersection of all maximal subgroups of G.

iv) \implies v). Let P be a Sylow subgroup of G and suppose, to the contrary, that P is not normal in G. Then N_G(P) \ne G and so N_G(P) \subseteq M for some maximal subgroup of G. Now, M is normal in G because G' \subseteq \Phi(G) \subseteq M, and so, by Lemma 2, G=MN_G(P)=M, contradiction.

v) \implies vi). Let p_1, \cdots, p_n be all the distinct prime divisors of |G|, and let P_i, \ 1 \le i \le n, be a Sylow p_i-subgroup. Since each P_i is normal and P_i \cap P_j=(1), for all i \ne j, because \gcd(|P_i|,|P_j|)=1, we have |P_1P_2 \cdots P_n|=|P_1||P_2| \cdots |P_n|=|G|, and so P_1P_2 \cdots P_n=G. Now, for each i, let

Q_i:=P_1 \cdots P_{i-1}P_{i+1} \cdots P_n.

Then |Q_i|=|P_1| \cdots |P_{i-1}||P_{i+1}| \cdots |P_n| and so \gcd(|P_i|,|Q_i|)=1, which gives P_i \cap Q_i=(1). Thus

G=P_1 P_2 \cdots P_n \cong P_1 \times P_2 \times \cdots \times P_n.

vi) \implies vii). We may assume that G=P_1 \times \cdots \times P_n, where each P_i is a p_i-group for some prime p_i, and p_i \ne p_j for i \ne j. Let x:=(x_1, \cdots , x_n), \ y:=(y_1, \cdots ,y_n) be two elements of coprime order in G. Note that o(x)=o(x_1) \cdots o(x_n), \ o(y)=o(y_1) \cdots o(y_n) because \gcd(|P_i|,|P_j|)=1 for i \ne j. So now if x_i \ne 1, \ y_i \ne 1, for some i, then p_i \mid o(x_i), \ p_i \mid o(y_i), and hence p_i \mid \gcd(o(x),o(y))=1, which is non-sense. Thus for each i, either x_i=1 or y_i=1 and hence xy=yx.

vii) \implies vi). Let |G|=\prod_{i=1}^np_i^{k_i}, the prime factorization of |G|. For each prime p_i, let P_i be a Sylow p_i-subgroup of G, and put H:=P_1P_2 \cdots P_n. Since \gcd(|P_i|,|P_j|)=1, for all i \ne j, we have ab=ba for all a \in P_i, \ b \in P_j, \ i \ne j. Thus if x:=x_1x_2 \cdots x_n, \ y:=y_1y_2 \cdots y_n, \ \ x_i, y_i \in P_i, are two elements of H, then xy=x_1y_1x_2y_2 \cdots x_ny_n and so H is a subgroup of G. Now consider the following map

f: H \to P_1 \times P_2 \times \cdots \times P_n, \ \ \ \ \ f(x_1x_2 \cdots x_n)=(x_1, x_2, \cdots ,x_n).

Note that f is well-defined because if x_1x_2 \cdots x_n=1, and n_i:=|G|/p_i^{k_i}, then

1=(x_1x_2 \cdots x_n)^{m_i}=x_1^{m_i}x_2^{m_i} \cdots x_n^{m_i}=x_i^{m_i}=x_i^{p_i^{k_i}},

which gives x_i=1 because \gcd(p_i, m_i)=1. It is clear that f is a bijective group homomorphism, and hence H \cong P_1 \times P_2 \times \cdots \times P_n. The result now follows because

|H|=|P_1 \times P_2 \times \cdots \times P_n|=|P_1||P_2| \cdots |P_n|=|G|,

which gives H=G.

vi) \implies i). Clear by Lemma 1, ii). \Box

Note. The main reference for this post is the book The Theory of Nilpotent Groups by Clement, Majewicz, and Zyman. Regarding the above theorem, I should mention here that unfortunately the book (see page 49) gives a false proof of “vii) \implies vi)”. The authors first “prove” that every Sylow subgroup of G is normal; here’s their argument: if P is a Sylow p-subgroup of G and g \in G, then either g \in P, in which case gPg^{-1}=P, or o(g) and |P| are coprime, in which case g commutes with every element of P and again gPg^{-1}=P. The problem with their argument is that they assume unknowingly that P is the unique Sylow p-subgroup of G, which is not given. What if g is in a Sylow p-subgroup different from P? In fact, if P was unique, then P would be normal, by Sylow theorems, and so the condition vii) would be redundant.

Throughout this post, \{F_n\} is the sequence of Fibonacci numbers defined as follows

F_0=0, \ F_1=1, \ \ \ F_{n+1}=F_n+F_{n-1}, \ \ \ n \ge 1.

We first use basic linear algebra to prove two well-known identities on the sequence of Fibonacci numbers, and then we use those identities to find a formula for product of five consecutive elements of the sequence.

Cassini’s identity. F_n^2-F_{n-1}F_{n+1}=(-1)^{n-1}, for all n \ge 1.

Proof. Consider the following sequence of 2 \times 2 matrices

A_n:=\begin{pmatrix} F_n & F_{n-1} \\ F_{n+1} & F_n\end{pmatrix}, \ \ \ \ \ n \ge 1.

Note that \det A_n=F_n^2-F_{n-1}F_{n+1}, which we want to prove equals (-1)^{n-1}. Now, we have

\begin{aligned}A_{n+1}=\begin{pmatrix} F_{n+1} & F_n \\ F_{n+2} & F_{n+1}\end{pmatrix}=\begin{pmatrix}F_n+F_{n-1} & F_n \\ F_{n+1}+F_n & F_{n+1}\end{pmatrix}=\begin{pmatrix} F_n & F_{n-1} \\ F_{n+1} & F_n\end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}=A_n\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}\end{aligned}

and so

A_{n+1}=A_{n-1}\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}^2=A_{n-2}\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}^3= \cdots =A_1\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}^n=\begin{pmatrix}1 & 0 \\ 1 & 1\end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}^n.

Hence

\det A_{n+1}=\det \begin{pmatrix}1 & 0 \\ 1 & 1\end{pmatrix}\det \begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}^n=\left(\det \begin{pmatrix} 1 & 1 \\ 1 & 0\end{pmatrix}\right)^n=(-1)^n,

and so \det A_n=(-1)^{n-1}. \ \Box

Catalan’s identity (a special case). F_n^2-F_{n-2}F_{n+2}=(-1)^n, for all n \ge 2.

Proof. Consider the following sequence of 2 \times 2 matrices

B_n:=\begin{pmatrix} F_n & F_{n-2} \\ F_{n+2} & F_n\end{pmatrix}, \ \ \ \ \ n \ge 2.

Note that \det B_n=F_n^2-F_{n-2}F_{n+2}, which we want to prove equals (-1)^n. Now, we have

\begin{aligned}B_{n+1}=\begin{pmatrix} F_{n+1} & F_{n-1} \\ F_{n+3} & F_{n+1}\end{pmatrix}=\begin{pmatrix}F_n+F_{n-1} & F_n-F_{n-2} \\ F_{n+2}+F_{n+1} & F_{n+2}-F_n\end{pmatrix}=\begin{pmatrix} 2F_n -F_{n-2}& F_n-F_{n-2} \\ 2F_{n+2}-F_n & F_{n+2}-F_n\end{pmatrix}\end{aligned}

=\begin{pmatrix}F_n & F_{n-2} \\ F_{n+2} & F_n\end{pmatrix}\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}=B_n\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}

and so

\begin{aligned}B_{n+1}=B_{n-1}\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}^2=B_{n-2}\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}^3= \cdots =B_2\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}^{n-1}=\begin{pmatrix}1 & 0 \\ 3 & 1\end{pmatrix}\begin{pmatrix}2 & 1 \\ -1 & -1\end{pmatrix}^{n-1}.\end{aligned}

Hence

\det B_{n+1}=\det \begin{pmatrix}1 & 0 \\ 3 & 1\end{pmatrix}\det \begin{pmatrix} 2 & 1 \\ -1 & -1\end{pmatrix}^{n-1}=\left(\det \begin{pmatrix} 2 & 1 \\ -1 & -1\end{pmatrix}\right)^{n-1}=(-1)^{n-1},

and so \det B_n=(-1)^{n-2}=(-1)^n. \ \Box

We now find formulas for product of three and five consecutive Fibonacci numbers in terms of the number in the middle.

Corollary. i) F_{n-1}F_nF_{n+1}=F_n^3+(-1)^nF_n, for all n \ge 1.

ii) F_{n-2}F_{n-1}F_nF_{n+1}F_{n+2}=F_n^5-F_n, for all n \ge 2.

Proof. i) By Cassini’s identity,

F_{n-1}F_nF_{n+1}=F_{n-1}F_{n+1}F_n=(F_n^2-(-1)^{n-1})F_n=F_n^3+(-1)^nF_n.

ii) Using both Cassini’s and Catalan’s identities, we have

F_{n-2}F_{n-1}F_nF_{n+1}F_{n+2}=(F_{n-2}F_{n+2})(F_{n-1}F_{n+1})F_n=(F_n^2-(-1)^n)(F_n^2+(-1)^n)F_n

=(F_n^4-1)F_n=F_n^5-F_n. \ \Box

Throughout, all abelian groups are considered additively and assumed to be non-zero.

In this post, we answer this question: precisely, which abelian groups can be turned into a vector space over some field? To answer this question, we first need to recall a couple of definitions.

  • Given a prime number p, an abelian group G is said to be an elementary pgroup if px=0 for all x \in G.
  • An element of a group is called torsion if it has a finite order. A group is called torsion-free if the only element of the group that has finite order is the identity element.
  • An abelian group G is said to be divisible if for every integer n \ne 0 and every y \in G, there exists x \in G such that nx=y.

Remark. If we consider G multiplicatively, instead of additively, then G is an elementary p-group if x^p=1 for all x \ G. Similarly, G is torsion-free if x^n=1 implies x=1 for all integers n \ge 1. Finally, G is divisible if for every y \in G and every integer n \ne 0, there exists x \in G such that x^n=y.

Theorem. Let G be an abelian group. Then G is a vector space over some field if and only if G is either an elementary p-group or a torsion-free divisible group.

Proof. Suppose first that G is a vector space over some field F and denote the corresponding scalar multiplication by *. Note that for any integer n > 0 we have

\begin{aligned}(n1_F)*x=(\underbrace{1_F+ \cdots +1_F}_{n \ \text{times}})*x=\underbrace{1_F*x+ \cdots +1_F*x}_{k \ \text{times}}=\underbrace{x+ \cdots +x}_{n \ \text{times}}=nx. \ \ \ \ \  (\dagger)\end{aligned}

We now consider two cases.

Case 1: The characteristic of F is non-zero. So the characteristic of F is some prime number p, and thus, by (\dagger), for all x \in G, we have px=(p1_F)*x=0*x=0. Hence G is an elementary p-group.

Case 2: The characteristic of F is zero. So \mathbb{Q} \subseteq F and hence 1/n \in F for all non-zero integers n. So if n > 0 is an integer and y \in G, \ x:=(1/n)*y \in G, then, by (\dagger), we have

nx=n*x=n*((1/n)*y)=1*y=y.

The same argument works for 0 > n \in \mathbb{Z} if we just change n,x to -n,-x. So we’ve shown that G is divisible. It is clear that G is torsion-free because if nx=0 for some integer n > 0 and x \in G, then, by (\dagger), \ n*x=nx=0 and so multiplying by 1/n gives x=0.

Conversely, suppose now that G is either an elementary p-group or a torsion-free divisible group. We need to show that G is a vector space over some field. So we consider two cases.

Case 3: G is an elementary p-group. Let F:=\mathbb{Z}_p. We claim that G is a vector space over F. Note that F=\{k1_F: \ k \in \mathbb{Z}\}. Define the scalar multiplication * by (k1_F)*x:=kx, for all k \in \mathbb{Z} and x \in G. We show that * is well-defined, i.e. if k1_F=m1_F, for some integers k,m, then (k1_F)*x=(m1_F)*x. Well, since (k-m)1_F=0, we have p \mid k-m, and so (k-m)x=0, because G is an elementary p-group. Thus

k1_F*x=kx=(k-m)x+my=mx=m1_F*x.

It is very straightforward to see that * satisfies all the required vector space axioms.

Case 4: G is a torsion-free divisible group. Let F:=\mathbb{Q}. We claim that G is a vector space over F. Since G is divisible, for any non-zero integer n and y \in G, there exists x \in G such that nx=y. In fact, since G is torsion-free, x is unique because if nx'=nx for some x' \in G, then n(x-x')=0 and hence x=x'. We now define the scalar multiplication *. For any m/n \in F, \ m,n \in \mathbb{Z}, \ n \ne 0, and any y \in G, we define (m/n)*y:=x, where x is the unique element of G satisfying nx=my. We show that * is well-defined. Suppose that m/n=m'/n' and x,x' \in G are such that nx=my, n'x'=m'y. Note that if m=0, then m'=0 and so nx=n'x'=0, which give x=x'=0. Suppose now that m \ne 0. Then

0=mm'y-m'my=mn'x'-m'nx=mn'x'-mn'x=mn'(x-x'),

which gives x=x' because mn' \ne 0. Finally, proving that * satisfies all the required vector space axioms is straightforward and so I leave it as an exercise. \Box

Example 1. The group of integers \mathbb{Z}, which is torsion-free, is neither divisible nor an elementary p-group, and so there is no field over which \mathbb{Z} is a vector space.

Example 2. The group \mathbb{Q}/\mathbb{Z}, which is divisible, has no vector space structure because clearly it’s neither an elementary p-group nor torsion-free; in fact the group is torsion, i.e. every element of the group has a finite order (see this post for more about the group!).

Example 3. Let G be a finite abelian group. Then G is torsion, because it’s finite, and so it has a vector space structure if and only if it is an elementary p-group. Hence, by the fundamental theorem for abelian groups, G is a vector space if and only if G \cong \mathbb{Z}_p \times \cdots \times \mathbb{Z}_p.

Example 4. The group (\mathbb{C},+) is clearly torsion-free and divisible hence a vector space over \mathbb{Q}. But \mathbb{C}^{\times}, the multiplicative group of \mathbb{C}, is neither an elementary p-group nor torsion-free (see the Remark before the Theorem) and hence not a vector space over any field. Note that \mathbb{C}^{\times} is divisible.