In this post, All matrices are assumed to be in M_n(\mathbb{C}), the ring of n \times n matrices with complex entries. Also, e_{ij} \in M_n(\mathbb{C}) is the matrix with (i,j)-entry 1 and all other entries 0.

Recall that a matrix A=[a_{ij}] is said to be upper triangular if a_{ij}=0 for all i > j. Similarly, A is lower triangular if a_{ij}=0 for all i < j. If A is either upper triangular or lower triangular, then A is said to be triangular. If A is triangular and also a_{ii}=0 for all i, then we say that A is strictly triangular.

Problem 1. Let A=[a_{ij}] be an n \times n matrix with the adjugate \text{adj}(A). Show that

i) if A is strictly upper triangular, then \text{adj}(A)=(-1)^{n+1}a_{12}a_{23} \cdots a_{n-1,n}e_{1n}

ii) if A is strictly lower triangular, then \text{adj}(A)=(-1)^{n+1}a_{21}a_{32} \cdots a_{n,n-1}e_{n1}.

Solution. I only prove i) and leave the proof of ii), which is similar to the proof of i), as an easy exercise.
Let M_{ij} be the (i,j)-minor of A, i.e. the (n-1) \times (n-1) matrix that results from deleting the i-th row and the j-th column of A. Let d_{ij}=\det M_{ij}.
If (i,j) \ne (n,1), then, since both the first column and the last row of A are zero, either the first column or the last row or both the first column and the last row of M_{ij} is zero and hence d_{ij}=0 in this case. But if (i,j)=(n,1), then M_{ij} is just an upper triangular matrix with the diagonal entries a_{12}, a_{23}, \cdots , a_{n-1,n} and hence d_{n1}=a_{12}a_{21} \cdots a_{n-1,n}. Thus

\text{adj}(A)=[(-1)^{i+j}d_{ij}]^T=(-1)^{n+1}d_{n1}e_{1n}=(-1)^{n+1}a_{12}a_{23} \cdots a_{n-1,n}e_{1n}. \ \Box

Problem 2. Let A=[a_{ij}] be a strictly triangular n \times n matrix with the adjugate \text{adj}(A). Show that if n > 2 and \text{adj}(A)=cA for some scalar c \ne 0, then A=0. Also, show that the result does not always hold if n=2.

Solution. I’ll assume that A is strictly upper triangular; the argument for strictly lower triangular is similar. So we are given that cA=\text{adj}(A)= (-1)^{n+1}a_{12}a_{23} \cdots a_{n-1,n}e_{1n}, by Problem 1, and hence a_{ij}=0 for all (i,j) \ne (1,n) and ca_{1n}=(-1)^{n+1}a_{12}a_{23} \cdots a_{n-1,n}. Thus if n > 2, then, since a_{23}=0, we have a_{12}a_{23} \cdots a_{n-1,n}=0 and hence ca_{1n}=0 implying a_{1n}=0. So a_{ij}=0 for all i,j, and hence A=0.
The result does not always hold if n=2 because, for example, \text{adj}(e_{12})=-e_{12} but e_{12} \ne 0. \ \Box

Remark. In the solution of the next problem, we are going to use this important fact that if A,B \in M_n(\mathbb{C}) and AB=BA, then A,B are simultaneously triangularizable, i.e. there exists an invertible matrix P \in M_n(\mathbb{C}) and upper triangular matrices U,V \in M_n(\mathbb{C}) such that A=PUP^{-1} and B=PVP^{-1}.

Problem 3. Let A be an n \times n, \ n > 2, matrix and suppose that there exists a scalar c \ne 0 such that

A^2=(\text{adj}(A))^2=\text{adj}(A)+cA=0.

Show that A=0.

Solution. It follows from the relations given in the problem that \text{adj}(A)A=A\text{adj}(A)=0 and hence, by the above Remark, there exists an invertible matrix P and upper triangular matrices U,V such that A=PUP^{-1} and \text{adj}(A)=PVP^{-1}. Since A^2=0, the eigenvalues of A are all zero and hence U is strictly upper triangular. Also

\text{adj}(A)=PVP^{-1}=\text{adj}(PUP^{-1})=\text{adj}(P)^{-1} \text{adj}(U) \text{adj}(P)=P \text{adj}(U)P^{-1}

and thus V=\text{adj}(U), which gives

0=\text{adj}(A)+cA=P(V+cU)P^{-1}=P(\text{adj}(U)+cU)P^{-1}.

Hence \text{adj}(U)+cU=0 and therefore, by Problem 2, U=0 giving A=PUP^{-1}=0. \ \Box

Let’s begin this post with a well-known result about the normality of subgroups of prime index.

Problem 1. Let G be a finite group and let p be the smallest prime divisor of |G|. Suppose that G has a subgroup H such that [G:H]=p. Show that H is normal in G.

Solution. See Problem 2 in this post. \Box

A trivial consequence of Problem 1 is that in finite groups, every subgroup of index 2 is normal. In fact, subgroups of index 2 are always normal, whether the group is finite or infinite. Let’s prove this little result.

Problem 2. Let G be a group and suppose that H is a subgroup of G such that [G:H]=2. Show that H is normal in G.

Solution. Choose x \in G \setminus H. Since [G:H]=2, we have G=H \cup xH. Now, suppose, to the contrary, that H is not normal. So there exist g \in G, \ h \in H such that ghg^{-1} \notin H. So ghg^{-1} \in xH and g \in xH. Therefore there exist h_1,h_2 \in H such that g=xh_1 and ghg^{-1}=xh_2. But then, eliminating x, we get g=h_2^{-1}h_1h \in H, contradiction. \Box

Are subgroups of prime index always normal? Of course not. For example, let p > 2 be any prime number, and let G be the dihedral group of order 2p, i.e. G is generated by two elements a,b, such that o(a)=o(ab)=2, \ o(b)=p. Then the subgroup H:=\langle a \rangle =\{1,a\} has index p but H is not normal because bab^{-1} \notin H. On the other hand, the subgroup \langle b \rangle has index 2 and so it’s normal, by Problem 2 or directly: ab^ka^{-1}=(aba^{-1})^k=(aba)^k=b^{-k} \in \langle b \rangle.

Problem 3 (T. Y. Lam). Let G be a group, and let p be a prime number. Suppose that G has a subgroup H such that [G:H]=p. Show that the following statements are equivalent.

i) H is normal in G.

ii) g^p \in H for all g \in G.

iii) If g \in G \setminus H, then g^k \notin H for all positive integers k < p.

Solution. i) \Longrightarrow ii). Since H is normal and [G:H]=p, the set G/H is a group of order p and hence x^p=1_{G/H}=H for all x \in G/H, i.e. g^pH=(gH)^p=H for all g \in G and so g^p \in H.

ii) \Longrightarrow iii). Let g \in G \setminus H and suppose, to the contrary, that g^k \in H for some integer 1 \le k < p. Then, since p is prime, \gcd(k,p)=1 and hence there exist integers r,s such that rk+sp=1. But then we get g=g^{rk+sp}=(g^k)^r(g^p)^s \in H, contradiction.

iii) \Longrightarrow i). Suppose, to the contrary, that H is not normal. Hence there exist x \in G and h \in H such that g:=xhx^{-1} \notin H. Thus, by iii), g^k \notin H for all integers 1 \le k < p, and hence we have p distinct cosets H, gH, \cdots , g^{p-1}H. Thus, since [G:H]=p, we get G=\bigcup_{k=0}^{p-1}g^kH. So x \in \bigcup_{k=0}^{p-1}g^kH and hence x=g^kh_1 for some h_1 \in H and some integer 0 \le k \le p-1. Therefore

x=g^kh_1=(xhx^{-1})^kh_1=xh^kx^{-1}h_1,

which gives x=h_1h^k \in H and hence g=xhx^{-1} \in H, contradiction. \Box

Remark 1. Note that, in the solution of Problem 3, we used the hypothesis “p is prime” only once and that was to prove that ii) \Longrightarrow iii).

Let’s finish this post with a nice but, surprisingly, not well-known result about the normality of subgroups of prime index.

Problem 4 (J.C. Bioch & R.W. van der Waall). Let p be a prime number. Show that if G is a finite group and H is a subgroup of G such that [G:H]=p and \gcd(|H|,p-1)=1, then H is normal in G.

Solution. Proof by induction over |G|. There’s nothing to prove if |G|=p. Now suppose that G is a finite group that satisfies the conditions given in the problem, i.e. G has a subgroup H such that [G:H]=p and \gcd(|H|,p-1)=1. By Problem 1 in this post, there exists a normal subgroup N of G such that N \subseteq H and [G:N] \mid p!. Now, we consider two cases.

Case 1: |N| > 1. In this case, |G/N| < |G| and clearly [G/N:H/N]=[G:H]=p. Also, since |H/N| divides |H| and \gcd(|H|,p-1)=1, we also have \gcd(|H/N|, p-1)=1. Thus, by our induction hypothesis, H/N is normal in G/N and hence H is normal in G.

Case 2: |N|=1. In this case, |G|=[G:N] \mid p! and thus, since p \mid |G|, we get that |G|=pm for some integer m with \gcd(m,p)=1. Since [G:H]=p, we have |H|=m and also

\gcd(|G|,p-1)=\gcd(pm,p-1)=\gcd(m,p-1)=\gcd(|H|,p-1)=1. \ \ \ \ \ \ \ \ \ \ \ (1)

Now let P be the Sylow p-subgroup of G. So |P|=p. Let C(P), N(P) be, respectively, the centralizer and the normalizer of P in G. By the lemma in this post, N(P)/C(P) is isomorphic to a subgroup of \text{Aut}(P). But since P \cong \mathbb{Z}_p, we have |\text{Aut}(P)|=p-1 and so |N(P)/C(P)| must divide p-1. Thus, since |N(P)/C(P)| also divides |G|, we get |N(P)/C(P)|=1, by (1). So N(P)=C(P) and hence, by Burnside’s Normal Complement Theorem, there exists a normal subgroup Q of G such that P \cap Q=\{1\} and G=PQ. Thus |Q|=m=|H| and so

\displaystyle |HQ|=\frac{|H| \cdot |Q|}{|H \cap Q|}=\frac{m^2}{|H \cap Q|}. \ \ \ \ \ \ \ \ \ \ \ \ (2)

Since Q is normal, HQ is a subgroup of G and thus, by \displaystyle (2), \ \frac{m^2}{|H \cap Q|} must divide |G|=pm and that is possible only if |H \cap Q|=m. Thus |H \cap Q|=|H| and hence H \cap Q=H, which gives H=Q because |H|=|Q|. So H is normal in G and the solution is complete. \Box

Remark 2. Note that Problem 1 follows from Problem 4.

Problem. Let V be a vector space over a field of characteristic \neq 2. Let f,g: V \to V be two linear transformations such that f^2=g^2=I, where I : V \to V is the identity map. Show that

\ker(fg-fg)=\ker(f+g) \oplus \ker(f-g).

Solution. We need to prove three things.

i) \ker(f+g) \cap \ker (f-g)=(0).

Proof. If x \in \ker(f+g) \cap \ker (f-g), then

2f(x)=(f+g)(x)+(f-g)(x)=0

and so f(x)=0 hence x=f^2(x)=0.

ii) \ker(f+g) + \ker(f-g) \subseteq \ker(fg-gf).

Proof. We have

fg-gf=(f-g)(f+g)=-(f+g)(f-g), \ \ \ \ \ \ \ \ \ \ \ (1)

which gives \ker(f+g) \subseteq \ker(fg-gf) and \ker(f-g) \subseteq \ker(fg-gf).

iii) \ker(fg-gf) \subseteq \ker(f+g) + \ker(f-g).

Proof. Let x \in \ker(fg-gf). So fg(x)=gf(x) and hence gfg(x)=f(x), \ fgf(x)=g(x), which give

(fg-gf)g(x)=(fg-gf)f(x)=0. \ \ \ \ \ \ \ \ \ \ \ (2)

We also have

\displaystyle x=y+z, \ \ \ \ \ \ \ y:=\frac{1}{2}(f-g)f(x), \ \ \ \ z:=\frac{1}{2}(f+g)g(x)

and, by (1),(2),

(f+g)(y)=(f-g)(z)=0,

i.e. y \in \ker(f+g) and z \in \ker(f-g). Hence x \in \ker(f+g) + \ker(f-g). \ \Box

Recall that an element of a group is called torsion if it has a finite order and a group is called torsion if every element of the group is torsion. A (non-trivial) group is called torsion-free if the only torsion element of the group is the identity element. Clearly every finite group is torsion and every torsion-free group is infinite.

Problem. Let G be a torsion-free group and suppose that G has a cyclic subgroup of finite index. Show that G is cyclic.

Solution. The heart of the solution is the following theorem.

Theorem (Schur). Let G be a group with the center Z(G) and the commutator subgroup G'. If G/Z(G) is finite, then G' is finite too.

Proof. See this post!

Corollary. Let G be a torsion-free group. If G/Z(G) is finite, then G is abelian.

Proof. By Schur’s theorem, G' is finite and so torsion. But G is torsion-free and so G'=(1), i.e. G is abelian.

Now let’s solve the problem. Let H be a cyclic subgroup of finite index in G. It’s clear that H is infinite because G is torsion-free, hence infinite, and [G:H] is finite. I give the solution in several steps.

Claim 1. We may (and will) assume that H is normal in G. Also, H \cong \mathbb{Z}.

Proof. Since H is cyclic and infinite, H \cong \mathbb{Z}. Also, by Cayley’s theorem (see Problem 1 in this post!), H contains a subgroup which is both normal and finite index in G. So, by replacing H with that normal subgroup, we may assume that H is normal in G.

Claim 2. If G is abelian, then G is cyclic.

Proof. Let [G:H]=n. Then the map f: G \to H defined by f(x)=x^n is a group homomorphism and, since G is torsion-free, f is injective. Thus G is isomorphic to a subgroup of H and hence it is cyclic because H is cyclic.

Claim 3. Let C be the centralizer of H in G. Then C is cyclic.

Proof. Let Z(C) be the center of C. Clearly H \subseteq Z(C) \subseteq C \subseteq G and hence C/Z(C) is finite (because we’re assuming that latex [G:H]$ is finite). Thus by the above Corollary, C is abelian and hence cyclic, by Claim 2.

Claim 4. If G has a cyclic subgroup of index 2, then G is cyclic.

Proof. Let $K=\langle a \rangle$ be a cyclic subgroup of G with [G:K]=2. Then K is normal in G and choosing x \in G \setminus K, we have G=K \cup Kx. Now, x^2 \in K and so x^2=a^p for some integer p. We also have xax^{-1}=a^q for some integer q. Thus

a^p=x^2=xx^2x^{-1}=xa^px^{-1}=(xax^{-1})^p=a^{pq}

and so p=pq because G is torsion-free. Thus either p=0 or q=1. If p=0, then x^2=1, which gives x=1 \in K, contradiction. Hence q=1 and therefore xa=ax, i.e. G is abelian hence cyclic, by Claim 2.

Claim 5. G is cyclic.

Proof. Let C be the centralizer of H in G. Define the map f: G \to \text{Aut}(H) by f(x)(h)=xhx^{-1} (don’t forget that, by Claim 1, we’re assuming that H is normal in G). It’s clear that f is a well-defined group homomorphism and \ker f=C. So G/C is isomorphic to a subgroup of \text{Aut}(H) \cong \mathbb{Z}_2. Thus either G=C or [G:C]=2. If G=C, then G is cyclic, by Claim 3. If [G:C]=2, then again G is cyclic, by Claim 3 and Claim 4. The proof of the claim and the solution is now complete. \Box

Problem (V. Brayman). Find all functions f: \mathbb{R} \to M_3(\mathbb{R}) that satisfy the following functional equation

\displaystyle f(x)f(y)f(z)=\begin{pmatrix}x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & z\end{pmatrix}, \ \ \ \ \ \ \ \forall x,y,z \in \mathbb{R}. \ \ \ \ \ \ \ \ \ \ \ (*)

Solution. I show that the answer is all functions in the following form

\displaystyle f(x)=\begin{pmatrix}0 & 0 & ax \\ b & 0 & 0 \\ 0 & c & 0\end{pmatrix}, \ \ \ \ \ \ \ a,b,c \in \mathbb{R}, \ \ abc=1.

First see that such functions satisfy (*). Now suppose that f satisfies (*) and let

\displaystyle f(1)=\begin{pmatrix}a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33}\end{pmatrix}.

By (*), we have (f(1))^3=I, the identity matrix, and so

\displaystyle f(x)=(f(1))^3f(x)=f(1)f(1)f(1)f(x)=f(1)\begin{pmatrix}1& 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & x \end{pmatrix},

and

\displaystyle f(x)=f(x)(f(1))^3=f(x)f(1)f(1)f(1)=\begin{pmatrix}x & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}f(1).

Thus

\displaystyle f(x)=f(1)\begin{pmatrix}1& 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & x \end{pmatrix}=\begin{pmatrix}x & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}f(1),

which gives a_{11}=a_{12}=a_{23}=a_{33}=0 and

\displaystyle f(x)=\begin{pmatrix}0 & 0 & a_{13}x \\ a_{21} & a_{22} & 0 \\ a_{31} & a_{32} & 0\end{pmatrix}.

Put the above in (f(x))^3=xI, which holds by (*), to get

a_{13}a_{32}a_{21}=1, \ \ \ a_{22}=a_{31}=0,

and that completes the solution. \Box

Recall that in a ring R an element z \in R is said to be a left zero divisor if zr=0 for some non-zero element r \in R. Similarly, z \in R is a right zero divisor if rz=0 for some non-zero element r \in R. A zero divisor of R is an element which is both a left and a right zero divisor.

Proposition 1. Let R be a ring, and let Z be the set of left zero-divisors of R. If Z is finite and |Z| > 1, then R is finite and |R| \le |Z|^2.

Proof. Since |Z| > 1, there exists z \in Z such that z \ne 0. Let

I:=\{r \in R: \ \ rz=0\}.

If r \in I, then rz=0 and so r \in Z because z \ne 0. Thus I \subseteq Z and hence

|I| \le |Z|. \ \ \ \ \ \ \ \ \ \ \ (1)

Now clearly I is a subgroup of the abelian group (R,+) and so we may consider the group R/I and the map

f: R/I \longrightarrow Z, \ \ \ \ \ f(r+I)=rz, \ \ \ \forall r \in R.

Then f is well-defined and one-to-one because clearly rz \in Z for all r \in R and if r_1,r_2 \in R, then f(r_1+I)=f(r_2+I) if and only if r_1z=r_2z=0 if and only if (r_1-r_2)z=0 if and only if r_1-r_2 \in I if and only if r_1+I=r_2+I. Thus

|R/I| \le |Z|. \ \ \ \ \ \ \ \ \ \ (2)

It now follows from (1),(2) that

|R|=|I| \cdot |R/I|  \le |Z|^2. \ \Box

Example 1. The equality in the above proposition is possible, i.e. we can have |R|=|Z|^2. To see that, consider the commutative ring R:=\mathbb{Z}/p^2\mathbb{Z}, where p is a prime. It’s easy to see that the set of zero-divisors of R is \{np+p^2\mathbb{Z}: \ \ n=0,1, \cdots , p-1\}. So R has exactly p zero divisors and clearly |R|=p^2.

Example 2. An infinite field has only one zero-divisor, which is 0. That means the condition |Z| > 1 in the above proposition cannot be omitted.

Proposition 2. Let R be a finite ring with 1. Let Z, U be, respectively, the set of left zero divisors and the set of units of R. Then

i) U=R \setminus Z

ii) if R is not a division ring, then |U| \le |R|-\sqrt{|R|}.

Proof. i) Let r \in R \setminus Z. Then the map f: R \longrightarrow rR defined by f(x)=rx is one-to-one and hence R=rR because R is finite and rR \subseteq R. Thus rx=1 for some x \in R and hence r(xr-1)=0, which gives xr=1. So we have shown that R \setminus Z \subseteq U. Thus U=R \setminus Z because obviously Z \cap U=\emptyset.

ii) Since R is not a division ring, U(R) \ne R \setminus \{0\} and so Z contains a non-zero element. Therefore, by Proposition 1, |R| \le |Z|^2 and hence |Z| \ge \sqrt{|R|}. Thus, by i), |U|=|R|-|Z| \le |R|-\sqrt{|R|}. \ \Box

Remark. The results given in Proposition 1 and Proposition 2, i), still hold if we replace “left zero divisors” with “right zero-divisors”.

Example 3. Let n > 1 be an integer which is not prime. Let \phi be the Euler’s totient function. We now use Proposition 2 to prove that \phi(n) \le n-\sqrt{n}, which is a well-known result in number theory. To do that, we consider the commutative ring R:=\mathbb{Z}/n\mathbb{Z}. Let U be the set of units of R. It’s clear that |U|=\phi(n). Thus, since R is not a field (because n is not prime), we have from Proposition 2, ii), that

\phi(n)=|U| \le |R|-\sqrt{|R|}=n-\sqrt{n}.

Example 4. Using Proposition 2, we show that for any prime p and integers m \ge 1, n > 1,

\displaystyle \prod_{i=1}^n(p^{mn}-p^{(i-1)m}) \le p^{mn^2}-p^{mn^2/2}.

To prove the above, let F be the finite field of order p^m and consider the ring R:=M_n(F), the ring of n \times n matrices with entries from F. Let U be the set of units of R. It’s clear that |R|=|F|^{n^2}=p^{mn^2} and also, as we proved in Problem 3 in this post, \displaystyle |U|=\prod_{i=1}^n(|F|^n-|F|^{i-1})=\prod_{i=1}^n(p^{mn}-p^{(i-1)m}). So, since R is not a division ring (because n > 1), the result follows from Proposition 2, ii).

See the first part of this post here and the second part here.

Example 1. By Example 1 in the first part of this post, G acts on itself by left multiplication, i.e. g_1 \cdot g_2=g_1g_2, for all g_1,g_2 \in G, where g_1g_2 is the usual multiplication in G. Now if x \in G, then

\text{orb}(x)=\{g \cdot x: \ \ g \in G\}=\{gx: \ \ g \in G\}=G

and

\text{stab}(x)=\{g \in G: \ \ g \cdot x=x\}=\{g \in G: \ \ gx=x\}=(1).

So if G is finite, then \displaystyle |G|=|\text{orb}(x)|=\frac{|G|}{|\text{stab}(x)|}, which is the orbit-stabilizer theorem (see the fourth part of the theorem in the second part of this post).

Example 2. Let H be a normal subgroup of G. By Example 2 in the first part of this post, G acts on H by conjugation, i.e. g \cdot h=ghg^{-1} for all g \in G, \ h \in H. Now let h \in H and let \text{Cl}(h), \ \text{C}(h) be the conjugacy class and the centralizer of h in G, respectively. Then

\text{orb}(h)=\{g \cdot h: \ \ g \in G\}=\{ghg^{-1}: \ \ g \in G\}=\text{Cl}(h)

and

\begin{aligned}\text{stab}(h)=\{g \in G: \ \ g \cdot h=h\}=\{g \in G: \ \ ghg^{-1}=h\}=\{g \in G: \ \ gh=hg\}=C(h).\end{aligned}

Now suppose that G is finite. Then the orbit-stabilizer theorem gives \displaystyle |\text{Cl}(h)|=\frac{|G|}{|C(h)|}. It is clear that |\text{orb}(h)|=|\text{Cl}(h)|=1 if and only if |G|=|C(h)| if and only if G=C(h) if and only if h \in Z(G), the center of G. Now, by the second part of the theorem in the second part of this post, \{\text{orb}(h): \ \ \ h \in H\} is a partition of H. So if \text{orb}(h_1), \cdots , \text{orb}(h_n), where h_1, \cdots , h_k \in Z(G), are all the distinct orbits, then

\displaystyle \begin{aligned}|H|=\sum_{j=1}^n|\text{orb}(h_j)|=\sum_{j=1}^n|\text{Cl}(h_j)|=\sum_{j=1}^n\frac{|G|}{|C(h_j)|}=\sum_{j=1}^k\frac{|G|}{|C(h_j)|}+\sum_{j=k+1}^n\frac{|G|}{|C(h_j)|}\end{aligned}

\displaystyle =k+\sum_{j=k+1}^n\frac{|G|}{|C(h_j)|}=|H \cap Z(G)|+\sum_{j=k+1}^n\frac{|G|}{|C(h_j)|},

which, as you know, is known as the class equation.

Example 3. Let X be the set of all subgroups of G. Then, by Example 4, i), in the first part of this post, G acts on X by conjugation. Let H \in X and let \text{Cl}(H) be the set of all subgroups of G which are conjugate to H. Let N(H) be the normalizer of H in G. Then

\text{orb}(H)=\{g \cdot H: \ \ g \in G\}=\{gHg^{-1}: \ \ g \in G\}=\text{Cl}(H)

and

\text{stab}(H)=\{g \in G: \ \ g \cdot H=H\}=\{g \in G: \ \ gHg^{-1}=H\}=N(H).

Now suppose that G is finite. Then the orbit-stabilizer theorem gives \displaystyle |\text{Cl}(H)|=\frac{|G|}{|N(H)|}. It is clear that |\text{orb}(H)|=1 if and only if |G|=|N(H)| if and only if G=N(H) if and only if H is normal in G. Now, by the second part of the theorem in the second part of this post, \{\text{orb}(H): \ \ \ H \in X\} is a partition of X. So if \text{orb}(H_1), \cdots , \text{orb}(H_n), where \{H_1, \cdots , H_k\} is the set of normal subgroups of G, are all the distinct orbits, then

\displaystyle \begin{aligned}|X|=\sum_{j=1}^n|\text{orb}(H_j)|=\sum_{j=1}^n|\text{Cl}(H_j)|=\sum_{j=1}^n\frac{|G|}{|N(H_j)|}=\sum_{j=1}^k\frac{|G|}{|N(H_j)|}+\sum_{j=k+1}^n\frac{|G|}{|N(H_j)|}\end{aligned}

\displaystyle =k+\sum_{j=k+1}^n\frac{|G|}{|N(H_j)|}.

Example 4. Let p be a prime dividing |G| and let X:=\{P_1, \cdots , P_m\} be the set of all Sylow p-subgroups of G. By Example 4, ii), in the first part of this post, G acts on X by conjugation. Now, we know that every two Sylow p-subgroups are conjugate. Thus \text{orb}(P_1)=X and clearly \text{stab}(P_1)=N(P_1), the normalizer of P_1 in G. Thus, by the orbit-stabilizer theorem,

\displaystyle m=|X|=\text{orb}(P_1)=\frac{|G|}{|N(P_1)|}.

Example 5. Let H,K be two subgroups of G. By Example 9 in the first part of this post, the group H \times K acts on HK where the action is defined by (h_1,k_1) \cdot (hk)=h_1hkk_1^{-1} for all h,h_1 \in H, \ k,k_1 \in K. Now

\text{orb}(1)=\{(h_1,k_1) \cdot 1: \ \ h_1 \in H, k_1 \in K\}=\{h_1k_1^{-1}: \ \ h_1 \in H, k_1 \in K\}=HK

and

\begin{aligned}\text{stab}(1)=\{(h_1,k_1) \in H \times K: \ \ (h_1,k_1) \cdot 1=1\}=\{(h_1,k_1) \in H \times K: \ \ h_1k_1^{-1}=1\}\end{aligned}

=\{(h_1,k_1) \in H \times K: \ \ h_1=k_1\}=\{(h,h): \ \ h \in H \cap K\}.

So, if H,K are both finite, then |\text{orb}(1)|=|HK| and |\text{stab}(1)|=|H \cap K| and hence, by the orbit-stabilizer theorem,

\displaystyle |HK|=\frac{|H \times K|}{|H \cap K|}=\frac{|H| |K|}{|H \cap K|},

which is a well-known result.

For the first part of this post, see here.

Throughout this post, X is a set, G is a group with identity element 1, and G acts on X.

We now show that the action of G on X partitions X into subsets called orbits and if G is finite, then the number of elements of each orbit is finite and it divides the order of G. But first a couple of definitions.

Definition. Let x \in X. The orbit of x, denoted \text{orb}(x) or G \cdot x, is defined by

\text{orb}(x)=G \cdot x:=\{g \cdot x: \ \ \ g \in G\}

and the stabilizer of x, denoted \text{stab}(x) or G_x, is defined as follows

\text{stab}(x)=G_x:=\{g \in G: \ \ \ g \cdot x=x\}.

So, for any x \in X, \ \text{orb}(x) is a subset of X but \text{stab}(x) is a subset of G. In fact, \text{stab}(x) is a subgroup of G, as the third part of the following theorem shows.

Theorem. i) If x_1,x_2 \in X and \text{orb}(x_1) \cap \text{orb}(x_2) \neq \emptyset, then \text{orb}(x_1) = \text{orb}(x_2).

ii) The set \{\text{orb}(x): \ \ x \in X\} is a partition of X. In particular, if X is finite and \text{orb}(x_1), \cdots , \text{orb}(x_n) are all the distinct orbits, then \displaystyle|X|=\sum_{j=1}^n|\text{orb}(x_j)|.

iii) If x \in X, then \text{stab}(x) is a subgroup of G.

iv) (The Orbit-Stabilizer Theorem) Let x \in X and put

H:=\text{stab}(x), \ \ \ \ L=\{gH: \ \ g \in G\}.

Then the map \varphi : \text{orb}(x) \to L defined by \varphi(g \cdot x)=gH is bijective. In particular, if |L|=[G:H] is finite, then \text{orb}(x) is finite too and |\text{orb}(x)|=[G:H] and so if G is finite, then \displaystyle |\text{orb}(x)|=\frac{|G|}{|H|}.

Proof. i) Let x \in \text{orb}(x_1) \cap \text{orb}(x_2). Then x=g_1 \cdot x_1=g_2 \cdot x_2 for some g_1,g_2 \in G and so, by the remark in the first part of this post, we have x_1=(g_1^{-1}g_2) \cdot x_2 and hence

g \cdot x_1=g \cdot ((g_1^{-1}g_2) \cdot x_2)=(gg_1^{-1}g_2) \cdot x_2

for all g \in G. So \text{orb}(x_1) \subseteq \text{orb}(x_2) and, by symmetry, we also have \text{orb}(x_2) \subseteq \text{orb}(x_1), hence \text{orb}(x_1) = \text{orb}(x_2).

ii) Since 1 \cdot x=x, for all x \in X, we have x \in \text{orb}(x) for all x \in X, and the result now follows from i).

iii) First notice that since 1 \cdot x=x, we have 1 \in \text{stab}(x) and so \text{stab}(x) \neq \emptyset. Now let g \in \text{stab}(x). Then, by the remark in the first part of this post, g \cdot x=x gives x=g^{-1} \cdot x and so g^{-1} \in \text{stab}(x). Also, if g_1,g_2 \in \text{stab}(x), then

(g_1g_2) \cdot x=g_1 \cdot (g_2 \cdot x)=g_1 \cdot x=x

and so g_1g_2 \in \text{stab}(x), hence the result.

iv) For g_1,g_2 \in G, we have, by the definition of \varphi, that \varphi(g_1 \cdot x)=\varphi(g_2 \cdot x) if and only if g_1H=g_2H if and only if g_1^{-1}g_2 \in H if and only if g_1^{-1}g_2 \cdot x=x if and only if g_1 \cdot x=g_2 \cdot x. So \varphi is a well-defined injective map. Clearly \varphi is surjective and so \varphi is bijective. \Box

Remark. An important result of the orbit-stabilizer theorem, i.e. part iv) of the above theorem, is that if G is finite and x \in X, then |\text{orb}(x)| divides |G|. Also it follows from part ii) and part iv) of the above theorem that if X,G are both finite and \text{orb}(x_1), \cdots , \text{orb}(x_n) are all the distinct orbits, then 

\displaystyle |X|=\sum_{j=1}^n\frac{|G|}{|\text{stab}(x_j)|}.

If x \in X, then \text{stab}(x) is always a subgroup of G, by the third part of the above theorem. But when is \text{stab}(x) normal in G ? The following proposition answers this question.

Proposition. Let x \in X and H:=\text{stab}(x). Then

\text{stab}(g \cdot x)=gHg^{-1}

for all g \in G and thus H is normal in G if and only if H=\text{stab}(y) for all y \in \text{orb}(x).

Proof. Let g \in G. Then g_1 \in \text{stab}(g \cdot x) if and only if g_1 \cdot (g \cdot x)=g \cdot x if and only if (g_1g) \cdot x = g \cdot x if and only if g^{-1} \cdot ((g_1g) \cdot x)=x if and only if (g^{-1}g_1g) \cdot x=x if and only if g^{-1}g_1g \in H if and only if g_1 \in gHg^{-1}. So we have shown that \text{stab}(g \cdot x)=gHg^{-1} for any g \in G. Hence H is normal in G if and only if H=gHg^{-1}=\text{stab}(g \cdot x) for all g \in G. In other words, since \text{orb}(x)=\{g \cdot x: \ \ \ g \in G\}, \ H is normal in G if and only if H=\text{stab}(y) for all y \in \text{orb}(x). \ \Box

Throughout this post, X is a set and G is a group with the identity element 1.

Definition. An action of G on X is any group homomorphism f: G \to \text{Sym}(X), where \text{Sym}(X) is the group of permutations of X. Then we say that G acts on X or X is a G-set.

The following proposition says that G acting on X basically means that we can “multiply” any  element of G by any element of X and get an element of X, of course provided that a couple of conditions are satisfied (just like rings and modules!).

Notation. For the sake of simplicity, from now on, given a map G \times X \to X and g \in G, \ x \in X, we denote the image of (g,x) by g \cdot x.

Proposition. G acts on X if and only if there exists a map G \times X \to X such that

i)   1 \cdot x=x,

ii)   (g_1g_2) \cdot x=g_1 \cdot (g_2 \cdot x),

for all g_1,g_2 \in G, \ x \in X.

Proof. Suppose first that G acts on X. So there exists a group homomorphism f: G \to \text{Sym}(X). Now define a map G \times X \to X by g \cdot x=f(g)(x), for all g \in G, \ x \in X. Then for all x \in X and g_1,g_2 \in G we have 1 \cdot x=f(1)(x)=\text{id}_X(x)=x and

(g_1g_2) \cdot x=f(g_1g_2)(x)=(f(g_1)f(g_2))(x)=f(g_1)(f(g_2)(x))=g_1 \cdot (g_2 \cdot x).

Conversely, suppose that there exists a map G \times X \to X such that

1 \cdot x=x, \ \ \ (g_1g_2) \cdot x=g_1 \cdot (g_2 \cdot x),

for all g_1,g_2 \in G, \ x \in X. Now define the map f: G \to \text{Sym}(X) by f(g)(x)=g \cdot x for all g \in G and x \in X. So we need to prove two things: f is basically well-defined, i.e. f(g) \in \text{Sym}(X) for all g \in G, and f is a group homomorphism.
Let g \in G. We first show that f(g) is injective. Suppose that f(g)(x_1)=f(g)(x_2) for some x_1,x_2 \in X. Then, by the definition of f, we have g \cdot x_1=g \cdot x_2 and so

x_1=1 \cdot x_1=(g^{-1}g) \cdot x_1=g^{-1} \cdot (g \cdot x_1)=g^{-1} \cdot (g \cdot x_2)=(g^{-1}g) \cdot x_2=1 \cdot x_2=x_2.

So f(g) is injective. Now we show that f(g) is surjective. Let x_2 \in X and put x_1=g^{-1} \cdot x_2. Then

f(g)(x_1)=g \cdot x_1=g \cdot (g^{-1} \cdot x_2)=(gg^{-1}) \cdot x_2=1 \cdot x_2=x_2

and so f(g) is surjective, hence f \in \text{Sym}(X), i.e. f is well-defined.
Finally, f is a group homomorphism because for all g_1,g_2 \in G and x \in X we have

f(g_1g_2)(x)=(g_1g_2) \cdot x=g_1 \cdot (g_2 \cdot x)=f(g_1)f(g_2)(x)

and so f(g_1g_2)=f(g_1)f(g_2). \ \Box

Remark. If G \times X \to X is an action of G on X, then g_1 \cdot x_1=g_2 \cdot x_2 for some g_1,g_2 \in G, \ x_1,x_2 \in X if and only if x_1=(g^{-1}g_2) \cdot x_2. That’s because, by the two properties of an action given in the above proposition, if g_1 \cdot x_1=g_2 \cdot x_2, then

x_1=1 \cdot x_1=(g_1^{-1}g_1) \cdot x_1=g_1^{-1} \cdot (g_1 \cdot x_1)=g_1^{-1} \cdot (g_2 \cdot x_2)=(g_1^{-1}g_2) \cdot x_2

and if x_1=(g_1^{-1}g_2) \cdot x_2, then

g_1 \cdot x_1=g_1 \cdot ((g_1^{-1}g_2) \cdot x_2)=(g_1g_1^{-1}g_2) \cdot x_2=g_2 \cdot x_2. \ \Box

Now we use the above proposition to give a few examples of group actions.

Example 1. G acts on itself by (left) multiplication, i.e. the map G \times G \to G defined by g_1 \cdot g_2=g_1g_2, for all g_1,g_2 \in G, where g_1g_2 is just the usual multiplication of elements g_1,g_2 \in G, satisfies the two conditions given in the above proposition. That’s because, by the definition, 1 \cdot g=1g=g and

(g_1g_2) \cdot g=g_1g_2g=g_1 \cdot (g_2g)=g_1 \cdot (g_2 \cdot g)

for all g,g_1,g_2 \in G.

Example 2. Let H be a normal subgroup of G. Then G acts on H by conjugation, i.e. the map G \times H \to H defined by g \cdot h=ghg^{-1} for all g \in G, \ h \in H, satisfies the two conditions given in the above proposition. To see  that, first note that the map is well-defined, i.e. g \cdot h \in H, for all g \in G, \ h \in H, because H is normal. Also, we have 1 \cdot h=h and

(g_1g_2) \cdot h=(g_1g_2)h(g_1g_2)^{-1}=g_1g_2hg_2^{-1}g_1^{-1}=g_1 \cdot (g_2hg_2^{-1})=g_1 \cdot (g_2 \cdot h),

for all g_1,g_2 \in G, \ h \in H.

Example 3. Let H, K be subgroups of G and let X:=\{aH: \ \ a \in G\} be the set of left cosets of H in G. Then K clearly acts on X by left multiplication, i.e. the map K \times X \to X defined by g \cdot aH=gaH, for all g \in K, \ a \in G, satisfies the two conditions given in the above proposition.

Example 4. i) Let X be the set of all subgroups of G. Then G acts on X by conjugation, i.e. the map G \times X \to X defined by g \cdot H=gHg^{-1}, for all g \in G, \ H \in X, is an action of G on X.

ii) Let n \ge 1 be a given integer and let X be the set of all subgroups of order n in G. Then G acts on X by conjugation.

Example 5. Let H be a characteristic subgroup of G. Let \text{Aut}(G) be the group of automorphisms of G. Then the map \text{Aut}(G) \times H \to H defined by \sigma \cdot h=\sigma(h), where \sigma \in \text{Aut}(G), \ h \in H, is clearly an action of \text{Aut}(G) on H. Note that the action is well-defined since H is characteristic.

Example 6. Let n be a positive integer, k a field, X the set of n \times 1 matrices with entries from k, and G the multiplicative group of n \times n invertible matrices with entries from k. Then G acts on X by left multiplication, i.e. the map G \times X \to X defined by g \cdot x=gx for all g \in G, \ x \in X, where gx is the usual matrix multiplication, satisfies the two conditions given in the above proposition.

Example 7. Let n be a positive integer, k a field, X the set of n \times 1 matrices with entries from k, and G:=S_n, the permutation group of \{1,2, \cdots ,n\}. Then G acts on X by permutation, i.e.

\sigma \cdot [x_1, \cdots , x_n]^T=[x_{\sigma(1)}, \cdots, x_{\sigma(n)}]^T,

for all \sigma \in G, \ x_i \in k, defines an action of G on X.

Example 8. i) If G acts on X_1,X_2, then G also acts on X_1 \times X_2 if we define g \cdot (x_1,x_2)=(g \cdot x_1, g \cdot x_2) for all g \in G, \ x_1 \in X_1, \ x_2 \in X_2.

ii) If G acts on X, then G also acts on X':=\{(x,y): X \times X: \ \ x \ne y\} if we define

g \cdot (x,y)=(g \cdot x, g \cdot y)

for all g \in G, \ (x,y) \in X'. Note that the action is well-defined since g \cdot x=g \cdot y if and only if x=y (see the above Remark!).

Example 9. Let H,K be subgroups of G. The group H \times K acts on HK=\{hk: \ \ h \in H, \ k \in K\} where the action is defined by (h_1,k_1) \cdot (hk)=h_1hkk_1^{-1} for all h,h_1 \in H, \ k,k_1 \in K. To see that, we need to prove that the two conditions in the above proposition hold. Well, the first condition obviously holds because, by definition, (1,1) \cdot (hk)=hk. We also have

(h_1h_2,k_1k_2) \cdot (hk)=h_1h_2hk(k_1k_2)^{-1}=h_1h_2hkk_2^{-1}k_1^{-1}

and 

(h_1,k_1) \cdot ((h_2,k_2) \cdot (hk))=(h_1,k_1) \cdot (h_2hkk_2^{-1})=h_1h_2hkk_2^{-1}k_1^{-1}.

So (h_1h_2,k_1k_2) \cdot (hk)=(h_1,k_1) \cdot ((h_2,k_2) \cdot (hk)) for all h,h_1,h_2 \in H, \ k,k_1,k_2 \in K and that proves of the second condition.

Problem (Marian Andronache). For any finite group G and positive integer k, let

P_k(G):=\{x^k: \ \ x \in G\}.

Find all finite groups G of order \ge 4 that satisfy the following condition

|P_1(G)| > |P_2(G)| > \cdots > |P_{\lfloor |G|/2\rfloor}(G)|,

where \lfloor |G|/2 \rfloor is the integer part of |G|/2.

Solution. There are only three finite groups that satisfy the condition: \mathbb{Z}_4, \ \mathbb{Z}_2 \times \mathbb{Z}_2, and \mathbb{Z}_6. It’s easy to see that the groups \mathbb{Z}_4, \ \mathbb{Z}_2 \times \mathbb{Z}_2 satisfy the condition. So we may assume that |G| \ge 5. But before we continue with the solution, we need the following simple result.

Claim. Let G be a group of order n. If k \ge 1 is an integer and \gcd(k,n)=1, then P_k(G)=P_1(G).

Proof. Since \gcd(k,n)=1, there exist integers p,q such that np+kq=1 and so for any x \in G, we have

x=x^{np+kq}=x^{kq}=(x^q)^k \in P_k(G)

and thus G=P_1(G) \subseteq P_k(G) \subseteq G, which completes the proof.

Back to the problem, let G be a group of order n \ge 5 that satisfies the condition in the problem. If n is odd, then \gcd(2,n)=1 and hence P_1(G)=P_2(G), by the above claim, and that contradicts the condition |P_1(G)| > |P_2(G)|. So n is an even number \ge 6. Let n=2^km, where m is odd and k \ge 1. If k \ge 2, then \gcd(2^{k-1}m-1,n)=1, and hence, by the above claim, P_{2^{k-1}m-1}(G)=P_1(G), contradiction again. So k=1 and n=2m. If m > 3, then \gcd(m-2,n)=1 and so, by the above claim, P_{m-2}(G)=P_1(G), contradiction. So m=3 and hence n=6. There are only two groups of order 6 : \mathbb{Z}_6 and S_3. It’s easy to see that \mathbb{Z}_6 satisfies the condition and S_3 does not satisfy the condition. \Box