Definition. For an integer n \ge 1, the n\times n Hilbert matrix is defined by H_n=[a_{ij}], where

\displaystyle a_{ij}=\frac{1}{i+j-1}, \ \  1 \le i,j \le n.

It is known that H_n is invertible and if H_n^{-1}=[b_{ij}], then \displaystyle \sum_{i,j}b_{ij}=n^2. We are going to use these two properties of Hilbert matrices to solve the following calculus problem.
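Before the problem, here is a quick numerical sanity check of these two properties for a small n (an illustration only, not a proof; numpy is assumed). Hilbert matrices are notoriously ill-conditioned, so keep n small in floating point:

```python
import numpy as np

n = 5
# The n x n Hilbert matrix: entries 1/(i+j-1) for 1 <= i,j <= n
H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)]
              for i in range(1, n + 1)])

# H_n is invertible (its determinant is nonzero, though tiny)
assert np.linalg.det(H) != 0

# The entries of H_n^{-1} sum to n^2
B = np.linalg.inv(H)
print(round(B.sum()))  # -> 25
```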

Problem. Let n \ge 1 be an integer and let f : [0,1] \longrightarrow \mathbb{R} be a continuous function. Suppose that \displaystyle \int_0^1 x^kf(x) \ dx = 1 for all  0 \le k \le n-1. Show that \displaystyle \int_0^1 (f(x))^2 dx \ge n^2.

Solution. Since H_n, the n\times n Hilbert matrix, is invertible, there exist real numbers p_0, p_1, \cdots , p_{n-1} such that

\displaystyle \sum_{i=1}^n\frac{p_{i-1}}{i+j-1}=1, \ \ \ 1 \le j \le n.

So the polynomial \displaystyle p(x)=\sum_{k=0}^{n-1}p_kx^k satisfies the conditions

\displaystyle \int_0^1x^k p(x) \ dx =1, \ \ \ 0 \le k \le n-1.

Clearly \displaystyle \sum_{k=0}^{n-1}p_k is the sum of all the entries of H_n^{-1} and so \displaystyle \sum_{k=0}^{n-1}p_k=n^2. Now let f be a real-valued continuous function on [0,1] such that

\displaystyle \int_0^1x^kf(x) \ dx  = 1, \ \ \ 0 \le k \le n-1.

Let p(x) be the above polynomial. Then since

\displaystyle (f(x))^2-2f(x)p(x)+(p(x))^2 =(f(x)-p(x))^2 \ge 0,

integrating gives

\displaystyle \begin{aligned} \int_0^1 (f(x))^2dx &\ge 2\int_0^1f(x)p(x) \ dx -\int_0^1(p(x))^2dx \\ &=2\sum_{k=0}^{n-1}p_k \int_0^1 x^kf(x) \ dx - \sum_{k=0}^{n-1}p_k\int_0^1x^kp(x) \ dx \\ &= 2\sum_{k=0}^{n-1}p_k-\sum_{k=0}^{n-1}p_k=\sum_{k=0}^{n-1}p_k =n^2. \ \Box \end{aligned}
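The solution can also be seen "in action" numerically: the coefficient vector of p(x) is the solution of H_n p = (1, \ldots, 1)^T, and the equality case f = p attains the bound. A small sketch (n = 4 is an arbitrary choice):

```python
import numpy as np

n = 4
H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)]
              for i in range(1, n + 1)])

# Coefficients p_0, ..., p_{n-1} of p(x): the solution of H_n p = (1, ..., 1)^T
p = np.linalg.solve(H, np.ones(n))

# Moment conditions: int_0^1 x^k p(x) dx = sum_j p_j / (k + j + 1) = 1
for k in range(n):
    assert abs(sum(p[j] / (k + j + 1) for j in range(n)) - 1) < 1e-8

# The bound is attained at f = p: int_0^1 p(x)^2 dx = sum_k p_k = n^2
print(round(p.sum()))  # -> 16
```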


Problem. Let A \in M_3(\mathbb{R}) be orthogonal and suppose that \det(A)=-1. Find \det(A-I).

Solution. Since A is orthogonal, its eigenvalues have absolute value 1 and A is diagonalizable over \mathbb{C}. Let D be a diagonal matrix such that PDP^{-1}=A for some invertible matrix P \in M_3(\mathbb{C}). Then

\det(D)=\det(A)=-1, \ \ \det(D-I)=\det(A-I).

We claim that the eigenvalues of A are \{-1,e^{i\theta},e^{-i\theta}\} for some \theta. Well, the characteristic polynomial of A has degree three and so it has either three real roots or only one real root. Also, the complex conjugate of a root of a polynomial with real coefficients is also a root. So, since \det(A)=-1, the eigenvalues of A are either all -1, which is the case \theta=\pi, or two of them are 1 and one is -1, which is the case \theta = 0, or one is -1 and the other two are in the form \{e^{i\theta}, e^{-i\theta}\} for some \theta. So

\displaystyle \begin{aligned} \det(D-I)=\det(A-I)=(-2)(e^{i\theta}-1)(e^{-i\theta}-1)=-4(1-\cos \theta). \ \Box \end{aligned}

Note that given \theta, the matrix

\displaystyle A=\begin{pmatrix} -1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{pmatrix}

is orthogonal, \det(A)=-1 and \det(A-I)=-4(1-\cos \theta).
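A quick numerical check of this family of matrices (\theta = 1 is an arbitrary choice):

```python
import numpy as np

theta = 1.0  # arbitrary angle
A = np.array([[-1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta), np.cos(theta)]])

# A is orthogonal: A^T A = I
assert np.allclose(A.T @ A, np.eye(3))

# det(A) = -1 and det(A - I) = -4(1 - cos(theta))
assert np.isclose(np.linalg.det(A), -1.0)
assert np.isclose(np.linalg.det(A - np.eye(3)), -4 * (1 - np.cos(theta)))
```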


Throughout this post, R is a ring with 1.

Theorem (Jacobson). If x^n=x for some integer n > 1 and all x \in R, then R is commutative.

In fact n, in Jacobson’s theorem, doesn’t have to be fixed and could depend on x, i.e. Jacobson’s theorem states that if for every x \in R there exists an integer n > 1 such that x^n=x, then R is commutative. But we are not going to discuss that here.
In this post, we’re going to prove Jacobson’s theorem. Note that we have already proved the theorem for n=3, 4 (see here and here); there we didn’t need R to have 1, nor much ring theory. But to prove the theorem for any n > 1, we need a little more ring theory.

Lemma. If Jacobson’s theorem holds for division rings, then it holds for all rings with 1.

Proof. Let R be a ring with 1 such that x^n=x for some integer n > 1 and all x \in R. Then clearly R is reduced, i.e. R has no non-zero nilpotent element. Let \{P_i: \ i \in I\} be the set of minimal prime ideals of R.
By the structure theorem for reduced rings, R is a subring of the ring \prod_{i\in I}D_i, where D_i=R/P_i is a domain. Clearly x^n=x for all x \in D_i and all i \in I. But then, since each D_i is a domain, we get x=0 or x^{n-1}=1, i.e. each D_i is a division ring. Therefore, by our hypothesis, each D_i is commutative and hence R, which is a subring of \prod_{i\in I}D_i, is commutative too. \Box

Example. Show that if x^5=x for all x \in R, then R is commutative.

Solution. By the lemma, we may assume that R is a division ring.
Then 0=x^5-x=x(x-1)(x+1)(x^2+1) gives x=0,1,-1 or x^2=-1. Suppose that R is not commutative and choose a non-central element x \in R. Then x+1,x-1 are also non-central and so x^2=(x+1)^2=(x-1)^2=-1 which gives 1=0, contradiction! \Box

Remark 1. Let D be a division ring with the center F. If there exist an integer n \ge 1 and a_i \in F such that x^n+a_{n-1}x^{n-1}+ \cdots + a_1x+a_0=0 for all x \in D, then F is a finite field. This is obvious because the polynomial x^n+a_{n-1}x^{n-1}+ \cdots + a_1x+a_0 \in F[x] has only a finite number of roots in F and we have assumed that every element of F is a root of that polynomial.

Remark 2. Let D be a domain and suppose that D is algebraic over some central subfield F. Then D is a division ring and if 0 \ne d \in D, then F[d] is a finite dimensional division F-algebra.

Proof. Let 0 \ne d \in D. So d^m +a_{m-1}d^{m-1}+ \cdots + a_1d+a_0=0 for some integer m \ge 1 and a_i \in F. We may assume that a_0 \ne 0. Then d(d^{m-1} + a_{m-1}d^{m-2}+ \cdots + a_1)(-a_0^{-1})=1 and so d is invertible, i.e. D is a division ring.
Since F[d] is a subring of D, it is a domain and algebraic over F and so it is a division ring by what we just proved. Also, since d^m \in \sum_{i=0}^{m-1} Fd^i for some integer m \ge 1, we have F[d]=\sum_{i=0}^{m-1} Fd^i and so \dim_F F[d] \le m. \ \Box

Proof of the Theorem. By the above lemma, we may assume that R is a division ring.
Let F be the center of R. By Remark 1, F is finite. Since R is a division ring, it is left primitive. Since every element of R is a root of the non-zero polynomial x^n-x \in F[x], \ R is a polynomial identity ring.
Hence, by the Kaplansky–Amitsur theorem, \dim_F R < \infty and so R is finite because F is finite. Thus, by Wedderburn’s little theorem, R is a field. \Box

The following problem is from the American Mathematical Monthly. The problem only asks the reader to calculate \displaystyle \lim_{n\to\infty} A_n^n; it doesn’t give the answer, so I added the answer myself.

Problem (Furdui, Romania). Let a,b,c,d be real numbers with bc > 0. For an integer n \ge 1, let

A_n:=\begin{bmatrix} \cos \left(\frac{a}{n}\right) & \sin \left(\frac{b}{n}\right) \\ \\ \sin \left(\frac{c}{n}\right) & \cos \left(\frac{d}{n}\right) \end{bmatrix}.

Let \text{sgn}(x) be the sign function. Show that

\displaystyle \lim_{n\to\infty} A_n^n = \begin{bmatrix} \cosh(\sqrt{bc}) & \text{sgn}(b)\sqrt{\frac{b}{c}} \sinh(\sqrt{bc}) \\ \\ \text{sgn}(b)\sqrt{\frac{c}{b}} \sinh(\sqrt{bc}) & \cosh(\sqrt{bc}) \end{bmatrix}.

Solution. The characteristic polynomial of A_n is

\displaystyle p_n(x)=x^2-\left(\cos \left(\frac{a}{n}\right)+\cos\left(\frac{d}{n}\right) \right)x+\cos\left(\frac{a}{n}\right)\cos\left(\frac{d}{n}\right)-\sin \left(\frac{b}{n} \right)\sin \left(\frac{c}{n} \right).

The roots of p_n(x) are

\displaystyle r_n=\frac{\cos \left(\frac{a}{n}\right)+\cos \left(\frac{d}{n}\right) + \sqrt{\Delta_n}}{2}, \ \ s_n=\frac{\cos \left(\frac{a}{n}\right) + \cos \left(\frac{d}{n}\right) - \sqrt{\Delta_n}}{2},


where

\displaystyle \Delta_n:=\left(\cos \left(\frac{a}{n}\right)-\cos \left(\frac{d}{n}\right)\right)^2+4 \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right).

If n is sufficiently large, which is the case we are interested in, then, since b,c are either both positive or both negative (because bc > 0), \displaystyle \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right) > 0 and so \Delta_n > 0. So, in this case, r_n,s_n are distinct real numbers and hence A_n is diagonalizable in \mathbb{R}.
Now v \in \mathbb{R}^2 is an eigenvector corresponding to r_n if and only if (A_n-r_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. Similarly, v \in \mathbb{R}^2 is an eigenvector corresponding to s_n if and only if (A_n-s_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. So if

\displaystyle P:=\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}, \ D:=\begin{bmatrix} r_n & 0 \\ 0 & s_n \end{bmatrix},

then A_n=PDP^{-1} and hence

\displaystyle A_n^n=PD^nP^{-1}=\frac{\sin \left(\frac{b}{n}\right)}{s_n-r_n}\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix} \begin{bmatrix}r_n^n & 0 \\ 0 & s_n^n \end{bmatrix}\begin{bmatrix} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & -1 \\  \frac{\cos \left(\frac{a}{n}\right)-r_n}{\sin \left(\frac{b}{n}\right)} & 1 \end{bmatrix}.

The rest of the solution is just calculus; if you have trouble finding the limits, see this post in my Calculus blog for details. We have

\displaystyle \begin{aligned} &\lim_{n\to\infty} \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=-\lim_{n\to\infty} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=\text{sgn}(b)\sqrt{\frac{c}{b}}, \\ &\lim_{n\to\infty} \frac{\sin \left(\frac{b}{n}\right)}{r_n-s_n}=\frac{\text{sgn}(b)}{2}\sqrt{\frac{b}{c}}, \ \ \lim_{n\to\infty}r_n^n=e^{\sqrt{bc}}, \ \ \lim_{n\to\infty}s_n^n =e^{-\sqrt{bc}}. \ \Box \end{aligned}
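The claimed limit is easy to test numerically by raising A_n to a large power (the values of a, b, c, d below are arbitrary, subject to bc > 0):

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 3.0, -1.0   # arbitrary reals with bc > 0
n = 10**6                          # large n to approximate the limit

An = np.array([[np.cos(a / n), np.sin(b / n)],
               [np.sin(c / n), np.cos(d / n)]])
approx = np.linalg.matrix_power(An, n)  # A_n^n via fast binary powering

s = np.sqrt(b * c)
sgn = np.sign(b)
# The closed form from the problem statement
limit = np.array([[np.cosh(s), sgn * np.sqrt(b / c) * np.sinh(s)],
                  [sgn * np.sqrt(c / b) * np.sinh(s), np.cosh(s)]])

assert np.allclose(approx, limit, atol=1e-3)
```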

For a \in \mathbb{C} let \overline{a} denote the complex conjugate of a. Recall that a matrix [a_{ij}] \in M_n(\mathbb{C}) is called Hermitian if a_{ij}=\overline{a_{ji}}, for all 1 \leq i,j \leq n. It is known that if A is Hermitian, then A is diagonalizable and every eigenvalue of A is a real number. In this post, we give a lower bound for the rank of a Hermitian matrix. To find the lower bound, we first need an easy inequality.

Problem 1. Prove that if a_1, \ldots , a_m \in \mathbb{R}, then (a_1 + \ldots + a_m)^2 \leq m(a_1^2 + \ldots + a_m^2).

Solution.  We have a^2+b^2 \geq 2ab for all a,b \in \mathbb{R} and so

(m-1)\sum_{i=1}^m a_i^2=\sum_{1 \leq i < j \leq m}(a_i^2+a_j^2) \geq \sum_{1 \leq i < j \leq m}2a_ia_j.

Adding the term \sum_{i=1}^m a_i^2 to both sides of the above inequality will finish the job. \Box

Problem 2. Prove that if 0 \neq A \in M_n(\mathbb{C}) is Hermitian, then {\rm{rank}}(A) \geq ({\rm{tr}}(A))^2/{\rm{tr}}(A^2).

Solution. Let \lambda_1, \ldots , \lambda_m be the nonzero eigenvalues of A. Since A is diagonalizable, we have {\rm{rank}}(A)=m. We also have {\rm{tr}}(A)=\lambda_1 + \ldots + \lambda_m and {\rm{tr}}(A^2)=\lambda_1^2 + \ldots + \lambda_m^2. Thus, by Problem 1,

({\rm{tr}}(A))^2 \leq {\rm{rank}}(A) {\rm{tr}}(A^2)

and the result follows. \Box
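A numerical illustration of the bound (the random matrix below is an arbitrary choice; B + B^* is always Hermitian):

```python
import numpy as np

rng = np.random.default_rng(0)

# A random Hermitian matrix: B + B^* is Hermitian for any complex B
B = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
A = B + B.conj().T

r = np.linalg.matrix_rank(A)
t = np.trace(A).real         # trace of a Hermitian matrix is real
t2 = np.trace(A @ A).real    # so is the trace of A^2, and it is > 0 for A != 0

# rank(A) >= tr(A)^2 / tr(A^2)
assert r >= t**2 / t2
print(r, t**2 / t2)
```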

Throughout this post, U(R) and J(R) are the group of units and the Jacobson radical of a ring R. Assuming that U(R) is finite and |U(R)| is odd, we will show that |U(R)|=\prod_{i=1}^k (2^{n_i}-1) for some positive integers k, n_1, \ldots , n_k. Let’s start with a nice little problem.

Problem 1. Prove that if U(R) is finite, then J(R) is finite too and |U(R)|=|J(R)| \, |U(R/J(R))|.

Solution. Let J:=J(R) and define the map f: U(R) \to U(R/J) by f(x) = x + J, \ x \in U(R). This map is clearly a well-defined group homomorphism. To prove that f is surjective, suppose that x + J \in U(R/J). Then 1-xy \in J for some y \in R, and hence xy = 1-(1-xy) \in U(R), implying that x \in U(R). So f is surjective and thus U(R)/\ker f \cong U(R/J).
Now, \ker f = \{1-x : \ \ x \in J \} is a subgroup of U(R) and |\ker f|=|J|. Thus J is finite and |U(R)|=|\ker f||U(R/J)|=|J||U(R/J)|. \Box

Problem 2. Let p be a prime number and suppose that U(R) is finite and pR=(0). Prove that if p \nmid |U(R)|, then J(R)=(0).

Solution. Suppose that J(R) \neq (0) and 0 \neq x \in J(R). Then, considering J(R) as an additive group, H:=\{ix: \ 0 \leq i \leq p-1 \} is a subgroup of J(R) and so p=|H| \mid |J(R)|. But then p \mid |U(R)|, by Problem 1, and that’s a contradiction! \Box

There is also a direct, and maybe easier, way to solve Problem 2: suppose that there exists 0 \neq x \in J(R). On U(R), define the relation \sim as follows: y \sim z if and only if y-z = nx for some integer n. Then \sim is an equivalence relation and the equivalence class of y \in U(R) is [y]=\{y+ix: \ 0 \leq i \leq p-1 \}. Note that [y] \subseteq U(R) because x \in J(R) and y \in U(R). So if k is the number of equivalence classes, then |U(R)|=k|[y]|=kp, contradiction!

Problem 3. Prove that if F is a finite field, then |U(M_n(F))|=\prod_{i=1}^n(|F|^n - |F|^{i-1}). In particular, if |U(M_n(F))| is odd,  then n=1 and |F| is a power of 2.

Solution. The group U(M_n(F))= \text{GL}(n,F) is isomorphic to the group of invertible linear maps F^n \to F^n. Also, there is a one-to-one correspondence between the set of invertible linear maps F^n \to F^n and the set of (ordered) bases of F^n. So |U(M_n(F))| is equal to the number of bases of F^n. Now, to construct a basis for F^n, we choose any non-zero element v_1 \in F^n; there are |F|^n-1 ways to choose v_1. Next, to choose v_2, we need v_1,v_2 to be linearly independent, i.e. v_2 \notin Fv_1 \cong F; so there are |F|^n-|F| possible choices for v_2. Again, we need to choose v_3 so that v_1,v_2,v_3 are linearly independent, i.e. v_3 \notin Fv_1+Fv_2 \cong F^2; so there are |F|^n-|F|^2 possible choices for v_3. Continuing this process gives the formula in the problem. \Box
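The formula is easy to confirm by brute force for small cases, e.g. F=\mathbb{F}_2 and n=2, where it predicts (2^2-1)(2^2-2)=6 invertible matrices:

```python
from itertools import product

q, n = 2, 2  # F = GF(2), 2x2 matrices

# Count invertible 2x2 matrices over GF(2) by checking det != 0 mod 2
count = sum(1 for a, b, c, d in product(range(q), repeat=4)
            if (a * d - b * c) % q != 0)

# Compare with prod_{i=1}^{n} (q^n - q^{i-1}) = (4 - 1)(4 - 2) = 6
formula = 1
for i in range(1, n + 1):
    formula *= q**n - q**(i - 1)

assert count == formula == 6
```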

Problem 4. Suppose that U(R) is finite and |U(R)| is odd. Prove that |U(R)|=\prod_{i=1}^k (2^{n_i}-1) for some positive integers k, n_1, \ldots , n_k.

Solution. If 1 \neq -1 in R, then \{1,-1\} would be a subgroup of order 2 in U(R) and this is not possible because |U(R)| is odd. So 1=-1. Hence 2R=(0) and \mathbb{Z}/2\mathbb{Z} \cong \{0,1\} \subseteq R. Let S be the ring generated by \{0,1\} and U(R). Obviously S is finite, 2S=(0) and U(S)=U(R). We also have J(S)=(0), by Problem 2. So S is a finite semisimple ring and hence S \cong \prod_{i=1}^k M_{m_i}(F_i) for some positive integers k, m_1, \ldots , m_k and some finite fields F_1, \ldots , F_k, by the Artin-Wedderburn theorem and Wedderburn’s little theorem. Therefore |U(R)|=|U(S)|=\prod_{i=1}^k |U(M_{m_i}(F_i))|. The result now follows from the second part of Problem 3. \Box

See part (1) here! Again, we will assume that R is a PID and x is a variable over R. In this post, we will take a look at the maximal ideals of R[x]. Let I be a maximal ideal of R[x]. By Problem 2, if I \cap R \neq (0), then I=\langle p, f(x) \rangle for some prime p \in R and some f(x) \in R[x] which is irreducible modulo p. If I \cap R =(0), then I=\langle f(x) \rangle for some irreducible element f(x) \in R[x]. Before investigating the maximal ideals of R[x] in more detail, let’s give an example of a PID R which is not a field but such that R[x] has a principal maximal ideal I. We will see in Problem 3 that this situation can happen only when R has finitely many prime elements.

Example 1. Let F be a field and put R=F[[t]], the ring of formal power series in the variable t over F. Let x be a variable over R. Then I:=\langle xt - 1 \rangle is a maximal ideal of R[x].

Proof. Note that R[x]/I \cong F((t)), the field of formal Laurent series over F, which is the field of fractions of R. Thus R[x]/I is a field and so I is a maximal ideal of R[x]. \ \Box

Problem 3. Prove that if R has infinitely many prime elements, then an ideal I of R[x] is maximal if and only if I=\langle p, f(x) \rangle for some prime p \in R and some f(x) \in R[x] which is irreducible modulo p.

Solution. We have already proved one direction of the problem in Problem 1. For the other direction, let I be a maximal ideal of R[x]. By the first case in the solution of Problem 2 and the second part of Problem 1, we only need to show that I \cap R \neq (0). So suppose, to the contrary, that I \cap R=(0). Then, by the second case in the solution of Problem 2, I=\langle f(x) \rangle for some f(x) \in R[x]. We also know that R[x]/I is a field because I is a maximal ideal of R[x]. Since R has infinitely many prime elements, we can choose a prime p \in R which does not divide the leading coefficient of f(x). Now, consider the natural ring homomorphism \psi : R[x] \to R[x]/I. Since I \cap R=(0), we have \psi(p) \neq 0 and so \psi(p) is invertible in R[x]/I. Therefore pg(x)-1 \in \ker \psi = I for some g(x) \in R[x]. Hence pg(x)-1=h(x)f(x) for some h(x) \in R[x]. If p \mid h(x), then p \mid 1, which is nonsense. So h(x)=pu(x) + v(x) for some u(x),v(x) \in R[x], where p does not divide the leading coefficient of v(x). Now pg(x) - 1 =h(x)f(x) gives p(g(x)-u(x)f(x)) - 1 =v(x)f(x), and so the leading coefficient of v(x)f(x) is divisible by p. Hence the leading coefficient of f(x) must be divisible by p, contradiction! \Box

Example 2. The ring of integers \mathbb{Z} is a PID and it has infinitely many prime elements. So, by Problem 3, an ideal I of \mathbb{Z}[x] is maximal if and only if I=\langle p, f(x) \rangle for some prime p \in \mathbb{Z} and some f(x) which is irreducible modulo p. By Problem 2, the set of prime ideals of \mathbb{Z}[x] is the union of the following sets:
1) all maximal ideals
2) all ideals of the form \langle p \rangle, where p \in \mathbb{Z} is a prime
3) all ideals of the form \langle f(x) \rangle, where f(x) is irreducible in \mathbb{Z}[x]
4) the zero ideal \langle 0 \rangle, which is prime because \mathbb{Z}[x] is a domain.
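As a concrete instance of the maximal ideals in Example 2: x^2+x+1 has no root mod 2, hence is irreducible mod 2, so I=\langle 2, x^2+x+1 \rangle is maximal and \mathbb{Z}[x]/I is the field with 4 elements. A brute-force verification (elements represented as a+bx with a,b \in \{0,1\} and the reduction rule x^2 = x+1):

```python
from itertools import product

# Elements of Z[x]/<2, x^2+x+1> written as pairs (a, b) meaning a + b*x,
# with coefficients mod 2 and the reduction rule x^2 = x + 1.

def mul(u, v):
    a, b = u
    c, d = v
    # (a + bx)(c + dx) = ac + (ad + bc)x + bd*x^2 = (ac + bd) + (ad + bc + bd)x
    return ((a * c + b * d) % 2, (a * d + b * c + b * d) % 2)

elements = list(product(range(2), repeat=2))
assert len(elements) == 4

# x^2 + x + 1 is irreducible mod 2: it has no root in GF(2)
assert all((t * t + t + 1) % 2 != 0 for t in range(2))

# Every nonzero element has a multiplicative inverse, so the quotient is a field
one = (1, 0)
for u in elements:
    if u != (0, 0):
        assert any(mul(u, v) == one for v in elements)
```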