
Problem. Let A \in M_3(\mathbb{R}) be orthogonal and suppose that \det(A)=-1. Find \det(A-I).

Solution. Since A is orthogonal, its eigenvalues all have absolute value 1 and A can be diagonalized over \mathbb{C}. Let D be a diagonal matrix such that PDP^{-1}=A for some invertible matrix P \in M_3(\mathbb{C}). Then

\det(D)=\det(A)=-1, \ \ \det(D-I)=\det(A-I).

We claim that the eigenvalues of A are \{-1,e^{i\theta},e^{-i\theta}\} for some \theta. Indeed, the characteristic polynomial of A has degree three and so it has either three real roots or exactly one real root, because the complex conjugate of a root of a polynomial with real coefficients is also a root. Moreover, any real eigenvalue of A is \pm 1, since the eigenvalues have absolute value 1. So, since \det(A), the product of the eigenvalues, is -1, either all three eigenvalues are -1, which is the case \theta=\pi, or two of them are 1 and one is -1, which is the case \theta = 0, or one is -1 and the other two are of the form \{e^{i\theta}, e^{-i\theta}\} for some \theta. Hence

\displaystyle \begin{aligned} \det(D-I)=\det(A-I)=(-2)(e^{i\theta}-1)(e^{-i\theta}-1)=-4(1-\cos \theta). \ \Box \end{aligned}

Note that given \theta, the matrix

\displaystyle A=\begin{pmatrix} -1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{pmatrix}

is orthogonal, \det(A)=-1 and \det(A-I)=-4(1-\cos \theta).
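As a quick numerical sanity check of the formula (a sketch using numpy; the angle below is an arbitrary test value):

```python
import numpy as np

theta = 0.7  # an arbitrary test angle
A = np.array([[-1.0, 0.0, 0.0],
              [0.0, np.cos(theta), -np.sin(theta)],
              [0.0, np.sin(theta),  np.cos(theta)]])

assert np.allclose(A @ A.T, np.eye(3))      # A is orthogonal
assert np.isclose(np.linalg.det(A), -1.0)   # det(A) = -1
lhs = np.linalg.det(A - np.eye(3))
rhs = -4.0 * (1.0 - np.cos(theta))
assert np.isclose(lhs, rhs)                 # det(A - I) = -4(1 - cos(theta))
```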



The following problem is from the American Mathematical Monthly. The problem only asks the reader to calculate \displaystyle \lim_{n\to\infty} A_n^n, it doesn’t give the answer; I added the answer myself.

Problem (Furdui, Romania). Let a,b,c,d be real numbers with bc > 0. For an integer n \ge 1, let

A_n:=\begin{bmatrix} \cos \left(\frac{a}{n}\right) & \sin \left(\frac{b}{n}\right) \\ \\ \sin \left(\frac{c}{n}\right) & \cos \left(\frac{d}{n}\right) \end{bmatrix}.

Let \text{sgn}(x) be the sign function. Show that

\displaystyle \lim_{n\to\infty} A_n^n = \begin{bmatrix} \cosh(\sqrt{bc}) & \text{sgn}(b)\sqrt{\frac{b}{c}} \sinh(\sqrt{bc}) \\ \\ \text{sgn}(b)\sqrt{\frac{c}{b}} \sinh(\sqrt{bc}) & \cosh(\sqrt{bc}) \end{bmatrix}.

Solution. The characteristic polynomial of A_n is

\displaystyle p_n(x)=x^2-\left(\cos \left(\frac{a}{n}\right)+\cos\left(\frac{d}{n}\right) \right)x+\cos\left(\frac{a}{n}\right)\cos\left(\frac{d}{n}\right)-\sin \left(\frac{b}{n} \right)\sin \left(\frac{c}{n} \right).

The roots of p_n(x) are

\displaystyle r_n=\frac{\cos \left(\frac{a}{n}\right)+\cos \left(\frac{d}{n}\right) + \sqrt{\Delta_n}}{2}, \ \ s_n=\frac{\cos \left(\frac{a}{n}\right) + \cos \left(\frac{d}{n}\right) - \sqrt{\Delta_n}}{2},


where \displaystyle \Delta_n:=\left(\cos \left(\frac{a}{n}\right)-\cos \left(\frac{d}{n}\right)\right)^2+4 \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right).

If n is sufficiently large, which is the case we are interested in, then, since b,c are either both positive or both negative (because bc > 0), \displaystyle \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right) > 0 and so \Delta_n > 0. So, in this case, r_n,s_n are distinct real numbers and hence A_n is diagonalizable in \mathbb{R}.
Now v \in \mathbb{R}^2 is an eigenvector corresponding to r_n if and only if (A_n-r_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. Similarly, v \in \mathbb{R}^2 is an eigenvector corresponding to s_n if and only if (A_n-s_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. So if

\displaystyle P:=\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}, \ D:=\begin{bmatrix} r_n & 0 \\ 0 & s_n \end{bmatrix},

then A_n=PDP^{-1} and hence

\displaystyle A_n^n=PD^nP^{-1}=\frac{\sin \left(\frac{b}{n}\right)}{s_n-r_n}\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix} \begin{bmatrix}r_n^n & 0 \\ 0 & s_n^n \end{bmatrix}\begin{bmatrix} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & -1 \\  \frac{\cos \left(\frac{a}{n}\right)-r_n}{\sin \left(\frac{b}{n}\right)} & 1 \end{bmatrix}.

The rest of the solution is just Calculus and if you have trouble finding limits, see this post in my Calculus blog for details. We have

\displaystyle  \lim_{n\to\infty} \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=-\lim_{n\to\infty} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=\text{sgn}(b)\sqrt{\frac{c}{b}}, \\ \lim_{n\to\infty} \frac{\sin \left(\frac{b}{n}\right)}{r_n-s_n}=\frac{\text{sgn}(b)}{2}\sqrt{\frac{b}{c}}, \lim_{n\to\infty}r_n^n=e^{\sqrt{bc}}, \ \ \lim_{n\to\infty}s_n^n =e^{-\sqrt{bc}}. \ \Box
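The limit can also be checked numerically (a sketch with numpy; the values of a,b,c,d below are arbitrary test choices with bc > 0, and n is taken large):

```python
import numpy as np

a, b, c, d = 1.0, 2.0, 0.5, -1.0      # arbitrary test values with bc > 0
n = 10**6
An = np.array([[np.cos(a/n), np.sin(b/n)],
               [np.sin(c/n), np.cos(d/n)]])
power = np.linalg.matrix_power(An, n)  # A_n^n

s = np.sqrt(b*c)
expected = np.array([[np.cosh(s), np.sign(b)*np.sqrt(b/c)*np.sinh(s)],
                     [np.sign(b)*np.sqrt(c/b)*np.sinh(s), np.cosh(s)]])
assert np.allclose(power, expected, atol=1e-4)   # A_n^n is close to the limit
```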

For a \in \mathbb{C} let \overline{a} denote the complex conjugate of a. Recall that a matrix [a_{ij}] \in M_n(\mathbb{C}) is called Hermitian if a_{ij}=\overline{a_{ji}}, for all 1 \leq i,j \leq n. It is known that if A is Hermitian, then A is diagonalizable and every eigenvalue of A is a real number. In this post, we will give a lower bound for the rank of a Hermitian matrix. To find the lower bound, we first need an easy inequality.

Problem 1. Prove that if a_1, \ldots , a_m \in \mathbb{R}, then (a_1 + \ldots + a_m)^2 \leq m(a_1^2 + \ldots + a_m^2).

Solution.  We have a^2+b^2 \geq 2ab for all a,b \in \mathbb{R} and so

(m-1)\sum_{i=1}^m a_i^2=\sum_{1 \leq i < j \leq m}(a_i^2+a_j^2) \geq \sum_{1 \leq i < j \leq m}2a_ia_j.

Adding the term \sum_{i=1}^m a_i^2 to both sides of the above inequality will finish the job. \Box
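A random spot-check of the inequality (a numerical sketch, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    a = rng.normal(size=rng.integers(1, 10))   # random reals, random length m
    # (a_1 + ... + a_m)^2 <= m (a_1^2 + ... + a_m^2), up to floating-point slack
    assert a.sum()**2 <= len(a) * (a**2).sum() + 1e-12
```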

Problem 2. Prove that if 0 \neq A \in M_n(\mathbb{C}) is Hermitian, then {\rm{rank}}(A) \geq ({\rm{tr}}(A))^2/{\rm{tr}}(A^2).

Solution. Let \lambda_1, \ldots , \lambda_m be the nonzero eigenvalues of A. Since A is diagonalizable, we have {\rm{rank}}(A)=m. We also have {\rm{tr}}(A)=\lambda_1 + \ldots + \lambda_m and {\rm{tr}}(A^2)=\lambda_1^2 + \ldots + \lambda_m^2. Thus, by Problem 1,

({\rm{tr}}(A))^2 \leq {\rm{rank}}(A) {\rm{tr}}(A^2)

and the result follows. \Box
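Here is a numerical illustration of the bound (a sketch; the Hermitian matrix below is a random example):

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
A = B + B.conj().T                      # A is Hermitian
rank = np.linalg.matrix_rank(A)
tr = np.trace(A).real                   # tr(A) is real for Hermitian A
tr2 = np.trace(A @ A).real              # tr(A^2) = sum of squared eigenvalues > 0
assert rank >= tr**2 / tr2              # rank(A) >= (tr A)^2 / tr(A^2)
```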

Throughout n \geq 1 is an integer, V = \mathbb{C}^n, and A is an n \times n matrix with entries in \mathbb{C}.

Definition 1. The complex conjugate of A=[a_{ij}] is the matrix \overline{A}=[\overline{a_{ij}}], where \overline{a_{ij}} is the complex conjugate of a_{ij}. Let A^* be the transpose of \overline{A}.

Remark 1. Recall that the standard inner product over the \mathbb{C}-vector space V = \mathbb{C}^n is defined by \langle u,v \rangle = \sum_{i=1}^n u_i \overline{v_i}, for all vectors u,v \in V, where u_i,v_i are the i-th entry of u,v respectively.

Remark 2. A direct and easy computation shows that \langle Au , v \rangle = \langle u , A^*v \rangle, for all u,v \in V. Since obviously (A^*)^*=A, we also have \langle A^*u, v \rangle = \langle u, Av \rangle, for all u, v \in V.

Lemma. Let T and T^* be the linear transformations corresponding  to A and A^* respectively, i.e. T(v) = Av and T^*(v)=A^*v, for all v \in V = \mathbb{C}^n. Then \text{Im}(T) \oplus \ker(T^*)=V.

Proof. We first remind the reader a couple of things from elementary linear algebra: if W is a subspace of V, then W^{\perp} = \{v \in V : \ \langle v, w \rangle = 0, \ \forall w \in W \}, where the inner product is the one defined in Remark 1. Then W^{\perp} is a subspace of V and W \oplus W^{\perp} = V. So, to prove the lemma, we let W = \text{Im}(T) and we will show that W^{\perp} = \ker(T^*). This is easy to do: v \in W^{\perp} if and only if \langle v,w \rangle = 0, for all w \in W. So, since W=\{ T(u): \ u \in V \}, we have v \in W^{\perp} if and only if \langle v, T(u) \rangle = 0, for all u \in V. But, by Remark 2, we have \langle v, T(u) \rangle = \langle T^*(v), u \rangle. So v \in W^{\perp} if and only if \langle T^*(v), u \rangle = 0, for all u \in V. Thus v \in W^{\perp} if and only if T^*(v) = 0, i.e. v \in \ker(T^*). \ \Box
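A numerical illustration of the lemma (a sketch with numpy; the rank-deficient matrix below is an arbitrary example, and the SVD is used to produce orthonormal bases):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [2, 4, 6],
              [1j, 0, 1]], dtype=complex)   # rank 2, so ker(A*) is nontrivial
U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-10))                  # numerical rank of A
im_basis = U[:, :r]                         # orthonormal basis of Im(A)
ker_basis = U[:, r:]                        # remaining columns span ker(A*)
assert np.allclose(A.conj().T @ ker_basis, 0)        # ker_basis lies in ker(A*)
assert im_basis.shape[1] + ker_basis.shape[1] == 3   # dimensions add up to n
assert np.allclose(im_basis.conj().T @ ker_basis, 0) # Im(A) is orthogonal to ker(A*)
```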

Definition 2. If A is invertible and A^{-1}=A^*, then A is called unitary.

Remark 3. If A is unitary, then ||Au|| =||u||, for all u \in V = \mathbb{C}^n. The reason is that, by Remark 2 and the above definition, we have ||Au||^2=\langle Au, Au \rangle = \langle u, A^*Au \rangle = \langle u,u \rangle = ||u||^2. Using induction, we get ||A^iu||=||u||, for all integers i \geq 0 and all u \in V.

Problem. Suppose that A is unitary and v \in V = \mathbb{C}^n. Evaluate L=\displaystyle \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^m A^iv.

Solution. Let T and T^* be the linear transformations corresponding to A and A^* respectively and let I be the identity matrix. Apply the above lemma to the matrix I - A to get \text{Im}(\text{id}-T) \oplus \ker(\text{id}-T^*)=V, where \text{id} is the identity transformation. So there exist (unique) v_1 \in \text{Im}(\text{id}-T) and v_2 \in \ker(\text{id}-T^*) such that v=v_1+v_2.  Let

\displaystyle L_j = \lim_{m\to\infty} \frac{1}{m} \sum_{i=1}^m A^i v_j, \ j=1,2.

Then L=L_1+L_2. Let’s find L_1 first. Since v_1 \in \text{Im}(\text{id}-T), we have v_1 = u - T(u)=u-Au, for some u \in V. Thus \sum_{i=1}^m A^iv_1 = Au-A^{m+1}u and so, by Remark 3,

|| \sum_{i=1}^m A^i v_1|| = ||Au - A^{m+1}u|| \leq ||Au|| + ||A^{m+1}u||=2||u||.


Hence \displaystyle || \frac{1}{m} \sum_{i=1}^m A^iv_1 || \leq \frac{2||u||}{m}

and therefore L_1=0. We are now going to find L_2. Since v_2 \in \ker(\text{id}-T^*), we have T^*(v_2)=v_2. Thus A^{-1}v_2=A^*v_2=v_2 and so Av_2=v_2. Hence A^iv_2=v_2 for all integers i and so \displaystyle \frac{1}{m} \sum_{i=1}^m A^iv_2 = v_2. Thus L_2=v_2 and hence L=L_1+L_2=v_2. \ \Box
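Numerically, the averages do converge to the component of v fixed by A (a sketch; the unitary matrix and the vector below are arbitrary examples):

```python
import numpy as np

theta = 1.3
A = np.array([[1, 0, 0],
              [0, np.cos(theta), -np.sin(theta)],
              [0, np.sin(theta),  np.cos(theta)]])   # real orthogonal, hence unitary
v = np.array([2.0, 1.0, -3.0])
m = 50000
avg = np.zeros(3)
w = v.copy()
for _ in range(m):
    w = A @ w          # w runs over A v, A^2 v, ..., A^m v
    avg += w
avg /= m
# The fixed space of A is spanned by e_1, so the limit should be v_2 = (2, 0, 0).
assert np.allclose(avg, [2.0, 0.0, 0.0], atol=1e-3)
```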

Problem. (Fisher’s inequality) Suppose that S_1, S_2, \cdots , S_m are distinct subsets of \{1,2, \cdots , n \}. Prove that if there exists 1 \leq k < n such that |S_i \cap S_j|=k for all i \neq j, then m \leq n.

Solution. Define the n \times m matrix A=[a_{ij}] by

a_{ij}=\begin{cases}1 & \text{if} \ i \in S_j \\ 0 & \text{if} \ i \notin S_j \end{cases}.

Let \bold{x}_j \in \mathbb{R}^n, \ 1 \leq j \leq m, be the j-th column of A. Clearly every entry in \bold{x}_j is either one or zero and the number of ones in every \bold{x}_j is equal to |S_j|. Thus for every 1 \leq j \leq m

||\bold{x}_j||^2=\bold{x}_j \cdot \bold{x}_j = |S_j| \geq k \ \ \ \ \ \ \ \ \ \ \ (1)

and for all i \neq j

\bold{x}_i \cdot \bold{x}_j = |S_i \cap S_j|=k. \ \ \ \ \ \ \ \ \ \ \ \ \ (2)

Now, suppose, to the contrary, that m > n. Then \{\bold{x}_1, \cdots , \bold{x}_m \} will be linearly dependent over \mathbb{R} because these vectors are in \mathbb{R}^n. So, there exist r_1, \cdots, r_m \in \mathbb{R} such that r_j \neq 0 for some 1 \leq j \leq m and

r_1 \bold{x}_1 + \cdots + r_m \bold{x}_m=\bold{0}. \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

Squaring both sides of (3) gives us

\sum_{j=1}^m r_j^2 ||\bold{x}_j||^2 + 2 \sum_{i < j} r_ir_j \bold{x}_i \cdot \bold{x}_j = 0.

Thus, by (2), \sum_{j=1}^m r_j^2|| \bold{x}_j||^2 + 2k \sum_{i < j}r_ir_j=0, which gives us

\sum_{j=1}^m r_j^2(||\bold{x}_j||^2 - k) + k \left (\sum_{j=1}^m r_j \right)^2=0. \ \ \ \ \ \ \ \ \ \ \ \ (4)

By (1), each term in (4) is non-negative and so they add up to zero if and only if each of them is zero. Hence

r_j(||\bold{x}_j||^2 - k)=0, \ \forall \ j, \ \ \sum_{j=1}^m r_j = 0. \ \ \ \ \ \ \ \ \ \ \ \ (5)

We know that not all of the r_j are zero. Now, the second relation in (5) tells us that at least two of the r_j are non-zero. So suppose r_u \neq 0 and r_v \neq 0 for some u \neq v. Then the first relation in (5) gives us ||\bold{x}_u||^2=||\bold{x}_v||^2=k. Thus, by (1) and (2), |S_u|=|S_v|=|S_u \cap S_v|, which implies S_u=S_v. This is a contradiction because S_1, \cdots , S_m are assumed to be distinct. This contradiction proves that m \leq n. \ \Box
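The incidence-matrix argument can be illustrated numerically (a sketch; the "sunflower" family below, sets sharing a common core of size k with pairwise disjoint petals, is an arbitrary example satisfying the hypothesis):

```python
import numpy as np
from itertools import combinations

n, k = 7, 2
core = {1, 2}                                  # common core of size k = 2
petals = [{3}, {4}, {5}, {6}, {7}, set()]      # pairwise disjoint petals
sets = [core | p for p in petals]              # m = 6 distinct subsets of {1,...,7}
for S, T in combinations(sets, 2):
    assert len(S & T) == k                     # pairwise intersections have size k
A = np.array([[1 if i in S else 0 for S in sets] for i in range(1, n + 1)])
m = len(sets)
assert np.linalg.matrix_rank(A) == m           # columns are independent, so m <= n
```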

Problem. Let A and B be n \times n matrices with entries from some field k. Let c_1, \cdots , c_{n+1} be n+1 distinct elements of k such that A+c_1B, \cdots , A+c_{n+1}B are all nilpotent. Prove that A and B are nilpotent.

Solution. Since A + c_1B, \cdots , A+c_{n+1}B are nilpotent, their eigenvalues are all zero and hence, by the Cayley-Hamilton theorem, (A+c_iB)^n=0, for all 1 \leq i \leq n+1. Let x be an indeterminate. So the equation

(A+xB)^n = 0 \ \ \ \ \ \ \ \ \ \ \ \ (1)

has at least n+1 roots x =c_1, \cdots , c_{n+1} in k. We will show that this is not possible unless A^n=B^n=0, which will complete the solution.  To do so, let’s expand the left hand side in (1) to get

B^nx^n + D_1x^{n-1} + \cdots + D_{n-1}x + A^n = 0, \ \ \ \ \ \ \ \ \ \ \ \ \ (2)

where each D_i is in the k-algebra generated by A and B. Let 1 \leq i,j \leq n and let a_{ij} and b_{ij} be the (i,j)-entries of A^n and B^n respectively. Then the (i,j)-entry of the matrix on the left hand side of (2) is

b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij}=0, \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

where d_1, \cdots , d_{n-1} are the (i,j)-entries of D_1, \cdots , D_{n-1}. So (3) is telling us that the polynomial p(x) = b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij} \in k[x], which has degree at most n, has at least n+1 roots in k. This is not possible unless p(x)=0. Thus a_{ij}=b_{ij}=0. \ \Box
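A consistency check of the statement (a sketch; the strictly upper-triangular A and B below are arbitrary examples, for which A + cB is nilpotent for every c):

```python
import numpy as np

n = 3
A = np.array([[0, 1, 2], [0, 0, 3], [0, 0, 0]])   # strictly upper triangular
B = np.array([[0, 4, 5], [0, 0, 6], [0, 0, 0]])   # strictly upper triangular
for c in range(n + 1):                             # n + 1 = 4 distinct values of c
    M = A + c * B
    assert np.allclose(np.linalg.matrix_power(M, n), 0)   # (A + cB)^n = 0
assert np.allclose(np.linalg.matrix_power(A, n), 0)       # conclusion: A^n = 0
assert np.allclose(np.linalg.matrix_power(B, n), 0)       # and B^n = 0
```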

We will keep the notation of part (1). We now generalize the result proved in Problem 2.

Problem. Let A \in M_n(k) with \det A = 1. Prove that A is a product of elementary matrices.

Solution. Suppose that A = (a_{ij}), i.e. the (i,j)-entry of A is a_{ij}. The first column of A cannot be all zero because then \det A = 0, which is not true. So if a_{11}=0, then choosing a_{i1} \neq 0, we will have the matrix E_{1i}(1)A whose (1,1)-entry is a_{i1} \neq 0. Then we can replace A by E_{1i}(1)A and so we may assume that a_{11} \neq 0. Now let B_1=\prod_{i=2}^n E_{i1}(-a_{11}^{-1}a_{i1}) and C_1= \prod_{j=2}^n E_{1j}(-a_{11}^{-1}a_{1j}). Then the first row and the first column entries of B_1AC_1 are all zero except for the (1,1)-entry which is a_{11}. We can now do the same with the second row and the second column and continue this process until we get matrices B_1, \cdots , B_{n-1} and C_1, \cdots , C_{n-1} each of which is a product of elementary matrices and

D=B_{n-1}B_{n-2} \cdots B_1 A C_1 C_2 \cdots C_{n-1}

is a diagonal matrix. Note that \det D=1 because \det B_i=\det C_i=1 for all i (see the first part of Problem 1) and \det A=1. Thus by Problem 2, D is a product of elementary matrices. Hence

A=B_1^{-1} \cdots B_{n-1}^{-1}D C_{n-1}^{-1} \cdots C_1^{-1}

is also a product of elementary matrices, by the second part of Problem 1. \Box
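For a concrete instance, here is the standard 2 \times 2 factorization verified numerically (a sketch; it assumes the (2,1)-entry is nonzero, and uses the identity ad - 1 = bc when \det A = 1):

```python
import numpy as np

def E12(x):     # elementary matrix: add x times row 2 to row 1
    return np.array([[1.0, x], [0.0, 1.0]])

def E21(x):     # elementary matrix: add x times row 1 to row 2
    return np.array([[1.0, 0.0], [x, 1.0]])

A = np.array([[2.0, 3.0], [1.0, 2.0]])        # det A = 1
a, b, c, d = A.ravel()
assert np.isclose(np.linalg.det(A), 1.0) and c != 0
product = E12((a - 1) / c) @ E21(c) @ E12((d - 1) / c)
assert np.allclose(product, A)                # A is a product of elementary matrices
```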

Definition 1. The general linear group of order n over a field k is

GL(n,k)=\{A \in M_n(k): \ \det A \neq 0 \}.

Definition 2. The special linear group of order n over a field k is

SL(n,k)=\{A \in M_n(k): \ \det A=1 \}.

Remark 1. It is clear that GL(n,k) is a group with matrix multiplication and that SL(n,k) is a subgroup of GL(n,k). If, as usual, we let k^{\times}:=k \setminus \{0\} be the multiplicative group of the field k, then we can define the group homomorphism f : GL(n,k) \longrightarrow k^{\times} by f(A)=\det A. Obviously f is onto and \ker f = SL(n,k). So SL(n,k) is a normal subgroup of GL(n,k) and GL(n,k)/SL(n,k) \cong k^{\times}.

Remark 2. By the first part of Problem 1, every elementary matrix is in SL(n,k). Thus any product of elementary matrices is also in SL(n,k). On the other hand, by the above problem, every element of SL(n,k) is a product of elementary matrices. So we get the following important result:

SL(n,k), as a group, is generated by all elementary matrices.