Archive for the ‘Linear Algebra’ Category

For a \in \mathbb{C} let \overline{a} denote the complex conjugate of a. Recall that a matrix [a_{ij}] \in M_n(\mathbb{C}) is called Hermitian if a_{ij}=\overline{a_{ji}}, for all 1 \leq i,j \leq n. It is known that if A is Hermitian, then A is diagonalizable and every eigenvalue of A is a real number. In this post, we give a lower bound for the rank of a Hermitian matrix; in fact, the lower bound holds for any diagonalizable complex matrix whose eigenvalues are real numbers. To find the lower bound, we first need an easy inequality.

Problem 1. Prove that if a_1, \ldots , a_m \in \mathbb{R}, then (a_1 + \ldots + a_m)^2 \leq m(a_1^2 + \ldots + a_m^2).

Solution.  We have a^2+b^2 \geq 2ab for all a,b \in \mathbb{R} and so

(m-1)\sum_{i=1}^m a_i^2=\sum_{1 \leq i < j \leq m}(a_i^2+a_j^2) \geq \sum_{1 \leq i < j \leq m}2a_ia_j.

Adding the term \sum_{i=1}^m a_i^2 to both sides of the above inequality will finish the job. \Box
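The inequality in Problem 1 is easy to sanity-check numerically; here is a short Python sketch (the helper name check is ours), testing random real tuples:

```python
import random

def check(a):
    """True if (a_1 + ... + a_m)^2 <= m(a_1^2 + ... + a_m^2)."""
    m = len(a)
    return sum(a) ** 2 <= m * sum(x * x for x in a) + 1e-9  # small tolerance for float error

random.seed(0)
for _ in range(1000):
    m = random.randint(1, 10)
    assert check([random.uniform(-5.0, 5.0) for _ in range(m)])
```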

Problem 2. Prove that if 0 \neq A \in M_n(\mathbb{C}) is Hermitian, then {\rm{rank}}(A) \geq ({\rm{tr}}(A))^2/{\rm{tr}}(A^2).

Solution. Let \lambda_1, \ldots , \lambda_m be the nonzero eigenvalues of A. Since A is diagonalizable, we have {\rm{rank}}(A)=m. We also have {\rm{tr}}(A)=\lambda_1 + \ldots + \lambda_m and {\rm{tr}}(A^2)=\lambda_1^2 + \ldots + \lambda_m^2. Thus, by Problem 1,

({\rm{tr}}(A))^2 \leq {\rm{rank}}(A) {\rm{tr}}(A^2)

and the result follows. \Box
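The bound of Problem 2 can be checked numerically with a hand-rolled rank function (Gaussian elimination over \mathbb{C}); the Hermitian example matrix below is our choice:

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def rank(A, eps=1e-9):
    # rank via Gaussian elimination with pivot search
    M = [[complex(x) for x in row] for row in A]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if abs(M[i][col]) > eps), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r:
                f = M[i][col] / M[r][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

# Hermitian example: tr(A) = 4, tr(A^2) = 10, so the bound is 16/10 = 1.6 <= rank(A) = 2
A = [[2, 1j], [-1j, 2]]
bound = trace(A).real ** 2 / trace(matmul(A, A)).real
assert rank(A) >= bound
```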

Throughout, n \geq 1 is an integer, V = \mathbb{C}^n, and A is an n \times n matrix with entries in \mathbb{C}.

Definition 1. The complex conjugate of A=[a_{ij}] is the matrix \overline{A}=[\overline{a_{ij}}], where \overline{a_{ij}} is the complex conjugate of a_{ij}. Let A^* be the transpose of \overline{A}.

Remark 1. Recall that the standard inner product over the \mathbb{C}-vector space V = \mathbb{C}^n is defined by \langle u,v \rangle = \sum_{i=1}^n u_i \overline{v_i}, for all vectors u,v \in V, where u_i,v_i are the i-th entry of u,v respectively.

Remark 2. A direct and easy computation shows that \langle Au , v \rangle = \langle u , A^*v \rangle, for all u,v \in V. Since obviously (A^*)^*=A, we also have \langle A^*u, v \rangle = \langle u, Av \rangle, for all u, v \in V.
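The "direct and easy computation" of Remark 2 can also be confirmed numerically for random complex matrices (all helper names are ours):

```python
import random

def inner(u, v):
    # standard inner product on C^n (Remark 1)
    return sum(x * y.conjugate() for x, y in zip(u, v))

def mul(A, u):
    return [sum(A[i][j] * u[j] for j in range(len(u))) for i in range(len(A))]

def star(A):
    # conjugate transpose A^*
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

random.seed(1)
rc = lambda: complex(random.uniform(-1, 1), random.uniform(-1, 1))
A = [[rc() for _ in range(3)] for _ in range(3)]
u = [rc() for _ in range(3)]
v = [rc() for _ in range(3)]
# <Au, v> = <u, A*v>
assert abs(inner(mul(A, u), v) - inner(u, mul(star(A), v))) < 1e-9
```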

Lemma. Let T and T^* be the linear transformations corresponding  to A and A^* respectively, i.e. T(v) = Av and T^*(v)=A^*v, for all v \in V = \mathbb{C}^n. Then \text{Im}(T) \oplus \ker(T^*)=V.

Proof. We first remind the reader of a couple of facts from elementary linear algebra: if W is a subspace of V, then W^{\perp} = \{v \in V : \ \langle v, w \rangle = 0, \ \forall w \in W \}, where the inner product is the one defined in Remark 1. Then W^{\perp} is a subspace of V and W \oplus W^{\perp} = V. So, to prove the lemma, we let W = \text{Im}(T) and show that W^{\perp} = \ker(T^*). This is easy to do: v \in W^{\perp} if and only if \langle v,w \rangle = 0, for all w \in W. So, since W=\{ T(u): \ u \in V \}, we have v \in W^{\perp} if and only if \langle v, T(u) \rangle = 0, for all u \in V. But, by Remark 2, we have \langle v, T(u) \rangle = \langle T^*(v), u \rangle. So v \in W^{\perp} if and only if \langle T^*(v), u \rangle = 0, for all u \in V. Thus v \in W^{\perp} if and only if T^*(v) = 0, i.e. v \in \ker(T^*). \ \Box

Definition 2. If A is invertible and A^{-1}=A^*, then A is called unitary.

Remark 3. If A is unitary, then ||Au|| =||u||, for all u \in V = \mathbb{C}^n. The reason is that, by Remark 2 and the above definition, we have ||Au||^2=\langle Au, Au \rangle = \langle u, A^*Au \rangle = \langle u,u \rangle = ||u||^2. Using induction, we get ||A^iu||=||u||, for all integers i and all u \in V.

Problem. Suppose that A is unitary and v \in V = \mathbb{C}^n. Evaluate L=\displaystyle \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^m A^iv.

Solution. Let T and T^* be the linear transformations corresponding to A and A^* respectively and let I be the identity matrix. Apply the above lemma to the matrix I - A to get \text{Im}(\text{id}-T) \oplus \ker(\text{id}-T^*)=V, where \text{id} is the identity transformation. So there exist (unique) v_1 \in \text{Im}(\text{id}-T) and v_2 \in \ker(\text{id}-T^*) such that v=v_1+v_2.  Let

\displaystyle L_j = \lim_{m\to\infty} \frac{1}{m} \sum_{i=1}^m A^i v_j, \ j=1,2.

Then L=L_1+L_2. Let’s find L_1 first. Since v_1 \in \text{Im}(\text{id}-T), we have v_1 = u - T(u)=u-Au, for some u \in V. Thus \sum_{i=1}^m A^iv_1 = Au-A^{m+1}u and so, by Remark 3,

|| \sum_{i=1}^m A^i v_1|| = ||Au - A^{m+1}u|| \leq ||Au|| + ||A^{m+1}u||=2||u||.


Hence \displaystyle || \frac{1}{m} \sum_{i=1}^m A^iv_1 || \leq \frac{2||u||}{m}

and therefore L_1=0. We are now going to find L_2. Since v_2 \in \ker(\text{id}-T^*), we have T^*(v_2)=v_2. Thus A^{-1}v_2=A^*v_2=v_2 and so Av_2=v_2. Hence A^iv_2=v_2 for all integers i and so \displaystyle \frac{1}{m} \sum_{i=1}^m A^iv_2 = v_2. Thus L_2=v_2 and hence L=L_1+L_2=v_2. \ \Box
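The limit can also be observed numerically. With the unitary matrix A = \text{diag} \{1, i\} and v = (2,3) (both our choice), the fixed space of A is spanned by (1,0), so the averages should approach v_2 = (2,0):

```python
def mul(A, u):
    return [sum(A[i][j] * u[j] for j in range(len(u))) for i in range(len(A))]

A = [[1, 0], [0, 1j]]   # unitary: diagonal with unimodular entries
v = [2, 3]
m = 4000
s = [0, 0]
w = v[:]
for _ in range(m):
    w = mul(A, w)               # w = A^i v
    s = [a + b for a, b in zip(s, w)]
avg = [x / m for x in s]        # (1/m) * sum of A^i v
# the projection of v onto ker(id - A) is (2, 0)
assert abs(avg[0] - 2) < 1e-9 and abs(avg[1]) < 1e-9
```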

Problem. (Fisher’s inequality) Suppose that S_1, S_2, \cdots , S_m are distinct subsets of \{1,2, \cdots , n \}. Prove that if there exists 1 \leq k < n such that |S_i \cap S_j|=k for all i \neq j, then m \leq n.

Solution. Define the n \times m matrix A=[a_{ij}] by

a_{ij}=\begin{cases}1 & \text{if} \ i \in S_j \\ 0 & \text{if} \ i \notin S_j \end{cases}.

Let \bold{x}_j \in \mathbb{R}^n, \ 1 \leq j \leq m, be the j-th column of A. Clearly every entry in \bold{x}_j is either one or zero and the number of ones in every \bold{x}_j is equal to |S_j|. We may assume m \geq 2, since otherwise m \leq n is trivial; then |S_j| \geq |S_i \cap S_j| = k for any i \neq j. Thus for every 1 \leq j \leq m

||\bold{x}_j||^2=\bold{x}_j \cdot \bold{x}_j = |S_j| \geq k \ \ \ \ \ \ \ \ \ \ \ (1)

and for all i \neq j

\bold{x}_i \cdot \bold{x}_j = |S_i \cap S_j|=k. \ \ \ \ \ \ \ \ \ \ \ \ \ (2)

Now, suppose, to the contrary, that m > n. Then \{\bold{x}_1, \cdots , \bold{x}_m \} will be linearly dependent over \mathbb{R} because these vectors are in \mathbb{R}^n. So, there exist r_1, \cdots, r_m \in \mathbb{R} such that r_j \neq 0 for some 1 \leq j \leq m and

r_1 \bold{x}_1 + \cdots + r_m \bold{x}_m=\bold{0}. \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

Squaring both sides of (3) gives us

\sum_{j=1}^m r_j^2 ||\bold{x}_j||^2 + 2 \sum_{i \neq j} r_ir_j \bold{x}_i \cdot \bold{x}_j = 0.

Thus by (2), \sum_{j=1}^m r_j^2|| \bold{x}_j||^2 + 2k \sum_{i \neq j}r_ir_j=0. Since \left(\sum_{j=1}^m r_j \right)^2 = \sum_{j=1}^m r_j^2 + 2\sum_{i \neq j}r_ir_j, this gives us

\sum_{j=1}^m r_j^2(||\bold{x}_j||^2 - k) + k \left (\sum_{j=1}^m r_j \right)^2=0. \ \ \ \ \ \ \ \ \ \ \ \ (4)

By (1), each term in (4) is non-negative and so they add up to zero if and only if each of them is zero. Hence

r_j(||\bold{x}_j||^2 - k)=0, \ \forall \ j, \ \ \sum_{j=1}^m r_j = 0. \ \ \ \ \ \ \ \ \ \ \ \ (5)

We know that not all the r_j are zero. Now, the second relation in (5) tells us that at least two of the r_j are non-zero. So suppose r_u \neq 0 and r_v \neq 0, where u \neq v. Then the first relation in (5) gives us ||\bold{x}_u||^2=||\bold{x}_v||^2=k. Thus, by (1) and (2), |S_u|=|S_v|=|S_u \cap S_v|, which implies S_u=S_v. This is a contradiction because S_1, \cdots , S_m are assumed to be distinct. This contradiction proves that m \leq n. \ \Box
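For small n, Fisher's inequality can be confirmed by brute force. Here n = 4 and k = 1, and we check that no family of n + 1 = 5 distinct subsets of \{1,2,3,4\} has all pairwise intersections of size k (the search code is ours):

```python
from itertools import combinations

n, k = 4, 1
ground = range(1, n + 1)
subsets = [frozenset(s) for r in range(n + 1) for s in combinations(ground, r)]

def ok(fam):
    # do all pairs in the family intersect in exactly k elements?
    return all(len(S & T) == k for S, T in combinations(fam, 2))

# Fisher: no n+1 distinct subsets pairwise intersect in exactly k elements ...
assert not any(ok(fam) for fam in combinations(subsets, n + 1))
# ... but n of them can, e.g. {1}, {1,2}, {1,3}, {1,4}
assert ok([frozenset({1}), frozenset({1, 2}), frozenset({1, 3}), frozenset({1, 4})])
```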

Problem. Let A and B be n \times n matrices with entries from some field k. Let c_1, \cdots , c_{n+1} be n+1 distinct elements in k such that A+c_1B, \cdots , A+c_{n+1}B are all nilpotent. Prove that A and B are nilpotent.

Solution. Since A + c_1B, \cdots , A+c_{n+1}B are nilpotent, their eigenvalues are all zero and hence, by the Cayley-Hamilton theorem, (A+c_iB)^n=0, for all 1 \leq i \leq n+1. Let x be an indeterminate. So the equation

(A+xB)^n = 0 \ \ \ \ \ \ \ \ \ \ \ \ (1)

has at least n+1 roots x =c_1, \cdots , c_{n+1} in k. We will show that this is not possible unless A^n=B^n=0, which will complete the solution.  To do so, let’s expand the left hand side in (1) to get

B^nx^n + D_1x^{n-1} + \cdots + D_{n-1}x + A^n = 0, \ \ \ \ \ \ \ \ \ \ \ \ \ (2)

where each D_i is in the k-algebra generated by A and B. Let 1 \leq i,j \leq n and let a_{ij} and b_{ij} be the (i,j)-entries of A^n and B^n respectively. Then the (i,j)-entry of the matrix on the left hand side of (2) is

b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij}=0, \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

where d_1, \cdots , d_{n-1} are the (i,j)-entries of D_1, \cdots , D_{n-1}. So (3) is telling us that the polynomial p(x) = b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij} \in k[x], which has degree at most n, has at least n+1 roots in k. This is not possible unless p(x)=0. Thus a_{ij}=b_{ij}=0. \ \Box
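The statement can be illustrated with strictly upper triangular A and B (our choice of matrices; then A + cB is strictly upper triangular, hence nilpotent, for every c, and here the nilpotency of A and B is of course immediate):

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def add(A, B, c):
    # A + c*B
    return [[x + c * y for x, y in zip(r1, r2)] for r1, r2 in zip(A, B)]

def power(A, p):
    P = A
    for _ in range(p - 1):
        P = matmul(P, A)
    return P

Z = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
A = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]
B = [[0, 0, 1], [0, 0, 1], [0, 0, 0]]
# A + cB is nilpotent for n+1 = 4 distinct values of c ...
for c in [0, 1, 2, 3]:
    assert power(add(A, B, c), 3) == Z
# ... and, as the problem asserts, A and B are themselves nilpotent
assert power(A, 3) == Z and power(B, 3) == Z
```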

We keep the notation of part (1). We now generalize the result proved in Problem 2.

Problem. Let A \in M_n(k) with \det A = 1. Prove that A is a product of elementary matrices.

Solution. Suppose that A = (a_{ij}), i.e. the (i,j)-entry of A is a_{ij}. The first column of A cannot be all zero because then \det A = 0, which is not true. So if a_{11}=0, then choosing a_{i1} \neq 0, we will have the matrix E_{1i}(1)A whose (1,1)-entry is a_{i1} \neq 0. Then we can replace A by E_{1i}(1)A and so we may assume that a_{11} \neq 0. Now let B_1=\prod_{i=2}^n E_{i1}(-a_{11}^{-1}a_{i1}) and C_1= \prod_{j=2}^n E_{1j}(-a_{11}^{-1}a_{1j}). Then the first row and the first column entries of B_1AC_1 are all zero except for the (1,1)-entry which is a_{11}. We can now do the same with the second row and the second column and continue this process until we get matrices B_1, \cdots , B_{n-1} and C_1, \cdots , C_{n-1} each of which is a product of elementary matrices and

D=B_{n-1}B_{n-2} \cdots B_1 A C_1 C_2 \cdots C_{n-1}

is a diagonal matrix. Note that \det D=1 because \det B_i=\det C_i=1 for all i (see the first part of Problem 1) and \det A=1. Thus by Problem 2, D is a product of elementary matrices. Hence

A=B_1^{-1} \cdots B_{n-1}^{-1}D C_{n-1}^{-1} \cdots C_1^{-1}

is also a product of elementary matrices, by the second part of Problem 1. \Box

Definition 1. The general linear group of order n over a field k is

GL(n,k)=\{A \in M_n(k): \ \det A \neq 0 \}.

Definition 2. The special linear group of order n over a field k is

SL(n,k)=\{A \in M_n(k): \ \det A=1 \}.

Remark 1. It is clear that GL(n,k) is a group with matrix multiplication and that SL(n,k) is a subgroup of GL(n,k). If, as usual, we let k^{\times}:=k \setminus \{0\} be the multiplicative group of the field k, then we can define the group homomorphism f : GL(n,k) \longrightarrow k^{\times} by f(A)=\det A. Obviously f is onto and \ker f = SL(n,k). So SL(n,k) is a normal subgroup of GL(n,k) and GL(n,k)/SL(n,k) \cong k^{\times}.

Remark 2. By the first part of Problem 1, every elementary matrix is in SL(n,k). Thus any product of elementary matrices is also in SL(n,k). On the other hand, by the above problem, every element of SL(n,k) is a product of elementary matrices. So we get the following important result:

SL(n,k), as a group, is generated by all elementary matrices.

Let k be a field and let M_n(k) be the set of all n \times n matrices with entries from k. Let I \in M_n(k) be the identity matrix and let e_{ij} \in M_n(k) be the matrix whose (i,j)-entry is 1 and all other entries zero. We will denote by \text{diag} \{x_1, x_2, \cdots , x_n \} the diagonal matrix whose (i,i)-entry is x_i.

Definition. For every 1 \leq i \neq j \leq n and \alpha \in k define E_{ij}(\alpha)=I + \alpha e_{ij}. We will call E_{ij}(\alpha) an elementary matrix.

Remark. If A \in M_n(k), then multiplying A on the left by E_{ij}(\alpha) adds \alpha times the j-th row of A to the i-th row of A. Similarly, multiplying A on the right by E_{ij}(\alpha) adds \alpha times the i-th column of A to the j-th column of A.

Problem 1. Prove that

1) \det E_{ij}(\alpha)=1.

2) (E_{ij}(\alpha))^{-1}=E_{ij}(-\alpha).

Solution. The first part is obvious because E_{ij}(\alpha) is a triangular matrix and so its determinant is the product of its diagonal entries, which are all 1. To prove the second part of the problem, note that since i \neq j we have e_{ij}^2=0 and thus E_{ij}(\alpha) E_{ij}(-\alpha)=(I + \alpha e_{ij})(I - \alpha e_{ij})=I. \Box
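Both parts of Problem 1 are easy to verify in Python with a naive determinant (Laplace expansion); the 0-indexed helper E is ours:

```python
def E(n, i, j, a):
    # elementary matrix I + a*e_{ij}, 0-indexed, i != j
    M = [[1 if r == c else 0 for c in range(n)] for r in range(n)]
    M[i][j] = a
    return M

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def det(M):
    # Laplace expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

I = [[1 if r == c else 0 for c in range(3)] for r in range(3)]
assert det(E(3, 0, 1, 5)) == 1                      # part 1
assert matmul(E(3, 0, 1, 5), E(3, 0, 1, -5)) == I   # part 2
```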

Problem 2. Let A \in M_n(k) be a diagonal matrix with \det A = 1. Prove that A is a product of 5(n-1) elementary matrices.

Solution. Let A=\text{diag} \{a_1, a_2, \cdots , a_n \}. Since \det A = 1, we have

a_1a_2 \cdots a_n = 1. \ \ \ \ \ \ \ \ \ \ (*)

In particular, all a_i are non-zero and hence invertible in k. Observe that

E_{n,n-1}(1-a_n^{-1})E_{n-1,n}(-1)E_{n,n-1}(1)E_{n,n-1}(-a_n)E_{n-1,n}(a_n^{-1})A = \text{diag} \{a_1, a_2 , \cdots , a_{n-1}a_n, 1 \}.

So we have multiplied A on the left by 5 elementary matrices and obtained a diagonal matrix whose last diagonal entry is 1. Continuing by induction, we get a matrix B which is the product of 5(n-1) elementary matrices and

BA = \text{diag} \{a_1a_2 \cdots a_n, 1, \cdots , 1 \} = I,

by (*). Thus A=B^{-1} and the result follows from the second part of Problem 1. \Box
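The five-matrix identity displayed above can be checked directly for n = 2 with exact rational arithmetic (the 0-indexed helper E is ours, and the diagonal entries a, b are arbitrary nonzero choices):

```python
from fractions import Fraction

def E(n, i, j, a):
    # elementary matrix I + a*e_{ij}, 0-indexed
    M = [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]
    M[i][j] = a
    return M

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

a, b = Fraction(3), Fraction(5, 7)
A = [[a, 0], [0, b]]
# E_{21}(1 - b^{-1}) E_{12}(-1) E_{21}(1) E_{21}(-b) E_{12}(1/b), applied right to left
factors = [E(2, 1, 0, 1 - 1 / b), E(2, 0, 1, Fraction(-1)), E(2, 1, 0, Fraction(1)),
           E(2, 1, 0, -b), E(2, 0, 1, 1 / b)]
P = A
for M in reversed(factors):
    P = matmul(M, P)
assert P == [[a * b, 0], [0, 1]]   # diag{ab, 1}, as claimed
```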

We denote by M_n(\mathbb{C}) the set of n \times n matrices with entries from \mathbb{C}. We will also denote by \text{adj}(Z) the adjugate of Z \in M_n(\mathbb{C}).

Remark. If X \in M_n(\mathbb{C}) is invertible, then \text{adj}(X^{-1}Y X)=X^{-1} \text{adj}(Y) X, for all Y \in M_n(\mathbb{C}).

Proof. Since X is invertible, we have \text{adj}(X) = (\det X) X^{-1}. The result now follows immediately from the fact that \text{adj}(MN)=\text{adj}(N) \text{adj}(M), for all M,N \in M_n(\mathbb{C}). \Box
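For 2 \times 2 matrices the adjugate is just [[d, -b], [-c, a]], so the remark can be verified with exact rational arithmetic (the example matrices X, Y are ours):

```python
from fractions import Fraction

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def adj2(M):
    # adjugate of a 2x2 matrix
    return [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]

def inv2(M):
    # inverse via adj(X) = det(X) X^{-1}
    d = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[x / d for x in row] for row in adj2(M)]

X = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(1)]]
Y = [[Fraction(3), Fraction(4)], [Fraction(5), Fraction(6)]]
# adj(X^{-1} Y X) = X^{-1} adj(Y) X
assert adj2(matmul(matmul(inv2(X), Y), X)) == matmul(matmul(inv2(X), adj2(Y)), X)
```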

Problem. Suppose that \lambda_1, \cdots , \lambda_n are the (not necessarily distinct) eigenvalues of A \in M_n(\mathbb{C}). Prove that the eigenvalues of \text{adj}(A) are \mu_i = \prod_{j \neq i} \lambda_j, \ 1 \leq i \leq n.

Solution. We know that every element of M_n(\mathbb{C}) is similar to an upper triangular matrix. So there exists an invertible matrix P and an upper triangular matrix U such that

A=P^{-1}UP. \ \ \ \ \ \ \ \ \ \ \ \ \ (1)

So \lambda_1, \cdots , \lambda_n are also the eigenvalues of U because similar matrices have the same characteristic polynomial. Thus the diagonal entries of U are \lambda_1, \cdots , \lambda_n. Now let V = \text{adj}(U). Then, by (1) and the above remark,

\text{adj}(A)=P^{-1} V P. \ \ \ \ \ \ \ \ \ \ \ (2)

Since the diagonal entries of U are \lambda_1, \cdots , \lambda_n, using the definition of adjugate, it is easily seen that V is an upper triangular matrix with diagonal entries \mu_i = \prod_{j \neq i} \lambda_j. Hence the eigenvalues of V are \mu_1, \cdots , \mu_n. Since, by (2), \text{adj}(A) is similar to V, the eigenvalues of \text{adj}(A) are also \mu_1, \cdots , \mu_n. \ \Box

Example 1. If the eigenvalues of A \in M_3(\mathbb{C}) are \lambda_1, \lambda_2, \lambda_3, then the eigenvalues of \text{adj}(A) are \lambda_1 \lambda_2, \ \lambda_1 \lambda_3 and \lambda_2 \lambda_3.

Example 2. If A \in M_n(\mathbb{C}) has exactly one zero eigenvalue, say \lambda_1=0, then \text{adj}(A) has exactly one non-zero eigenvalue, which is \mu_1 = \lambda_2 \lambda_3 \cdots \lambda_n. If A has more than one zero eigenvalue, then all the eigenvalues of \text{adj}(A) are zero, i.e. \text{adj}(A) is nilpotent.
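The key step of the solution is easy to check by hand for an upper triangular example. With U = \begin{pmatrix} 1&2&3 \\ 0&4&5 \\ 0&0&6 \end{pmatrix} (eigenvalues 1, 4, 6; our choice), \text{adj}(U) should be upper triangular with diagonal 4 \cdot 6, \ 1 \cdot 6, \ 1 \cdot 4. A cofactor computation in Python confirms this (helper names ours):

```python
def det(M):
    # Laplace expansion along the first row
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minor(M, i, j):
    # delete row i and column j
    return [row[:j] + row[j + 1:] for r, row in enumerate(M) if r != i]

def adjugate(M):
    # (i,j)-entry of adj(M) is the (j,i) cofactor
    n = len(M)
    return [[(-1) ** (i + j) * det(minor(M, j, i)) for j in range(n)] for i in range(n)]

U = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
V = adjugate(U)
assert [V[i][i] for i in range(3)] == [4 * 6, 1 * 6, 1 * 4]
assert V[1][0] == V[2][0] == V[2][1] == 0   # adj(U) is upper triangular
```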