Archive for the ‘Linear Algebra’ Category

Definition. For an integer n \ge 1, the n\times n Hilbert matrix is defined by H_n=[a_{ij}], where

\displaystyle a_{ij}=\frac{1}{i+j-1}, \ \  1 \le i,j \le n.

It is known that H_n is invertible and if H_n^{-1}=[b_{ij}], then \displaystyle \sum_{i,j}b_{ij}=n^2. We are going to use these two properties of Hilbert matrices to solve the following calculus problem.

Problem. Let n \ge 1 be an integer and let f : [0,1] \longrightarrow \mathbb{R} be a continuous function. Suppose that \displaystyle \int_0^1 x^kf(x) \ dx = 1 for all  0 \le k \le n-1. Show that \displaystyle \int_0^1 (f(x))^2 dx \ge n^2.

Solution. Since H_n, the n\times n Hilbert matrix, is invertible, there exist real numbers p_0, p_1, \cdots , p_{n-1} such that

\displaystyle \sum_{i=1}^n\frac{p_{i-1}}{i+j-1}=1, \ \ \ 1 \le j \le n.

So the polynomial \displaystyle p(x)=\sum_{k=0}^{n-1}p_kx^k satisfies the conditions

\displaystyle \int_0^1x^k p(x) \ dx =1, \ \ \ 0 \le k \le n-1.

Clearly \displaystyle \sum_{k=0}^{n-1}p_k is the sum of all the entries of H_n^{-1} and so \displaystyle \sum_{k=0}^{n-1}p_k=n^2. Now let f be a real-valued continuous function on [0,1] such that

\displaystyle \int_0^1x^kf(x) \ dx  = 1, \ \ \ 0 \le k \le n-1.

Let p(x) be the above polynomial.Then since

\displaystyle (f(x))^2-2f(x)p(x)+(p(x))^2 =(f(x)-p(x))^2 \ge 0,

integrating gives

\displaystyle \begin{aligned} \int_0^1 (f(x))^2dx \ge 2\int_0^1f(x)p(x) \ dx -\int_0^1(p(x))^2dx=2\sum_{k=0}^{n-1}p_k \int_0^1 x^kf(x) \ dx - \\ \sum_{k=0}^{n-1}p_k\int_0^1x^kp(x) \ dx = 2\sum_{k=0}^{n-1}p_k-\sum_{k=0}^{n-1}p_k=\sum_{k=0}^{n-1}p_k =n^2. \ \Box \end{aligned}


Problem. Let A \in M_3(\mathbb{R}) be orthogonal and suppose that \det(A)=-1. Find \det(A-I).

Solution. Since A is orthogonal, its eigenvalues have absolute value 1 and it can be be diagonalized. Let D be a diagonal matrix such that PDP^{-1}=A for some invertible matrix P \in M_3(\mathbb{C}). Then

\det(D)=\det(A)=-1, \ \ \det(D-I)=\det(A-I).

We claim that the eigenvalues of A are \{-1,e^{i\theta},e^{-i\theta}\} for some \theta. Well, the characteristic polynomial of A has degree three and so it has either three real roots or only one real root. Also, the complex conjugate of a root of a polynomial with real coefficients is also a root. So, since \det(A)=-1, the eigenvalues of A are either all -1, which is the case \theta=\pi, or two of them are 1 and one is -1, which is the case \theta = 0, or one is -1 and the other two are in the form \{e^{i\theta}, e^{-i\theta}\} for some \theta. So

\displaystyle \begin{aligned} \det(D-I)=\det(A-I)=(-2)(e^{i\theta}-1)(e^{-i\theta}-1)=-4(1-\cos \theta). \ \Box \end{aligned}

Note that given \theta, the matrix

\displaystyle A=\begin{pmatrix} -1 & 0 & 0 \\ 0 & \cos \theta & -\sin \theta \\ 0 & \sin \theta & \cos \theta \end{pmatrix}

is orthogonal, \det(A)=-1 and \det(A-I)=-4(1-\cos \theta).


The following problem is from the American Mathematical Monthly. The problem only asks the reader to calculate \displaystyle \lim_{n\to\infty} A_n^n, it doesn’t give the answer; I added the answer myself.

Problem (Furdui, Romania). Let a,b,c,d be real numbers with bc > 0. For an integer n \ge 1, let

A_n:=\begin{bmatrix} \cos \left(\frac{a}{n}\right) & \sin \left(\frac{b}{n}\right) \\ \\ \sin \left(\frac{c}{n}\right) & \cos \left(\frac{d}{n}\right) \end{bmatrix}.

Let \text{sgn(x)} be the sign function. Show that

\displaystyle \lim_{n\to\infty} A_n^n = \begin{bmatrix} \cosh(\sqrt{bc}) & \text{sgn}(b)\sqrt{\frac{b}{c}} \sinh(\sqrt{bc}) \\ \\ \text{sgn}(b)\sqrt{\frac{c}{b}} \sinh(\sqrt{bc}) & \cosh(\sqrt{bc}) \end{bmatrix}.

Solution. The characteristic polynomial of A_n is

\displaystyle p_n(x)=x^2-\left(\cos \left(\frac{a}{n}\right)+\cos\left(\frac{d}{n}\right) \right)x+\cos\left(\frac{a}{n}\right)\cos\left(\frac{d}{n}\right)-\sin \left(\frac{b}{n} \right)\sin \left(\frac{c}{n} \right).

And roots of p_n(x) are

\displaystyle r_n=\frac{\cos \left(\frac{a}{n}\right)+\cos \left(\frac{d}{n}\right) + \sqrt{\Delta_n}}{2}, \ \ s_n=\frac{\cos \left(\frac{a}{n}\right) + \cos \left(\frac{d}{n}\right) - \sqrt{\Delta_n}}{2},


\displaystyle \Delta_n:=\left(\cos \left(\frac{a}{n}\right)-\cos \left(\frac{d}{n}\right)\right)^2+4 \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right).

If n is sufficiently large, which is the case we are interested in, then, since b,c are either both positive or both negative (because bc > 0), \displaystyle \sin \left(\frac{b}{n} \right) \sin \left(\frac{c}{n}\right) > 0 and so \Delta_n > 0. So, in this case, r_n,s_n are distinct real numbers and hence A_n is diagonalizable in \mathbb{R}.
Now v \in \mathbb{R}^2 is an eigenvector corresponding to r_n if and only if (A_n-r_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. Similarly, v \in \mathbb{R}^2 is an eigenvector corresponding to s_n if and only if (A_n-s_nI)v=0 if and only if \displaystyle v \in \mathbb{R} \begin{bmatrix} 1 \\ \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}. So if

\displaystyle P:=\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix}, \ D:=\begin{bmatrix} r_n & 0 \\ 0 & s_n \end{bmatrix},

then A_n=PDP^{-1} and hence

\displaystyle A_n^n=PD^nP^{-1}=\frac{\sin \left(\frac{b}{n}\right)}{s_n-r_n}\begin{bmatrix} 1 & 1 \\  \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} \end{bmatrix} \begin{bmatrix}r_n^n & 0 \\ 0 & s_n^n \end{bmatrix}\begin{bmatrix} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)} & -1 \\  \frac{\cos \left(\frac{a}{n}\right)-r_n}{\sin \left(\frac{b}{n}\right)} & 1 \end{bmatrix}.

The rest of the solution is just Calculus and if you have trouble finding limits, see this post in my Calculus blog for details. We have

\displaystyle  \lim_{n\to\infty} \frac{r_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=-\lim_{n\to\infty} \frac{s_n-\cos \left(\frac{a}{n}\right)}{\sin \left(\frac{b}{n}\right)}=\text{sgn}(b)\sqrt{\frac{c}{b}}, \\ \lim_{n\to\infty} \frac{\sin \left(\frac{b}{n}\right)}{r_n-s_n}=\frac{\text{sgn}(b)}{2}\sqrt{\frac{b}{c}}, \lim_{n\to\infty}r_n^n=e^{\sqrt{bc}}, \ \ \lim_{n\to\infty}s_n^n =e^{-\sqrt{bc}}. \ \Box

For a \in \mathbb{C} let \overline{a} denote the complex conjugate of a. Recall that a matrix [a_{ij}] \in M_n(\mathbb{C}) is called Hermitian if a_{ij}=\overline{a_{ji}}, for all 1 \leq i,j \leq n. It is known that if A is Hermitian, then A is diagonalizable and every eigenvalue of A is a real number. In this post, we give a lower bound for the rank of a Hermitian matrix. To find the lower bound, we first need an easy inequality.

Problem 1. Prove that if a_1, \ldots , a_m \in \mathbb{R}, then (a_1 + \ldots + a_m)^2 \leq m(a_1^2 + \ldots + a_m^2).

Solution.  We have a^2+b^2 \geq 2ab for all a,b \in \mathbb{R} and so

(m-1)\sum_{i=1}^m a_i^2=\sum_{1 \leq i < j \leq m}(a_i^2+a_j^2) \geq \sum_{1 \leq i < j \leq m}2a_ia_j.

Adding the term \sum_{i=1}^m a_i^2 to both sides of the above inequality will finish the job. \Box

Problem 2. Prove that if 0 \neq A \in M_n(\mathbb{C}) is Hermitian, then {\rm{rank}}(A) \geq ({\rm{tr}}(A))^2/{\rm{tr}}(A^2).

Solution. Let \lambda_1, \ldots , \lambda_m be the nonzero eigenvalues of A. Since A is diagonalizable, we have {\rm{rank}}(A)=m. We also have {\rm{tr}}(A)=\lambda_1 + \ldots + \lambda_m and {\rm{tr}}(A^2)=\lambda_1^2 + \ldots + \lambda_m^2. Thus, by Problem 1,

({\rm{tr}}(A))^2 \leq {\rm{rank}}(A) {\rm{tr}}(A^2)

and the result follows. \Box

Throughout n \geq 1 is an ineteger, V = \mathbb{C}^n, and A is an n \times n matrix with entries in \mathbb{C}.

Definition 1. The complex conjugate of A=[a_{ij}] is the matrix \overline{A}=[\overline{a_{ij}}], where \overline{a_{ij}} is the complex conjugate of a_{ij}. Let A^* be the transpose of \overline{A}.

Remark 1. Recall that the standard inner product over the \mathbb{C}-vector space V = \mathbb{C}^n is defined by \langle u,v \rangle = \sum_{i=1}^n u_i \overline{v_i}, for all vectors u,v \in V, where u_i,v_i are the i-th entry of u,v respectively.

Remark 2. A direct and easy computation shows that \langle Au , v \rangle = \langle u , A^*v \rangle, for all u,v \in V. Since obviously (A^*)^*=A, we also have \langle A^*u, v \rangle = \langle u, Av \rangle, for all u, v \in V.

Lemma. Let T and T^* be the linear transformations corresponding  to A and A^* respectively, i.e. T(v) = Av and T^*(v)=A^*v, for all v \in V = \mathbb{C}^n. Then \text{Im}(T) \oplus \ker(T^*)=V.

Proof. We first remind the reader a couple of things from elementary linear algebra: if W is a subspace of V, then W^{\perp} = \{v \in V : \ \langle v, w \rangle = 0, \ \forall w \in W \}, where the inner product is the one defined in Remark 1. Then W^{\perp} is a subspace of V and W \oplus W^{\perp} = V. So, to prove the lemma, we let W = \text{Im}(T) and we will show that W^{\perp} = \ker(T^*). This is easy to do: v \in W^{\perp} if and only if \langle v,w \rangle = 0, for all w \in W. So, since W=\{ T(u): \ u \in V \}, we have v \in W^{\perp} if and only if \langle v, T(u) \rangle = 0, for all u \in V. But, by Remark 2, we have \langle v, T(u) \rangle = \langle T^*(v), u \rangle. So v \in W^{\perp} if and only if \langle T^*(v), u \rangle = 0, for all u \in V. Thus v \in W^{\perp} if and only if T^*(v) = 0, i.e. v \in \ker(T^*). \ \Box

Definition 2. If A is invertible and A^{-1}=A^*, then A is called unitary.

Remark 3. If A is unitary, then ||Au|| =||u||, for all u \in V = \mathbb{C}^n. The reason is that, by Remark 2 and the above definition, we have ||Au||^2=\langle Au, Au \rangle = \langle u, A^*Au \rangle = \langle u,u \rangle = ||u||^2. Using induction, we get ||A^iu||=||u||, for all integers i and all u \in V.

Problem. Suppose that A is unitary and v \in V = \mathbb{C}^n. Evaluate L=\displaystyle \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^m A^iv.

Solution. Let T and T^* be the linear transformations corresponding to A and A^* respectively and let I be the identity matrix. Apply the above lemma to the matrix I - A to get \text{Im}(\text{id}-T) \oplus \ker(\text{id}-T^*)=V, where \text{id} is the identity transformation. So there exist (unique) v_1 \in \text{Im}(\text{id}-T) and v_2 \in \ker(\text{id}-T^*) such that v=v_1+v_2.  Let

\displaystyle L_j = \lim_{m\to\infty} \frac{1}{m} \sum_{i=1}^m A^i v_j, \ j=1,2.

Then L=L_1+L_2. Let’s find L_1 first. Since v_1 \in \text{Im}(\text{id}-T), we have v_1 = u - T(u)=u-Au, for some u \in V. Thus \sum_{i=1}^m A^iv_1 = Au-A^{m+1}u and so, by Remark 3,

|| \sum_{i=1}^m A^i v_1|| = ||Au - A^{m+1}u|| \leq ||Au|| + ||A^{m+1}u||=2||u||.


\displaystyle || \frac{1}{m} \sum_{i=1}^m A^iv_1 || \leq \frac{2||u||}{m}

and therefore L_1=0. We are now going to find L_2. Since v_2 \in \ker(\text{id}-T^*), we have T^*(v_2)=v_2. Thus A^{-1}v_2=A^*v_2=v_2 and so Av_2=v_2. Hence A^iv_2=v_2 for all integers i and so \displaystyle \frac{1}{m} \sum_{i=1}^m A^iv_2 = v_2. Thus L_2=v_2 and hence L=L_1+L_2=v_2. \ \Box

Problem. (Fisher’s inequality) Suppose that S_1, S_2, \cdots , S_m are distinct subsets of \{1,2, \cdots , n \}. Prove that if there exists 1 \leq k < n such that |S_i \cap S_j|=k for all i \neq j, then m \leq n.

Solution. Define the n \times m matrix A=[a_{ij}] by

a_{ij}=\begin{cases}1 & \text{if} \ i \in S_j \\ 0 & \text{if} \ i \notin S_j \end{cases}.

Let \bold{x}_j \in \mathbb{R}^n, \ 1 \leq j \leq m, be the j-th column of A. Clearly every entry in \bold{x}_j is either one or zero and the number of ones in every \bold{x}_j is equal to |S_j|. Thus for every 1 \leq j \leq m

||\bold{x}_j||^2=\bold{x}_j \cdot \bold{x}_j = |S_j| \geq k \ \ \ \ \ \ \ \ \ \ \ (1)

and for all i \neq j

\bold{x}_i \cdot \bold{x}_j = |S_i \cap S_j|=k. \ \ \ \ \ \ \ \ \ \ \ \ \ (2)

Now, suppose, to the contrary, that m > n. Then \{\bold{x}_1, \cdots , \bold{x}_m \} will be linearly dependent over \mathbb{R} because these vectors are in \mathbb{R}^n. So, there exist r_1, \cdots, r_m \in \mathbb{R} such that r_j \neq 0 for some 1 \leq j \leq m and

r_1 \bold{x}_1 + \cdots + r_m \bold{x}_m=\bold{0}. \ \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

Squaring both sides of (3) give us

\sum_{j=1}^m r_j^2 ||\bold{x}_j||^2 + 2 \sum_{i \neq j} r_ir_j \bold{x}_i \cdot \bold{x}_j = 0.

Thus by (2), \sum_{j=1}^m r_j^2|| \bold{x}_j||^2 + 2k \sum_{i \neq j}r_ir_j=0, which gives us

\sum_{j=1}^m r_j^2(||\bold{x}_j||^2 - k) + k \left (\sum_{j=1}^m r_j \right)^2=0. \ \ \ \ \ \ \ \ \ \ \ \ (4)

By (1), each term in (4) is non-negative and so they add up to zero if and only if each of them is zero. Hence

r_j(||\bold{x}_j||^2 - k)=0, \ \forall \ j, \ \ \sum_{j=1}^m r_j = 0. \ \ \ \ \ \ \ \ \ \ \ \ (5)

We know that not all r_j are zero. Now, the second relation in (5) tells us that at least two of r_j are non-zero. So suppose r_u \neq 0 and r_v \neq 0. Then the first relation in (5) gives us ||\bold{x}_u||^2=||\bold{x}_v||^2=k. Thus, by (1) and (2), |S_u|=|S_v|=|S_u \cap S_v|, which implies S_u=S_v. This is a contradiction because S_1, \cdots , S_m are assumed to be distinct. This contradiction proves that m \leq n. \ \Box

Problem. Let A and B be n \times n matrices with entries from some field k. Let c_1, \cdots , c_{n+1} be n+1 distinct elements  in k such that A+c_1B, \cdots , A+c_{n+1}B are all nilpotent. Prove that A and B are nilpotent.

Solution. Since A + c_1B, \cdots , A+c_{n+1}B are nilpotent, their eigenvalues are all zero and hence, by the Cayley-Hamilton theorem, (A+c_iB)^n=0, for all 1 \leq i \leq n+1. Let x be an indeterminate. So the equation

(A+xB)^n = 0 \ \ \ \ \ \ \ \ \ \ \ \ (1)

has at least n+1 roots x =c_1, \cdots , c_{n+1} in k. We will show that this is not possible unless A^n=B^n=0, which will complete the solution.  To do so, let’s expand the left hand side in (1) to get

B^nx^n + D_1x^{n-1} + \cdots + D_{n-1}x + A^n = 0, \ \ \ \ \ \ \ \ \ \ \ \ \ (2),

where each D_i is in the k-algebra generated by A and B. Let 1 \leq i,j \leq n and let a_{ij} and b_{ij} be the (i,j)-entries of A^n and B^n respectively. Then the (i,j)-entry of the matrix on the left hand side of (2) is

b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij}=0, \ \ \ \ \ \ \ \ \ \ \ \ \ (3)

where d_1, \cdots , d_{n-1} are the (i,j)-entries of D_1, \cdots , D_{n-1}. So (3) is telling us that the polynomial p(x) = b_{ij}x^n + d_1x^{n-1} + \cdots + d_{n-1}x + a_{ij} \in k[x], which has degree at most n, has at least n+1 roots in k. This is not possible unless p(x)=0. Thus a_{ij}=b_{ij}=0. \ \Box