
If A is a square matrix with entries from a field of characteristic zero such that the trace of A^m is zero for all positive integers m, then A is nilpotent. This is a well-known result in linear algebra, and there are at least two well-known proofs of it (see here for example), using the Vandermonde determinant and Newton identities. In this post, I’m going to give a different proof. Since I have not seen this proof anywhere, I think I can claim it’s mine! 🙂

Let F be a field, and let M_n(F) be the ring of n \times n matrices with entries from F. For A \in M_n(F), let \text{tr}(A) denote the trace of A. If A is nilpotent, then clearly A^m is also nilpotent for all positive integers m and hence \text{tr}(A^m)=0. The well-known fact we are now going to prove is that the converse is also true if the characteristic of F is zero.

Theorem. Let F be a field of characteristic zero, and let A \in M_n(F) be such that \text{tr}(A^m)=0 for all integers m \ge 1. Then A is nilpotent.

Proof (Y. Sharifi). As we showed here as an application of Fitting’s lemma, there exist an integer 0 \le k \le n, an invertible matrix Y \in M_k(F) and a nilpotent matrix Z \in M_{n-k}(F) such that A is similar to the block diagonal matrix B=\begin{pmatrix}Y & 0 \\ 0 & Z\end{pmatrix}. If k=0, then B, hence A, is nilpotent and we are done. We now suppose that k \ge 1 and show that this case is impossible. Since Z^m is nilpotent, and so has trace zero, for any positive integer m we have

0=\text{tr}(A^m)=\text{tr}(B^m)=\text{tr}(Y^m)+\text{tr}(Z^m)=\text{tr}(Y^m). \ \ \ \ \ \ \ \ (*)

Let p(t)=t^k+c_{k-1}t^{k-1}+ \cdots + c_1t+c_0, \ c_i \in F, be the characteristic polynomial of Y. Note that since Y is invertible, c_0 \ne 0. By Cayley-Hamilton, Y^k+c_{k-1}Y^{k-1} + \cdots + c_1Y+c_0I=0, where I is the k \times k identity matrix. Taking the trace of both sides and using (*) together with \text{tr}(I)=k, we get kc_0=0, which is not possible since k \ge 1, \ c_0 \ne 0 and F has characteristic zero. \ \Box
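The statement of the Theorem can be checked by hand on small integer matrices. The following is only a minimal sketch, not part of the proof: the helper names `mat_mul`, `trace`, `power_traces` and `is_nilpotent`, and the two sample matrices `N` and `M`, are made up for illustration. It only looks at the powers m = 1, ..., n, which is enough to tell these two examples apart.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

def power_traces(A):
    """Return [tr(A), tr(A^2), ..., tr(A^n)] for an n x n matrix A."""
    traces, P = [], A
    for _ in range(len(A)):
        traces.append(trace(P))
        P = mat_mul(P, A)
    return traces

def is_nilpotent(A):
    """An n x n matrix is nilpotent if and only if A^n = 0."""
    P = A
    for _ in range(len(A) - 1):
        P = mat_mul(P, A)
    return all(x == 0 for row in P for x in row)

# N is nilpotent (N^2 = 0) although it is not triangular.
N = [[2, -4], [1, -2]]
# M has trace zero, but M^2 is the identity, so tr(M^2) = 2.
M = [[0, 1], [1, 0]]

print(power_traces(N), is_nilpotent(N))
print(power_traces(M), is_nilpotent(M))
```

The matrix `M` shows that tr(A)=0 alone is not enough: a single non-zero power trace already rules out nilpotency.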

Remark 1. The Theorem also holds true if F has positive characteristic p > n. To see that, look at the last sentence in the proof of the Theorem again. We got kc_0=0, \ c_0 \ne 0, which implies k1_F=0 and so p \mid k, which is not possible since k \le n < p.

Remark 2. The Theorem does not necessarily hold if F has positive characteristic p \le n. For example, choose A \in M_2(\mathbb{Z}_2) to be the identity matrix. Then \text{tr}(A^m)=0 for all positive integers m but A is not nilpotent.
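The counterexample in Remark 2 is easy to confirm with arithmetic modulo 2. This is a small sketch under the assumption that \mathbb{Z}_p is modelled by Python integers reduced mod p; the helpers `mat_mul_mod` and `trace_mod` are hypothetical names introduced here, and only the first ten powers are inspected.

```python
def mat_mul_mod(X, Y, p):
    """Multiply two square matrices with entries reduced modulo p."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % p for j in range(n)]
            for i in range(n)]

def trace_mod(X, p):
    """Trace of X reduced modulo p."""
    return sum(X[i][i] for i in range(len(X))) % p

p = 2
I2 = [[1, 0], [0, 1]]  # the identity matrix in M_2(Z_2)

# Collect tr(A^m) mod 2 for m = 1, ..., 10.
traces, P = [], I2
for m in range(1, 11):
    traces.append(trace_mod(P, p))
    P = mat_mul_mod(P, I2, p)

print(traces)   # every power trace vanishes mod 2
print(P == I2)  # yet every power of A is the identity, so A is not nilpotent

# In characteristic 3 > n = 2 (the situation of Remark 1) the trace does not vanish.
print(trace_mod(I2, 3))
```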

The Theorem has many applications; let me give you one of them here.

The following problem was posted on the Art of Problem Solving a few weeks ago; you can see the problem and my solution (post #4) here. The proposer assumes that matrices have real entries but that is not necessary; the entries can come from any field of characteristic zero F. For any A,B \in M_n(F), we denote by [A,B] the additive commutator of A,B, i.e. [A,B]=AB-BA. Clearly [\ , \ ] is bilinear and \text{tr}([A,B])=0.
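The identity \text{tr}([A,B])=0 follows from \text{tr}(AB)=\text{tr}(BA), and it can be seen on a concrete pair of matrices. The sketch below uses two arbitrarily chosen integer matrices; the function names `mat_mul`, `commutator` and `trace` are illustrative only.

```python
def mat_mul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """Additive commutator [X, Y] = XY - YX."""
    XY, YX = mat_mul(X, Y), mat_mul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def trace(X):
    """Sum of the diagonal entries."""
    return sum(X[i][i] for i in range(len(X)))

A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
C = commutator(A, B)

print(C)         # [[7, -7], [21, -7]]
print(trace(C))  # 0
```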

Problem (V. Brayman). Let F be a field of characteristic zero, and let \{A,B_k, \ k \ge 1\} \subset M_n(F) be such that [B_i,B_j]=A^{j-i} for all j > i \ge 1. Prove that A=0.

Solution (Y. Sharifi). First notice that for any positive integer k,

\text{tr}(A^k)=\text{tr}([B_1,B_{k+1}])=0,

and so, by the Theorem, A is nilpotent. Let m be the smallest positive integer such that A^m=0.
If m=1, we are done. Suppose now that m \ge 2. Since \dim_F M_n(F)=n^2 < \infty, the matrices B_1, B_2, \ldots , B_{n^2+1} are F-linearly dependent. Also, B_1 \ne 0 because otherwise A=[B_1,B_2]=0. So there exist an integer p \ge 2 and c_i \in F such that B_p=\sum_{i=1}^{p-1}c_iB_i. But then

\displaystyle \begin{aligned}A^{m-1}=[B_p,B_{m+p-1}]=[\sum_{i=1}^{p-1}c_iB_i,B_{m+p-1}]=\sum_{i=1}^{p-1}c_i[B_i,B_{m+p-1}]=\sum_{i=1}^{p-1}c_iA^{m+p-1-i}=0,\end{aligned}

because m+p-1-i \ge m for all 1 \le i \le p-1. But A^{m-1}=0 contradicts the minimality of m. So we can’t have m \ge 2. \ \Box