As we defined here, given an integer n, we say that a group G is n-abelian if (xy)^n=x^ny^n for all x,y \in G. Here we showed that if G/Z(G), where Z(G) is the center of G, is abelian and if |Z(G)|=n is odd, then G is n-abelian. We now prove a much more interesting result.

Proposition. Let G be a group with center Z(G). If |G/Z(G)|=n, then G is n-abelian.

Proof. Let \{g_1, g_2, \cdots , g_n\} be a transversal of Z(G) in G, as defined in this post. Let x \in G, and let \alpha \in S_n, h_i \in Z(G), \ 1 \le i \le n, be such that xg_i=g_{\alpha(i)}h_i. Then

x=g_{\alpha(i)}g_i^{-1}h_i. \ \ \ \ \ \ \ \ \ \ \ \ (1)

Let (i \ \alpha(i) \ \alpha^2(i) \ \cdots \ \alpha^{k-1}(i)) be any cycle in the decomposition of \alpha into disjoint cycles. Then, by (1),

x^k=g_{\alpha(i)}g_i^{-1}g_{\alpha^k(i)}g_{\alpha^{k-1}(i)}^{-1}g_{\alpha^{k-1}(i)}g_{\alpha^{k-2}(i)}^{-1} \cdots g_{\alpha^2(i)}g_{\alpha(i)}^{-1}h_ih_{\alpha(i)} \cdots h_{\alpha^{k-1}(i)}

and so since \alpha^k(i)=i, we get that

x^k=h_ih_{\alpha(i)} \cdots h_{\alpha^{k-1}(i)}. \ \ \ \ \ \ \ \ \ \ \ \ (2)

If \alpha is the product of m disjoint cycles, then we will have m identities of the form (2). Multiplying all m identities together, and noting that the indices appearing in them run over all of \{1, \cdots , n\}, gives x^n=h_1h_2 \cdots h_n. On the other hand, by the Theorem in the post linked at the beginning of the proof, the map \lambda : G \to Z(G) defined by \lambda(x)=h_1h_2 \cdots h_n is a group homomorphism. Hence \lambda(x)=x^n for all x \in G, i.e. the map x \mapsto x^n is a group homomorphism, and so, for all x,y \in G,

x^ny^n=\lambda(x)\lambda(y)=\lambda(xy)=(xy)^n,

proving that G is n-abelian. \ \Box
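
Before moving on, here is a quick computational sanity check of the Proposition — my addition, not part of the argument above. The quaternion group Q_8 has center \{\pm 1\}, so |Q_8/Z(Q_8)|=4 and the Proposition predicts that Q_8 is 4-abelian. The short Python script below verifies this directly, modeling quaternions as 4-tuples under the Hamilton product (all names are mine).

    from itertools import product

    def qmul(p, q):
        # Hamilton product of quaternions written as 4-tuples (a, b, c, d) = a+bi+cj+dk
        a1, b1, c1, d1 = p; a2, b2, c2, d2 = q
        return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
                a1*b2 + b1*a2 + c1*d2 - d1*c2,
                a1*c2 - b1*d2 + c1*a2 + d1*b2,
                a1*d2 + b1*c2 - c1*b2 + d1*a2)

    def qpow(q, n):
        r = (1, 0, 0, 0)
        for _ in range(n):
            r = qmul(r, q)
        return r

    # the eight elements of Q8: {±1, ±i, ±j, ±k}
    Q8 = [t for s in (1, -1) for t in ((s,0,0,0), (0,s,0,0), (0,0,s,0), (0,0,0,s))]
    assert all(qpow(qmul(x, y), 4) == qmul(qpow(x, 4), qpow(y, 4)) for x, y in product(Q8, Q8))
    print("Q8 is 4-abelian, as the Proposition predicts")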

Note. The Proposition is Lemma 6.22 in T. Y. Lam’s book A First Course in Noncommutative Rings. However, Lam does not give a proof of \lambda(x)=x^n; the proof was added by me.

For a division ring D, we denote by Z(D) and D^{\times} the center and the multiplicative group of D, respectively.

Let D_1 be a division ring, and suppose that D_2 is a proper subdivision ring of D_1, i.e. D_2 is a subring of D_1, \ D_2 \ne D_1, and D_2 itself is a division ring. Then D_2^{\times} is clearly a proper subgroup of D_1^{\times}. Now, one may ask: when exactly is D_2^{\times} a normal subgroup of D_1^{\times}? The Cartan-Brauer-Hua theorem gives the answer: D_2^{\times} is a normal subgroup of D_1^{\times} if and only if D_2 \subseteq Z(D_1). One direction of the theorem is trivial: if D_2 \subseteq Z(D_1), then D_2^{\times} is obviously normal in D_1^{\times}. The other direction is not trivial, but it's not hard to prove either. The proof is a quick consequence of the following simple yet strangely significant identity!

Hua’s Identity (Loo Keng Hua, 1949). Let D be a division ring, and let a,b \in D such that ab \ne ba. Then

a=(b^{-1}-(a-1)^{-1}b^{-1}(a-1))(a^{-1}b^{-1}a-(a-1)^{-1}b^{-1}(a-1))^{-1}.

Proof. First see that since ab \ne ba, the four elements a,a-1,b, and a^{-1}b^{-1}a-(a-1)^{-1}b^{-1}(a-1) are all nonzero hence invertible. (For the last one: if a^{-1}b^{-1}a=(a-1)^{-1}b^{-1}(a-1), then (a-1)a^{-1}b^{-1}a=b^{-1}(a-1), which simplifies to a^{-1}b^{-1}a=b^{-1}, i.e. ab=ba, a contradiction.) Now,

a(a^{-1}b^{-1}a-(a-1)^{-1}b^{-1}(a-1))=b^{-1}a-a(a-1)^{-1}b^{-1}(a-1)

=b^{-1}a-(a-1+1)(a-1)^{-1}b^{-1}(a-1)=b^{-1}a-b^{-1}(a-1)-(a-1)^{-1}b^{-1}(a-1)

=b^{-1}-(a-1)^{-1}b^{-1}(a-1). \ \Box
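
For the skeptical reader, here is a quick numerical check of Hua's identity — my addition. I model the real quaternions by their standard 2 \times 2 complex matrices and take a=\bold{i}, \ b=\bold{j}, so that ab \ne ba.

    import numpy as np

    def quat(a, b, c, d):
        # a+bi+cj+dk as a 2x2 complex matrix
        return np.array([[a + b*1j, c + d*1j], [-c + d*1j, a - b*1j]])

    inv = np.linalg.inv
    one = quat(1, 0, 0, 0)
    a, b = quat(0, 1, 0, 0), quat(0, 0, 1, 0)   # a = i, b = j, so ab = k != -k = ba
    t = inv(a - one) @ inv(b) @ (a - one)       # (a-1)^{-1} b^{-1} (a-1)
    rhs = (inv(b) - t) @ inv(inv(a) @ inv(b) @ a - t)
    assert np.allclose(a, rhs)                  # Hua's identity recovers a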

Cartan-Brauer-Hua Theorem. Let D_1 be a division ring, and let D_2 be a proper subdivision ring of D_1. If D_2^{\times} is normal in D_1^{\times}, then D_2 \subseteq Z(D_1).

Proof. Suppose, to the contrary, that D_2 \nsubseteq Z(D_1). Let a \in D_1, b \in D_2 such that ab \ne ba. Then, since D_2^{\times} is a normal subgroup of D_1^{\times}, all the elements

b^{-1}, \ (a-1)^{-1}b^{-1}(a-1), \ a^{-1}b^{-1}a

are in D_2^{\times} and hence, by Hua's identity, a \in D_2. So every element of D_1 \setminus D_2 commutes with b. Now let c \in D_1 \setminus D_2. Then ac \in D_1 \setminus D_2, because a \in D_2, and therefore both c and ac commute with b. But then a=(ac)c^{-1} will also commute with b, and that's a contradiction. \ \Box

Note. There are other proofs of the Theorem; for example, here is a simple but not very well-known one.

Throughout this post, k is a field and k^{\times}:=k \setminus \{0\}, the multiplicative group of k.

We have already seen the general linear group \text{GL}(n,k) and the special linear group \text{SL}(n,k) in this blog several times. The general linear group \text{GL}(n,k) is the (multiplicative) group of all n \times n invertible matrices with entries from k, and the special linear group \text{SL}(n,k) is the subgroup of \text{GL}(n,k) consisting of all matrices with determinant 1.

Since the determinant is multiplicative, the map f: \text{GL}(n,k) \to k^{\times} defined by f(A)=\det A is an onto group homomorphism and \ker f = \text{SL}(n,k). In particular, \text{SL}(n,k) is a normal subgroup of \text{GL}(n,k).

Question. Is f=\det the only group homomorphism \text{GL}(n,k) \to k^{\times}?

Answer. No. For example, the map g: \text{GL}(n,\mathbb{C}) \to \mathbb{C}^{\times} defined by g(A)=\overline{\det A}, where \overline{\det A} is the complex conjugate of \det A, is clearly a group homomorphism. In general, if h: k^{\times} \to k^{\times} is any group homomorphism, then hf: \text{GL}(n,k) \to k^{\times} is also a group homomorphism. In this post, we show that there are no other group homomorphisms \text{GL}(n,k) \to k^{\times}. But first, we need a useful little result from basic group theory.
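
As a tiny sanity check of the example above — mine, purely illustrative — here is a numerical verification that g(A)=\overline{\det A} is multiplicative on a pair of random complex matrices.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    g = lambda M: np.conj(np.linalg.det(M))     # g(A) = conjugate of det(A)
    assert np.isclose(g(A @ B), g(A) * g(B))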

Lemma. Let G_1,G_2 be groups, and let f,g: G_1 \to G_2 be group homomorphisms. If f is onto and \ker f \subseteq \ker g, then there exists a group homomorphism h: G_2 \to G_2 such that g=hf.

Proof. You can prove that directly by showing that the map h defined by h(f(x))=g(x) is a well-defined group homomorphism from G_2 to G_2. Here is another way. Let K_1:=\ker f and K_2:=\ker g. Since f is onto, the map \alpha : G_1/K_1 \to G_2 defined by \alpha(xK_1)=f(x), \ x \in G_1, is a group isomorphism. Since K_1 \subseteq K_2, the map \beta : G_1/K_1 \to G_1/K_2 defined by \beta(xK_1)=xK_2, \ x \in G_1, is a well-defined group homomorphism. Finally, we have the injective group homomorphism \gamma : G_1/K_2 \to G_2 defined by \gamma(xK_2)=g(x), \ x \in G_1. Now, let h:=\gamma \beta \alpha^{-1}. Then h: G_2 \to G_2 is a group homomorphism and for all x \in G_1,

hf(x)=\gamma \beta \alpha^{-1}(f(x))=\gamma \beta(xK_1)=\gamma(xK_2)=g(x). \ \Box

Theorem. A map g: \text{GL}(n,k) \to k^{\times} is a group homomorphism if and only if g=hf, where f: \text{GL}(n,k) \to k^{\times} is defined by f(A)=\det A, \ A \in \text{GL}(n,k), and h: k^{\times} \to k^{\times} is any group homomorphism.

Proof. If h: k^{\times} \to k^{\times} is a group homomorphism, then obviously g=hf: \text{GL}(n,k) \to k^{\times} is a group homomorphism. Conversely, suppose now that g: \text{GL}(n,k) \to k^{\times} is a group homomorphism. We consider two cases.

Case 1: |k|=2. In this case, k^{\times}=(1) and so the only group homomorphism \text{GL}(n,k) \to k^{\times} or k^{\times} \to k^{\times} is the trivial one. Thus f,g,h in this case are all trivial maps.

Case 2: |k| > 2. Let A,B \in \text{GL}(n,k). Then, since the image of g is an abelian group,

g(ABA^{-1}B^{-1})=g(A)g(B)g(A^{-1})g(B^{-1})=g(A)(g(A))^{-1}g(B)(g(B))^{-1}=1

and so ABA^{-1}B^{-1} \in \ker g. Therefore the commutator subgroup of \text{GL}(n,k) is contained in \ker g. On the other hand, by this post, the commutator subgroup of \text{GL}(n,k) is \text{SL}(n,k)=\ker f. So \ker f \subseteq \ker g and the result now follows from the Lemma. \ \Box

Note. For some variations and generalizations of the Theorem, see this paper.

If A is a square matrix with entries from a field of characteristic zero such that the trace of A^m is zero for all positive integers m, then A is nilpotent. This is a very well-known result in linear algebra, and there are at least two well-known proofs of it (see here for example), using the Vandermonde determinant and Newton's identities. In this post, I'm going to give a different proof. Since I have not seen this proof anywhere, I think I can claim it's mine! 🙂

Let F be a field, and let M_n(F) be the ring of n \times n matrices with entries from F. For A \in M_n(F), let \text{tr}(A) denote the trace of A. If A is nilpotent, then clearly A^m is also nilpotent for all positive integers m and hence \text{tr}(A^m)=0. The well-known fact we are now going to prove is that the converse is also true if the characteristic of F is zero.

Theorem. Let F be a field of characteristic zero, and let A \in M_n(F) such that \text{tr}(A^m)=0 for all integers m \ge 1. Then A is nilpotent.

Proof (Y. Sharifi). As we showed here as an application of Fitting’s lemma, there exists an integer 0 \le k \le n, an invertible matrix Y \in M_k(F) and a nilpotent matrix Z \in M_{n-k}(F) such that A is similar to a block diagonal matrix B=\begin{pmatrix}Y & 0 \\ 0 & Z\end{pmatrix}. If k=0, then B, hence A, is nilpotent and we are done. We now suppose that k \ge 1 and show that this case is impossible. For any positive integer m, we have

0=\text{tr}(A^m)=\text{tr}(B^m)=\text{tr}(Y^m)+\text{tr}(Z^m)=\text{tr}(Y^m). \ \ \ \ \ \ \ \ (*)

Let p(t)=t^k+c_{k-1}t^{k-1}+ \cdots + c_1t+c_0, \ c_i \in F, be the characteristic polynomial of Y. Note that since Y is invertible, c_0 \ne 0. By Cayley-Hamilton, Y^k+c_{k-1}Y^{k-1} + \cdots + c_1Y+c_0I=0, where I is the k \times k identity matrix. Thus taking the trace of both sides and using (*) gives kc_0=0, which is not possible since c_0 \ne 0 and F has characteristic zero. \ \Box
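
Here is a small numerical illustration — my addition, and it only exercises the easy direction of the Theorem: a strictly upper triangular matrix is nilpotent, and the traces of all its powers vanish.

    import numpy as np

    rng = np.random.default_rng(1)
    # a strictly upper triangular 4x4 matrix is nilpotent: N^4 = 0
    N = np.triu(rng.standard_normal((4, 4)), k=1)
    assert np.allclose(np.linalg.matrix_power(N, 4), 0)
    assert all(abs(np.trace(np.linalg.matrix_power(N, m))) < 1e-9 for m in range(1, 9))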

Remark 1. The Theorem also holds true if F has positive characteristic p > n. To see that, look at the last sentence in the proof of the Theorem again. We got kc_0=0, \ c_0 \ne 0, which implies k1_F=0 and so p \mid k, which is not possible since k \le n < p.

Remark 2. The Theorem does not necessarily hold if F has positive characteristic p \le n. For example, choose A \in M_2(\mathbb{Z}_2) to be the identity matrix. Then \text{tr}(A^m)=0 for all positive integers m but A is not nilpotent.

The Theorem has many applications; let me give you one of them here.

The following problem was posted on the Art of Problem Solving a few weeks ago; you can see the problem and my solution (post #4) here. The proposer assumes that matrices have real entries but that is not necessary; the entries can come from any field of characteristic zero F. For any A,B \in M_n(F), we denote by [A,B] the additive commutator of A,B, i.e. [A,B]=AB-BA. Clearly [\ , \ ] is bilinear and \text{tr}([A,B])=0.

Problem (V. Brayman). Let F be a field of characteristic zero, and let \{A,B_k, \ k \ge 1\} \subset M_n(F) be such that [B_i,B_j]=A^{j-i} for all j > i \ge 1. Prove that A=0.

Solution (Y. Sharifi). First notice that for any positive integer k,

\text{tr}(A^k)=\text{tr}([B_1,B_{k+1}])=0,

and so, by the Theorem, A is nilpotent. Let m be the smallest positive integer such that A^m=0.
If m=1, we are done. Suppose now that m \ge 2. Since \dim_F M_n(F)=n^2 < \infty, the set \{B_k, \ k \ge 1\} is F-linearly dependent and so there exist an integer p \ge 2 and c_i \in F such that B_p=\sum_{i=1}^{p-1}c_iB_i. But then

\displaystyle \begin{aligned}A^{m-1}=[B_p,B_{m+p-1}]=[\sum_{i=1}^{p-1}c_iB_i,B_{m+p-1}]=\sum_{i=1}^{p-1}c_i[B_i,B_{m+p-1}]=\sum_{i=1}^{p-1}c_iA^{m+p-1-i}=0,\end{aligned}

because m+p-1-i \ge m for all 1 \le i \le p-1. But A^{m-1}=0 contradicts the minimality of m. So we can’t have m \ge 2. \ \Box

We remarked here that the composition of derivations of a ring need not be a derivation. In this post, we prove this simple yet interesting result that if R is a prime ring of characteristic \ne 2 and if \delta_1,\delta_2 are nonzero derivations of R, then \delta_1\delta_2 can never be a derivation of R. But before getting into the proof of that, let me remind the reader of a little fact that tends to bug many students.

n-Torsion Free vs. Characteristic \ne n. Let R be a ring, and n \ge 2 an integer. Recall that we say that R is n-torsion free if a \in R, \ na=0 implies a=0. We say that \text{char}(R)=n if n is the smallest positive integer such that na=0 for all a \in R. It is clear that if R is n-torsion free, then \text{char}(R) \ne n. The simple point I'd like to make here is that the converse is not always true. That will make a lot more sense if you look at its contrapositive, in fact the contrapositive of a stronger statement: na=0 for some 0 \ne a \in R does not always imply that nR=(0). However, the converse is true if R is prime. First, an example to show that the converse is not always true.

Example 1. Consider the ring R=\mathbb{Z}_n \oplus \mathbb{Z}, and a=(1,0) \in R. Then a \ne 0 and na=0, so R is not n-torsion free, but \text{char}(R)=0 \ne n.

Now let’s show that the converse is true if R is prime.

Example 2. Let n \ge 2 be an integer and R a prime ring. Suppose that na=0 for some 0 \ne a \in R. Then nR=(0), i.e. nr=0 for all r \in R.

Proof. We have (0)=(na)Rr=aR(nr) and so, since a \ne 0 and R is prime, nr=0. \ \Box

Let’s now get to the subject of this post.

Lemma 1. Let R be a ring, and let \delta_1,\delta_2 be derivations of R. Then \delta_1\delta_2 is a derivation of R if and only if

\delta_1(a)b\delta_2(c)+\delta_2(a)b\delta_1(c)=0,

for all a,b,c \in R.

Proof. Since \delta_1\delta_2 is clearly additive, it is a derivation if and only if it satisfies the product rule, i.e.

\delta_1\delta_2(bc)=\delta_1\delta_2(b)c+b\delta_1\delta_2(c). \ \ \ \ \ \ \ (1)

On the other hand, since \delta_1,\delta_2 are derivations of R, we also have

\delta_1\delta_2(bc)=\delta_1(\delta_2(b)c+b\delta_2(c))=\delta_1(\delta_2(b)c)+\delta_1(b\delta_2(c))

=\delta_1\delta_2(b)c+\delta_2(b)\delta_1(c)+\delta_1(b)\delta_2(c)+b\delta_1\delta_2(c). \ \ \ \ \ \ \ (2)

So we get from (1),(2) that

\delta_1(b)\delta_2(c)+\delta_2(b)\delta_1(c)=0. \ \ \ \ \ \ \ \ (3)

Replacing b by ab in (3) gives

0=\delta_1(ab)\delta_2(c)+\delta_2(ab)\delta_1(c)=(\delta_1(a)b+a\delta_1(b))\delta_2(c)+(\delta_2(a)b+a\delta_2(b))\delta_1(c)

=\delta_1(a)b\delta_2(c)+\delta_2(a)b\delta_1(c)+a(\delta_1(b)\delta_2(c)+\delta_2(b)\delta_1(c))

=\delta_1(a)b\delta_2(c)+\delta_2(a)b\delta_1(c), \ \ \ \ \ \ \ \text{by} \ (3). \ \Box

Corollary. Let R be a 2-torsion free semiprime ring, and let \delta be a derivation of R. Then \delta^2 is a derivation of R if and only if \delta=0.

Proof. Suppose that \delta^2 is a derivation of R and let a \in R. Then choosing \delta_1=\delta_2=\delta and c=a in Lemma 1 gives 2\delta(a)b\delta(a)=0, for all b \in R. So, since R is 2-torsion free, \delta(a)b\delta(a)=0, for all b \in R. Thus \delta(a)R\delta(a)=(0) and hence \delta(a)=0, because R is semiprime. \ \Box

Lemma 2. Let R be a prime ring, and let \delta be a derivation of R such that \delta(a)b=0 for all a \in R and some b \in R. Then either b=0 or \delta=0.

Proof. Since \delta(a)b=0 for all a \in R, we have \delta(ca)b=0 for all a,c \in R and so

0=\delta(ca)b=(\delta(c)a+c\delta(a))b=\delta(c)ab+c\delta(a)b=\delta(c)ab.

So \delta(c)Rb=(0) for all c \in R and hence, since R is prime, either b=0 or \delta(c)=0 for all c \in R. \ \Box

Remark. Lemma 2 remains true if we replace the condition \delta(a)b=0 by b\delta(a)=0. The proof is similar, just this time replace a by ac.

Theorem (Edward C. Posner, 1957). Let R be a prime ring of characteristic \ne 2, and let \delta_1, \delta_2 be derivations of R. Then \delta_1\delta_2 is a derivation of R if and only if \delta_1=0 or \delta_2=0.

Proof. First note that, by Example 2, the condition \text{char}(R) \ne 2 is the same as saying that R is 2-torsion free. Now, suppose that \delta_1\delta_2 is a derivation of R and let a,b,c \in R. Applying Lemma 1 to a, \delta_2(c)b, c gives

\delta_1(a)\delta_2(c)b\delta_2(c)+\delta_2(a)\delta_2(c)b\delta_1(c)=0.

But by the identity (3) in Lemma 1, \delta_1(a)\delta_2(c)=-\delta_2(a)\delta_1(c) and so the above becomes

\delta_2(a)(\delta_2(c)b\delta_1(c)-\delta_1(c)b\delta_2(c))=0.

Thus, by Lemma 2, either \delta_2=0 or \delta_2(c)b\delta_1(c)-\delta_1(c)b\delta_2(c)=0. If \delta_2=0, we are done. So suppose that

\delta_2(c)b\delta_1(c)-\delta_1(c)b\delta_2(c)=0.

Adding the above identity to the identity \delta_1(c)b\delta_2(c)+\delta_2(c)b\delta_1(c)=0, which holds by Lemma 1, gives 2\delta_2(c)b\delta_1(c)=0. Hence \delta_2(c)b\delta_1(c)=0 and so \delta_2(c)R\delta_1(c)=(0). Thus, since R is prime, for each c \in R, either \delta_1(c)=0 or \delta_2(c)=0. Now suppose that \delta_1 \ne 0, and choose c_0 \in R with \delta_1(c_0) \ne 0, so that \delta_2(c_0)=0. If \delta_2(c) \ne 0 for some c \in R, then \delta_1(c)=0 and so \delta_1(c+c_0)=\delta_1(c_0) \ne 0, which gives \delta_2(c)=\delta_2(c+c_0)=0, a contradiction. Thus either \delta_1=0 or \delta_2=0. \ \Box
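
To see the Theorem in action — my addition — take the prime ring M_2(\mathbb{Z}) (characteristic zero) and the nonzero inner derivations induced by the matrix units e_{12} and e_{21}; the Theorem says their composition cannot be a derivation, and indeed the product rule already fails at a concrete pair of matrices.

    import numpy as np

    r1 = np.array([[0, 1], [0, 0]])             # e12
    r2 = np.array([[0, 0], [1, 0]])             # e21
    d1 = lambda m: r1 @ m - m @ r1              # inner derivation induced by e12
    d2 = lambda m: r2 @ m - m @ r2              # inner derivation induced by e21
    a = np.array([[1, 2], [3, 4]])
    b = np.array([[0, 1], [1, 1]])
    lhs = d1(d2(a @ b))
    rhs = d1(d2(a)) @ b + a @ d1(d2(b))
    assert not np.array_equal(lhs, rhs)         # d1 d2 fails the product rule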

Example 3. The condition \text{char}(R) \ne 2 cannot be removed from the Theorem. Consider the polynomial ring R:=\mathbb{Z}_2[x], and the derivation \delta:=\frac{d}{dx}. Then \delta \ne 0 but \delta^2=0 is a derivation.
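
A quick sympy check of Example 3 on a sample polynomial — my addition; the coefficients are reduced mod 2 by passing modulus=2.

    from sympy import symbols, Poly

    x = symbols('x')
    p = Poly(x**5 + x**3 + x + 1, x, modulus=2)
    dp = p.diff(x)                 # = x^4 + x^2 + 1 over GF(2), so delta != 0
    assert not dp.is_zero
    assert dp.diff(x).is_zero      # the second derivative vanishes mod 2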

Note. Examples and the Corollary in this post are mine. I have also slightly simplified Posner’s proof of the Theorem.

All rings in this post are commutative with identity. For the basics on derivations of rings see this post and this post.

Let \delta be a derivation of a ring R. Since \delta is additive, \delta(na)=n\delta(a) for all integers n and all a \in R. So every derivation of a ring is a \mathbb{Z}-derivation. If R is a \mathbb{Q}-algebra and r=\frac{m}{n} \in \mathbb{Q}, where m,n are integers and n \ne 0, then n\delta(ra)=\delta(nra)=\delta(ma)=m\delta(a) and so \delta(ra)=r\delta(a). Thus every derivation of a \mathbb{Q}-algebra is a \mathbb{Q}-derivation.

Definition. We say that a derivation \delta of a ring R is locally nilpotent if for every a \in R, there exists a positive integer n such that \delta^n(a)=0.

Example. Let R be the polynomial ring \mathbb{C}[x]. Then the derivation \delta:=\frac{d}{dx} is locally nilpotent because if p(x) \in R has degree n, then \delta^{n+1}(p(x))=0.

The following Theorem characterizes all \mathbb{Q}-algebras R for which there exists a locally nilpotent derivation \delta and a \in R such that \delta(a)=1. The polynomial ring in the above Example gives one of those algebras since, in that example, \delta(x)=1. It turns out that any such algebra is a polynomial ring!

Theorem. Let \delta be a locally nilpotent derivation of a \mathbb{Q}-algebra R, and let K:=\ker \delta. If there exists x \in R such that \delta(x)=1, then x is transcendental over K and R=K[x].

Proof. Suppose, to the contrary, that x is algebraic over K, and let n be the smallest positive integer such that \alpha_nx^n+\alpha_{n-1}x^{n-1}+ \cdots + \alpha_1x+\alpha_0=0 for some \alpha_i \in K, \ \alpha_n \ne 0. Then, since by the product rule, \delta(x^i)=ix^{i-1}\delta(x)=ix^{i-1}, we have

0=\delta( \alpha_nx^n+\alpha_{n-1}x^{n-1}+ \cdots + \alpha_1x+\alpha_0)= \alpha_n\delta(x^n)+\alpha_{n-1}\delta(x^{n-1})+ \cdots + \alpha_1\delta(x)

=n\alpha_n x^{n-1}+(n-1)\alpha_{n-1}x^{n-2}+ \cdots + \alpha_1,

which contradicts the minimality of n. So x is transcendental over K. We now show that R=K[x]. For a \in R, let \nu(a) be the smallest positive integer m such that \delta^m(a)=0. The proof is by induction over \nu(a), \ a \in R. If \nu(a)=1, then \delta(a)=0 and so a \in K \subset K[x]. Suppose now that a \in R and m:=\nu(a) \ge 2. Let

\displaystyle y:=\sum_{n=0}^{m-1}\frac{(-1)^n\delta^n(a)x^n}{n!}=a-bx,

where

\displaystyle b=\sum_{n=1}^{m-1}\frac{(-1)^{n-1}\delta^n(a)x^{n-1}}{n!}.

So a=y+bx and so we are done if we prove that b,y \in K[x].

Claim 1: y \in K.

Proof. We have

\displaystyle \delta(y)=\sum_{n=0}^{m-1}\frac{(-1)^n}{n!}\delta(\delta^n(a)x^n)=\sum_{n=0}^{m-1}\frac{(-1)^n}{n!}(\delta^{n+1}(a)x^n+\delta^n(a)\delta(x^n))

\displaystyle =\sum_{n=0}^{m-1}\frac{(-1)^n}{n!}(\delta^{n+1}(a)x^n+n\delta^n(a)x^{n-1})=\sum_{n=0}^{m-1}\frac{(-1)^n\delta^{n+1}(a)x^n}{n!}+\sum_{n=1}^{m-1}\frac{(-1)^n\delta^n(a)x^{n-1}}{(n-1)!}

\displaystyle =\sum_{n=0}^{m-2}\frac{(-1)^n\delta^{n+1}(a)x^n}{n!}+\sum_{n=0}^{m-2}\frac{(-1)^{n+1}\delta^{n+1}(a)x^n}{n!}=0

and so y \in K.

Claim 2: b \in K[x].

Proof. By the Leibniz formula,

\displaystyle \delta^{m-1}(b)=\sum_{n=1}^{m-1}\frac{(-1)^{n-1}}{n!}\delta^{m-1}(\delta^n(a)x^{n-1})=\sum_{n=1}^{m-1}\frac{(-1)^{n-1}}{n!}\sum_{k=0}^{m-1}\binom{m-1}{k}\delta^{n+k}(a)\delta^{m-k-1}(x^{n-1}).

Now notice that \delta^{n+k}(a)\delta^{m-k-1}(x^{n-1})=0 for all k,n because if n+k \ge m, then \delta^{n+k}(a)=0, and if n+k < m, then \delta^{m-k-1}(x^{n-1})=0. Thus \delta^{m-1}(b)=0 and so \nu(b) < \nu(a)=m. Hence, by our induction hypothesis, b \in K[x]. \ \Box
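
Here is a concrete check of the construction in the proof — my addition — for R=\mathbb{Q}[x], \ \delta=\frac{d}{dx} and a=x^3, so that m=\nu(a)=4.

    from sympy import symbols, diff, factorial, expand, simplify

    x = symbols('x')
    a, m = x**3, 4                 # delta^4(a) = 0 and delta^3(a) = 6 != 0, so nu(a) = 4
    y = sum((-1)**n * diff(a, x, n) * x**n / factorial(n) for n in range(m))
    b = sum((-1)**(n - 1) * diff(a, x, n) * x**(n - 1) / factorial(n) for n in range(1, m))
    assert simplify(diff(y, x)) == 0            # Claim 1: y is in K = ker(delta)
    assert expand(y + b * x - a) == 0           # a = y + b x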

Exercise. Let \delta be a locally nilpotent derivation of a \mathbb{Q}-algebra R, and let c \in R. Let K:=\ker \delta, and define the map f: R \to R by

\displaystyle f(a):=\sum_{n=0}^{\infty}\frac{\delta^n(a)c^n}{n!},

for all a \in R. Show that f is a K-algebra homomorphism.
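
The Exercise is illustrated nicely by R=\mathbb{Q}[x] and \delta=\frac{d}{dx}: there the map f is just the shift p(x) \mapsto p(x+c), by Taylor's formula. Below is a small sympy check of this — my addition; the truncation bound N is an assumption that comfortably exceeds the degrees involved.

    from sympy import symbols, diff, factorial, expand

    x = symbols('x')
    c, N = 3, 12                   # shift by c; N exceeds every degree used below
    f = lambda p: expand(sum(diff(p, x, n) * c**n / factorial(n) for n in range(N)))
    p, q = x**2 + 1, x**3 - 2*x
    assert f(p * q) == expand(f(p) * f(q))      # f is multiplicative
    assert f(p) == expand(p.subs(x, x + c))     # and is exactly the shift map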

Note. The above Theorem is Theorem 2.8 here. The proof I've given is essentially the same as the one given there; I've just made it easier to follow.

See the first part of this post here. All rings in this post are assumed to have identity.

In the first part, we gave two examples of derivations on rings. We now give a couple of ways to make new derivations using given ones.

Example 1. Let R be a ring, and let c_1,c_2 \in Z(R), the center of R. If \delta_1,\delta_2 \in \text{Der}(R), then

i) c_1\delta_1+c_2\delta_2 \in \text{Der}(R),

ii) \delta_1\delta_2-\delta_2\delta_1 \in \text{Der}(R).

Proof. The first part is quite straightforward and so I’m going to leave it as an easy exercise. For the second part, we need to show that \delta_1\delta_2-\delta_2\delta_1 is additive and satisfies the product rule. So let a,b \in R. Then

\begin{aligned} (\delta_1\delta_2-\delta_2\delta_1)(a+b)=\delta_1\delta_2(a+b)-\delta_2\delta_1(a+b)= \delta_1(\delta_2(a)+\delta_2(b))-\delta_2(\delta_1(a)+\delta_1(b))\end{aligned}

=\delta_1\delta_2(a)+\delta_1\delta_2(b)-\delta_2\delta_1(a)-\delta_2\delta_1(b)=(\delta_1\delta_2-\delta_2\delta_1)(a)+(\delta_1\delta_2-\delta_2\delta_1)(b),

which proves that \delta_1\delta_2-\delta_2\delta_1 is additive. Now, for the product rule, we have

\begin{aligned}(\delta_1\delta_2-\delta_2\delta_1)(ab)=\delta_1(\delta_2(ab))-\delta_2(\delta_1(ab))=\delta_1(\delta_2(a)b+a\delta_2(b))-\delta_2(\delta_1(a)b+a\delta_1(b))\end{aligned}

=\delta_1(\delta_2(a)b)+\delta_1(a\delta_2(b))-\delta_2(\delta_1(a)b)-\delta_2(a\delta_1(b))

\begin{aligned} =\delta_1\delta_2(a)b+\delta_2(a)\delta_1(b)+\delta_1(a)\delta_2(b)+a\delta_1\delta_2(b)-\delta_2\delta_1(a)b-\delta_1(a)\delta_2(b)-\delta_2(a)\delta_1(b)-a\delta_2\delta_1(b)\end{aligned}

=\delta_1\delta_2(a)b+a\delta_1\delta_2(b)-\delta_2\delta_1(a)b-a\delta_2\delta_1(b)=(\delta_1\delta_2-\delta_2\delta_1)(a)b+a(\delta_1\delta_2-\delta_2\delta_1)(b),

proving that \delta_1\delta_2-\delta_2\delta_1 satisfies the product rule. \ \Box

Remark. By Example 1, ii), the commutator of two derivations is a derivation. However, the composition of two derivations need not be a derivation. For example, let R be the polynomial ring \mathbb{C}[x] and consider the derivation \delta:=\frac{d}{dx}. Then the second derivative, i.e. \delta^2, is not a derivation because if it was, then we would have \delta^2(x^2)=2x\delta^2(x)=0 but we know that \delta^2(x^2)=2.

Example 2. Let M_n(R) be the ring of n \times n matrices with entries from a ring R. Let \delta \in \text{Der}(R). Define the map \Delta: M_n(R) \to M_n(R) as follows: for any A=[a_{ij}] \in M_n(R), where a_{ij} is the (i,j)-entry of A, define \Delta(A)=[\delta(a_{ij})]. Then \Delta \in \text{Der}(M_n(R)).

Proof. Since \delta is additive, \Delta is clearly additive too. So we only need to show that \Delta satisfies the product rule. Let A=[a_{ij}], \ B=[b_{ij}] \in M_n(R). Let c_{ij},d_{ij} be, respectively, the (i,j)-entry of AB and the (i,j)-entry of \Delta(AB). Then c_{ij}=\sum_{k=1}^na_{ik}b_{kj} and so

\displaystyle \begin{aligned}  d_{ij}=\delta(c_{ij})=\sum_{k=1}^n\delta(a_{ik}b_{kj})=\sum_{k=1}^n(\delta(a_{ik})b_{kj}+a_{ik}\delta(b_{kj}))=\sum_{k=1}^n\delta(a_{ik})b_{kj}+\sum_{k=1}^na_{ik}\delta(b_{kj})\end{aligned},

which is the (i,j)-entry of \Delta(A)B+A\Delta(B). So \Delta(AB)= \Delta(A)B+A\Delta(B) hence the result. \ \Box
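
Here is a quick sympy check of Example 2 — my addition — with R=\mathbb{Q}[x], \ \delta=\frac{d}{dx} and n=2.

    from sympy import symbols, Matrix, diff, zeros

    x = symbols('x')
    A = Matrix([[x, 1], [x**2, 3*x]])
    B = Matrix([[2*x, x**3], [0, x + 1]])
    D = lambda M: M.applyfunc(lambda e: diff(e, x))   # Delta: apply delta entrywise
    assert (D(A*B) - (D(A)*B + A*D(B))).expand() == zeros(2, 2)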

Next, we prove a familiar formula from calculus: the Leibniz formula for the nth derivative of a product of two elements.

The Leibniz Formula. Let R be a ring, a,b \in R, and \delta \in \text{Der}(R). Then

\displaystyle \delta^n(ab)=\sum_{k=0}^n\binom{n}{k}\delta^k(a)\delta^{n-k}(b),

for all integers n \ge 0. Here \delta^0 is defined to be the identity map.

Proof. The proof is by induction over n. There is nothing to prove for n=0. Now, assuming that the formula holds for n, we have

\displaystyle \delta^{n+1}(ab)=\delta(\delta^n(ab))=\delta \left( \sum_{k=0}^n\binom{n}{k}\delta^k(a)\delta^{n-k}(b)\right)=\sum_{k=0}^n\binom{n}{k}\delta(\delta^k(a)\delta^{n-k}(b))

\displaystyle =\sum_{k=0}^n\binom{n}{k}(\delta^{k+1}(a)\delta^{n-k}(b)+\delta^k(a)\delta^{n-k+1}(b)), \ \ \ \ \ \ \text{by the product rule}

\displaystyle =\sum_{k=0}^n\binom{n}{k}\delta^{k+1}(a)\delta^{n-k}(b)+\sum_{k=0}^n\binom{n}{k}\delta^k(a)\delta^{n-k+1}(b)

\displaystyle =\sum_{k=1}^{n+1}\binom{n}{k-1}\delta^k(a)\delta^{n-k+1}(b)+\sum_{k=0}^n\binom{n}{k}\delta^k(a)\delta^{n-k+1}(b)

\displaystyle =\sum_{k=1}^{n+1}\binom{n}{k-1}\delta^k(a)\delta^{n-k+1}(b)+\sum_{k=1}^{n+1}\binom{n}{k}\delta^k(a)\delta^{n-k+1}(b)+a\delta^{n+1}(b)

\displaystyle =\sum_{k=1}^{n+1}\left[\binom{n}{k-1}+\binom{n}{k}\right]\delta^k(a)\delta^{n-k+1}(b)+a\delta^{n+1}(b)

\displaystyle =\sum_{k=1}^{n+1}\binom{n+1}{k}\delta^k(a)\delta^{n-k+1}(b)+a\delta^{n+1}(b)=\sum_{k=0}^{n+1}\binom{n+1}{k}\delta^k(a)\delta^{n-k+1}(b),

which proves the formula for n+1. \ \Box
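
A quick spot-check of the formula — my addition — for \delta=\frac{d}{dx} on \mathbb{Q}[x] and n=4.

    from sympy import symbols, diff, binomial, expand

    x = symbols('x')
    a, b, n = x**3 + x, x**4 - 2*x**2 + 5, 4
    lhs = diff(a * b, x, n)
    rhs = sum(binomial(n, k) * diff(a, x, k) * diff(b, x, n - k) for k in range(n + 1))
    assert expand(lhs - rhs) == 0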

Let me end this post with an example which is Remark 1.6.30 in Louis Rowen’s book Ring Theory (volume 1).

Example 3. Let R be a ring, and \delta \in \text{Der}(R). In the first part of this post, we showed that \ker \delta is a subring of R. Now, for any integer n \ge 0, let

K_n:=\ker \delta^{n+1}=\{a \in R: \ \delta^{n+1}(a)=0\}.

So \ker \delta =K_0. Then K:=\bigcup_{n=0}^{\infty}K_n is a subring of R and K_mK_n \subseteq K_{m+n}, for all m,n.

Proof. It is clear that (K_n,+) is an abelian group for all n, and K_0 \subseteq K_1 \subseteq \cdots. So (K,+) is an abelian group, and we only need to show that K_mK_n \subseteq K_{m+n} for all m,n. Let a \in K_m and b \in K_n, i.e. \delta^{m+1}(a)=\delta^{n+1}(b)=0. We want to show that ab \in K_{m+n}. By the Leibniz formula,

\displaystyle \delta^{m+n+1}(ab)=\sum_{k=0}^{m+n+1}\binom{m+n+1}{k}\delta^k(a)\delta^{m+n+1-k}(b).

Now, if k \ge m+1, then \delta^k(a)=0 and if k \le m, then m+n+1-k \ge n+1 and therefore \delta^{m+n+1-k}(b)=0. Thus \delta^{m+n+1}(ab)=0 and so ab \in K_{m+n}. \ \Box
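
For a concrete picture — my addition — take R=\mathbb{Q}[x] and \delta=\frac{d}{dx}; then K_n is the set of polynomials of degree at most n, and K_mK_n \subseteq K_{m+n} is just the additivity of degrees.

    from sympy import symbols, diff

    x = symbols('x')
    p, q = x**2 + 1, x**3 - x      # p is in K_2 and q is in K_3
    assert diff(p * q, x, 2 + 3 + 1) == 0       # so p q is in K_5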

All rings in this post are assumed to have identity and Z(R) denotes the center of a ring R.

Those who have the patience to follow my posts have seen derivations of rings several times, but they have not seen a post exclusively about the basics of derivations of rings because, strangely, that post didn't exist until now.

In the polynomial ring \mathbb{C}[x], we have the familiar concept of differentiation with respect to x, i.e. \delta:= \frac{d}{dx}. This is a map from \mathbb{C}[x] to \mathbb{C}[x] which is additive and satisfies the product rule. Also, \delta(c)=0 for all c \in \mathbb{C}. We now extend this concept to rings in general.

Definition 1. Let R be a ring. A derivation of R is any additive map \delta : R \to R that satisfies the product rule: \delta(ab)=\delta(a)b+a\delta(b), for all a,b \in R. We denote by \text{Der}(R) the set of all derivations of R.

We can now extend the familiar concept of differentiating polynomials to differentiating polynomials over any ring R. Note that the variable x is assumed to be in the center of R[x].

Example 1. Let R be a ring, and let R[x] be the ring of polynomials over R. Define the map \delta : R[x] \to R[x] by

\displaystyle \delta\left(\sum_{k=0}^na_kx^k\right)=\sum_{k=0}^nka_kx^{k-1}, \ \ \ \ \ n \in \mathbb{Z}_{\ge 0}, \ \ a_k \in R.

Then \delta \in \text{Der}(R[x]). The derivation \delta is usually denoted by \frac{d}{dx}. Note that, for the definition to make sense, the term ka_kx^{k-1} for k=0 is defined to be 0 and x^0 is defined to be 1.

Proof. We need to show that \delta is additive and also it satisfies the product rule. Let p(x), q(x) \in R[x]. So we can write p(x)=\sum_{k=0}^na_kx^k, \ q(x)=\sum_{k=0}^nb_kx^k. Then

\displaystyle \delta(p(x)+q(x))=\delta \left(\sum_{k=0}^n(a_k+b_k)x^k\right)=\sum_{k=0}^nk(a_k+b_k)x^{k-1}

\displaystyle =\sum_{k=0}^nka_kx^{k-1}+\sum_{k=0}^nkb_kx^{k-1}=\delta(p(x))+\delta(q(x)).

To prove the product rule, we have

\displaystyle \delta(p(x)q(x))=\delta \left(\sum_{k=0}^{2n}\sum_{j=0}^ka_jb_{k-j}x^k\right)=\sum_{k=0}^{2n}k\sum_{j=0}^ka_jb_{k-j}x^{k-1}, \ \ \ \ \ \ \ \ \ (*)

and

\displaystyle x(\delta(p(x))q(x)+p(x)\delta(q(x)))=x\left(\sum_{k=0}^nka_kx^{k-1}\sum_{k=0}^nb_kx^k+\sum_{k=0}^na_kx^k\sum_{k=0}^nkb_kx^{k-1}\right)

\displaystyle \begin{aligned}=\sum_{k=0}^nka_kx^k\sum_{k=0}^nb_kx^k+\sum_{k=0}^na_kx^k\sum_{k=0}^nkb_kx^k=\sum_{k=0}^{2n}\sum_{j=0}^kja_jb_{k-j}x^k+\sum_{k=0}^{2n}\sum_{j=0}^k(k-j)a_jb_{k-j}x^k\end{aligned}

\displaystyle =\sum_{k=0}^{2n}k\sum_{j=0}^ka_jb_{k-j}x^k=x \delta(p(x)q(x)), \ \ \ \ \ \text{by} \ (*).

Therefore, since x is central and not a zero divisor in R[x], we may cancel it and conclude that \delta(p(x)q(x))=\delta(p(x))q(x)+p(x)\delta(q(x)). \ \Box
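
Since the point of Example 1 is precisely that R need not be commutative, here is a small Python illustration — my addition — with R=M_2(\mathbb{Z}); a polynomial is modeled as its list of coefficient matrices, and the helper names pmul, pdiff, padd are mine.

    import numpy as np

    rng = np.random.default_rng(5)
    Z = np.zeros((2, 2), dtype=int)

    def pmul(p, q):                # convolution product of coefficient lists
        r = [Z.copy() for _ in range(len(p) + len(q) - 1)]
        for i, pi in enumerate(p):
            for j, qj in enumerate(q):
                r[i + j] = r[i + j] + pi @ qj
        return r

    def pdiff(p):                  # formal derivative: the list of k a_k, k >= 1
        return [k * p[k] for k in range(1, len(p))]

    def padd(p, q):
        n = max(len(p), len(q))
        return [(p[k] if k < len(p) else Z) + (q[k] if k < len(q) else Z) for k in range(n)]

    p = [rng.integers(-3, 4, (2, 2)) for _ in range(3)]   # a degree-2 polynomial over M_2(Z)
    q = [rng.integers(-3, 4, (2, 2)) for _ in range(3)]
    lhs, rhs = pdiff(pmul(p, q)), padd(pmul(pdiff(p), q), pmul(p, pdiff(q)))
    assert all(np.array_equal(u, v) for u, v in zip(lhs, rhs))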

Example 2. Let R be a ring, and let r \in R. Define the map \delta : R \to R by \delta(a)=ra-ar, for all a \in R. Then \delta \in \text{Der}(R) and it’s called an inner derivation.

Proof. We need to show that \delta is additive and also it satisfies the product rule. Let a,b \in R. Then

\delta(a+b)=r(a+b)-(a+b)r=ra-ar+rb-br=\delta(a)+\delta(b),

and

\begin{aligned} \delta(a)b+a\delta(b)=(ra-ar)b+a(rb-br)=rab-arb+arb-abr=rab-abr=\delta(ab). \ \Box \end{aligned}

Remarks. Let R be a ring, and let \delta \in \text{Der}(R). Let K:=\ker \delta=\{a \in R: \ \delta(a)=0\}.

i) \{0,1\} \subseteq K,

ii) K is a subring of R called the ring of constants of \delta,

iii) if a \in K is invertible in R, then a^{-1} \in K; so if R is a division ring, K is a division ring too,

iv) if r \in R, then \delta(ra)=r\delta(a) for all a \in R if and only if r \in K,

v) \delta(Z(R)) \subseteq Z(R),

vi) if \delta(a) \in Z(R), then \delta(a^n)=na^{n-1}\delta(a) for all integers n \ge 1, where, for n=1, we define a^0=1,

vii) The identity given in vi) may not hold true if \delta(a) \notin Z(R).

Proof. i) Since \delta is additive, \delta(0)=\delta(0+0)=\delta(0)+\delta(0)=2\delta(0), and so \delta(0)=0, hence 0 \in K. Also, by product rule, \delta(1)=\delta(1 \cdot 1)=\delta(1) \cdot 1 + 1 \cdot \delta(1)=2\delta(1), and so \delta(1)=0 hence 1 \in K.

ii) If a,b \in \ker \delta, then since \delta is additive, \delta(a+b)=\delta(a)+\delta(b)=0+0=0 and so a+b \in \ker \delta. Also, by product rule, \delta(ab)=\delta(a)b+a\delta(b)=0+0=0 and so ab \in \ker \delta.

iii) By i) and product rule, 0=\delta(1)=\delta(aa^{-1})=\delta(a)a^{-1}+a\delta(a^{-1})=a\delta(a^{-1}) and so \delta(a^{-1}) =0.

iv) If \delta(ra)=r\delta(a), for some r \in R and all a \in R, then a=1 gives \delta(r)=r\delta(1)=0 and so r \in K. Conversely, if r \in K, \ a \in R, then, by product rule, \delta(ra)=\delta(r)a+r\delta(a)=0+r\delta(a)=r\delta(a).

v) Let a \in Z(R), \ b \in R. Then

\delta(a)b+a\delta(b)=\delta(ab)=\delta(ba)=\delta(b)a+b\delta(a)=a\delta(b)+b\delta(a),

and so \delta(a)b=b\delta(a), proving that \delta(a) \in Z(R).

vi) First notice that, by v), the condition \delta(a) \in Z(R) is weaker than the condition a \in Z(R). Now, the proof is by induction over n. It is clear for n=1. Assuming the identity holds for n, we have

\begin{aligned} \delta(a^{n+1})=\delta(a^na)=\delta(a^n)a+a^n\delta(a)=na^{n-1}\delta(a)a+a^n\delta(a)=na^n\delta(a)+a^n\delta(a)=(n+1)a^n\delta(a).\end{aligned}

vii) Consider R:=M_2(\mathbb{C}), the ring of 2 \times 2 matrices with complex entries. Let \delta be the inner derivation (see Example 2) of R corresponding to r=\begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix}. Now, choose a=\begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}. Then a^2 is the identity matrix and so \delta(a^2)=0. Also,

\delta(a)=ra-ar=\begin{pmatrix}0 & 1 \\ -1 & 0\end{pmatrix} \notin Z(R), \ \ \  2a\delta(a) =\begin{pmatrix}-2 & 0 \\ 0 & 2\end{pmatrix} \ne 0=\delta(a^2). \ \Box
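
The computation in vii) is easy to confirm numerically — my addition:

    import numpy as np

    r = np.array([[1, 0], [0, 0]])
    a = np.array([[0, 1], [1, 0]])
    d = lambda m: r @ m - m @ r                 # the inner derivation induced by r
    assert np.array_equal(d(a @ a), np.zeros((2, 2), dtype=int))   # delta(a^2) = 0
    assert not np.array_equal(2 * (a @ d(a)), d(a @ a))            # 2a delta(a) != delta(a^2)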

Definition 2. Let R be a ring, \delta \in \text{Der}(R), and C a subring of R contained in the center of R. If C \subseteq \ker \delta, then, by Remark iv), \delta(ca+b)=c\delta(a)+\delta(b) for all c \in C,a,b \in R, i.e. \delta is C-linear. The set of all C-linear derivations of R is denoted by \text{Der}_C(R).

In part two of this post, we will learn more about derivations on rings. Happy \pi day!

Let M_n(\mathbb{R}) be the ring of n \times n matrices with real entries. A form of the following problem was posted on the Art of Problem Solving website a couple of days ago.

Problem. Show that if A,B \in M_n(\mathbb{R}) and A^2+B^2=AB, then \det(BA-AB) \ge 0.

Solution (Y. Sharifi). Let X=aA+B, \ Y=bA+B, where a=\frac{\sqrt{3}-1}{2}, \ b=\frac{-\sqrt{3}-1}{2}. Let i:=\sqrt{-1}. Then

(X+Yi)(X-Yi)=(aA+B+(bA+B)i)(aA+B-(bA+B)i)

=(a^2+b^2)A^2+2B^2+(a+b)(AB+BA)+(a-b)(BA-AB)i.

Thus, since a^2+b^2=2, \ a+b=-1, \ a-b=\sqrt{3}, and A^2+B^2=AB, we get that

\begin{aligned} (X+Yi)(X-Yi)=-(BA-AB)+\sqrt{3}(BA-AB)i=(-1+\sqrt{3}i)(BA-AB)=2e^{2\pi i/3}(BA-AB).\end{aligned}

Therefore

|\det(X+Yi)|^2=\det(X+Yi) \overline{\det(X+Yi)}=\det(X+Yi)\det(X-Yi)

=\det((X+Yi)(X-Yi))=\det(2e^{2\pi i/3}(BA-AB))=2^ne^{2n\pi i/3}\det(BA-AB),

which gives

\det(BA-AB)=2^{-n}e^{-2n\pi i/3}|\det(X+Yi)|^2. \ \ \ \ \ \ \ \ \ (*)

Now, if 3 \mid n, then e^{-2n\pi i/3}=1 and so (*) gives \det(BA-AB)=2^{-n}|\det(X+Yi)|^2 \ge 0. If 3 \nmid n, then e^{-2n\pi i/3} is not a real number. But both \det(BA-AB) and |\det(X+Yi)|^2 are real numbers, and hence, by (*), we must have \det(BA-AB)=0. So either way, \det(BA-AB) \ge 0. \ \Box
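
Note that the expansion of (X+Yi)(X-Yi) used above holds for arbitrary real matrices A,B; the relation A^2+B^2=AB only enters afterwards. Here is a quick numerical check of that expansion — my addition:

    import numpy as np

    rng = np.random.default_rng(4)
    A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
    a, b = (np.sqrt(3) - 1) / 2, (-np.sqrt(3) - 1) / 2
    X, Y = a * A + B, b * A + B
    lhs = (X + 1j * Y) @ (X - 1j * Y)
    rhs = (a**2 + b**2) * A @ A + 2 * B @ B + (a + b) * (A @ B + B @ A) \
          + 1j * (a - b) * (B @ A - A @ B)
    assert np.allclose(lhs, rhs)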

Rings in this post may or may not have identity. As always, the ring of n \times n matrices with entries from a ring R is denoted by M_n(R).

PI rings are arguably the most important generalization of commutative rings. This post is the first part of a series of posts about this fascinating class of rings.

Let C be a commutative ring with identity, and let C\langle x_1, \ldots ,x_n\rangle be the ring of polynomials in noncommuting indeterminates x_1, \ldots ,x_n and with coefficients in C. We will assume that each x_i commutes with every element of C. If n=1, then C\langle x_1, \ldots ,x_n\rangle is just the ordinary commutative polynomial ring C[x].
A monomial is an element of C\langle x_1, \ldots ,x_n\rangle which is of the form y_1y_2 \cdots y_k, where y_i \in \{x_1, \cdots , x_n\} for all i. The degree of a monomial y_1y_2 \cdots y_k is defined to be k. For example, x_1x_2x_1^3x_5^2 is a monomial of degree 7. So an element f \in C\langle x_1, \ldots ,x_n\rangle is a C-linear combination of monomials, and we say that f is monic if the coefficient of at least one of the monomials of the highest degree in f is 1. For example, x_1^2+x_2-3x_2x_3x_2+2x_4^3 is not monic because none of the monomials of the highest degree, i.e. x_2x_3x_2 and x_4^3, have coefficient 1, but x_1^2+x_2-3x_2x_3x_2+x_4^3 is monic.

Definition 1. A ring R is called a polynomial identity ring, or PI ring for short, if there exists a positive integer n and a monic polynomial f \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle such that f(r_1, \cdots , r_n)=0 for all r_i \in R. We then say that R satisfies f or f is an identity of R.

Definition 2. Let C be a commutative ring with identity, and let R be a C-algebra. If, in Definition 1, we replace \mathbb{Z} with C, we will get the definition of a PI algebra. So R is called a PI algebra if there exists a positive integer n and a monic polynomial f \in C\langle x_1, \ldots ,x_n\rangle such that f(r_1, \cdots , r_n)=0 for all r_i \in R. Note that since every ring is a \mathbb{Z}-algebra, every PI ring is a PI algebra.

Example 1. Every commutative ring R is a PI ring.

Proof. Since R is commutative, r_1r_2-r_2r_1=0, for all r_1,r_2 \in R, and therefore R satisfies the monic polynomial f=x_1x_2-x_2x_1. \ \Box

Remark 1. A PI ring could satisfy many polynomials. For example, a finite field of order q satisfies the polynomial in Example 1, because it is commutative, and it also satisfies the polynomial x^q-x. Another example is Boolean rings; they satisfy the polynomial in Example 1, because they are commutative, and they also satisfy the polynomial x^2-x.

Example 2. If C is a commutative ring with identity, then R:=M_2(C) is a PI ring.

Proof. Let A_1,A_2 \in R. Then \text{tr}(A_1A_2 - A_2A_1)=0 and so, by Cayley-Hamilton,

(A_1A_2 - A_2A_1)^2 = cI_2,

for some c \in C. Thus (A_1A_2-A_2A_1)^2 commutes with every element of R, i.e.,

(A_1A_2-A_2A_1)^2A_3-A_3(A_1A_2-A_2A_1)^2=0

for all A_1,A_2,A_3 \in R. Thus R satisfies the monic polynomial

f=(x_1x_2-x_2x_1)^2x_3 - x_3(x_1x_2 - x_2x_1)^2. \ \Box
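
Here is a randomized check — my addition — that M_2(\mathbb{Z}) satisfies the identity f above:

    import numpy as np

    rng = np.random.default_rng(2)
    for _ in range(100):
        A1, A2, A3 = (rng.integers(-5, 6, (2, 2)) for _ in range(3))
        C = A1 @ A2 - A2 @ A1                   # the commutator [A1, A2]
        assert np.array_equal(C @ C @ A3, A3 @ C @ C)   # [A1, A2]^2 is central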

Example 3. The division ring of real quaternions \mathbb{H} is a PI ring.

Proof. Recall that \mathbb{H}=\mathbb{R}+\mathbb{R}\bold{i}+\mathbb{R}\bold{j}+\mathbb{R}\bold{k}, where \bold{i}^2=\bold{j}^2=-1, \ \bold{ij}=-\bold{ji}=\bold{k}. Let

x=a+b\bold{i}+c\bold{j}+d\bold{k} \in \mathbb{H}, \ \ \ \ \ a,b,c,d \in \mathbb{R}.

It is easy to see that x^2-2ax+a^2+b^2+c^2+d^2=0. Thus, since a,b,c,d are in the center of \mathbb{H}, we get that y(x^2-2ax)=(x^2-2ax)y for all y \in \mathbb{H}. So yx^2-x^2y=2a(yx-xy) and hence, since a is central, (yx-xy)(yx^2-x^2y)=(yx^2-x^2y)(yx-xy) for all x,y \in \mathbb{H}. Therefore \mathbb{H} satisfies the monic polynomial f=(x_1x_2-x_2x_1)(x_1x_2^2-x_2^2x_1)-(x_1x_2^2-x_2^2x_1)(x_1x_2-x_2x_1). \ \Box
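
And a randomized numerical check of Example 3 — my addition — using the 2 \times 2 complex matrix model of \mathbb{H} again:

    import numpy as np

    rng = np.random.default_rng(3)
    def rand_quat():
        a, b, c, d = rng.standard_normal(4)
        return np.array([[a + b*1j, c + d*1j], [-c + d*1j, a - b*1j]])
    for _ in range(100):
        X, Y = rand_quat(), rand_quat()
        P = X @ Y - Y @ X                       # x1 x2 - x2 x1
        Q = X @ Y @ Y - Y @ Y @ X               # x1 x2^2 - x2^2 x1
        assert np.allclose(P @ Q, Q @ P)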

Remark 2. If R is a PI ring with identity, then R could satisfy a polynomial with a nonzero constant term. For example, the ring \mathbb{Z}/n\mathbb{Z} satisfies the polynomial in Example 1, and it also satisfies the polynomial f=(x-1)(x-2) \cdots (x-n)+n. Now suppose that R has no identity, and f \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle has a nonzero constant term m. So f=g + m, where g \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle has zero constant term. Now, what is f(r_1, \cdots , r_n), \ r_i \in R? Well, it is not defined, because it is supposed to be g(r_1, \cdots , r_n)+m1_R but 1_R does not exist. So if R has no identity, all identities of R must have zero constant terms.
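
A one-line check of the first claim in Remark 2 for n=6 — my addition:

    from math import prod

    n = 6
    # every r in Z/nZ is a root of (x-1)(x-2)...(x-n) + n modulo n
    assert all((prod(r - i for i in range(1, n + 1)) + n) % n == 0 for r in range(n))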

Example 4. Any subring or homomorphic image of a PI ring is a PI ring.

Proof. If R satisfies f, then obviously any subring of R satisfies f too. If S is a homomorphic image of R, then S \cong R/I for some ideal I of R. Now, it is clear that for any f(x_1, \cdots , x_n) \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle and any r_1, \cdots , r_n \in R, we have f(r_1+I, \cdots , r_n+I)=f(r_1, \cdots , r_n) + I. Hence if R satisfies f, then R/I satisfies f too. \ \Box

Example 5. If I is a nilpotent ideal of a ring R, and R/I is a PI ring, then R is a PI ring.

Proof. So I^m=(0) for some positive integer m. Now suppose that R/I satisfies a monic polynomial f(x_1, \cdots , x_n) \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle. Then

I=0_{R/I}=f(r_1+I, \cdots , r_n+I)=f(r_1, \cdots , r_n) + I

for all r_i \in R, and so f(r_1, \cdots , r_n) \in I. Thus (f(r_1, \cdots , r_n))^m \in I^m=(0) and hence R satisfies the monic polynomial f^m. \ \Box

Example 6. A finite direct product of PI rings is a PI ring.

Proof. By induction, we only need to prove it for a direct product of two PI rings. So let R_1, R_2 be PI rings that, respectively, satisfy f,g \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle, and let R:=R_1 \times R_2. It is clear that for any polynomial h \in \mathbb{Z}\langle x_1, \ldots ,x_n\rangle and (r_i,s_i) \in R, \ 1 \le i \le n, we have

h((r_1,s_1), \cdots , (r_n,s_n))=(h(r_1, \cdots , r_n),h(s_1, \cdots , s_n)).

Therefore

f((r_1,s_1), \cdots , (r_n,s_n))g((r_1,s_1), \cdots , (r_n,s_n))=(f(r_1, \cdots , r_n)g(r_1, \cdots , r_n), f(s_1, \cdots , s_n)g(s_1, \cdots , s_n))

=(0_{R_1},0_{R_2})=0_R,

because f(r_1, \cdots , r_n)=0_{R_1} and g(s_1, \cdots , s_n)=0_{R_2}. So R satisfies the monic polynomial fg. \ \Box

Exercise. Let R be a ring with the center C. Suppose that for every r \in R, there exist a,b,c \in C such that r^3+ar^2+br+c=0. Show that R is a PI ring.
Hint. See the proof of Example 3.

Note. The reference for this post is Section 1, Chapter 13 of the book Noncommutative Noetherian Rings by McConnell and Robson. Example 3 was added by me.