Essential Math for Data Science
Do you want more math for data science and machine learning? I just released my book "Essential Math for Data Science"🎉.

GET THE BOOK

Introduction

I can assure you that you will read this chapter in 2 minutes! It is a nice break after the last two chapters, which were quite long! We will see what the trace of a matrix is. It will be needed for the last chapter on Principal Component Analysis (PCA).

2.10 The Trace Operator

Figure: Calculating the trace of a matrix.

The trace is the sum of all the values on the diagonal of a square matrix.

$ \bs{A}= \begin{bmatrix} 2 & 9 & 8 \\ 4 & 7 & 1 \\ 8 & 2 & 5 \end{bmatrix} $

\(\Tr(\bs{A}) = 2 + 7 + 5 = 14\)

NumPy provides the function np.trace() to calculate it:

import numpy as np

A = np.array([[2, 9, 8], [4, 7, 1], [8, 2, 5]])
A
array([[2, 9, 8],
       [4, 7, 1],
       [8, 2, 5]])
A_tr = np.trace(A)
A_tr
14
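
Since the trace is just the sum of the diagonal, summing np.diag(A) gives the same value (a quick sanity check, not much more):

# Sum of the diagonal entries equals the trace
np.diag(A).sum()
14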

Goodfellow et al. explain that the trace can be used to specify the Frobenius norm of a matrix (see 2.5). The Frobenius norm is the equivalent of the $L^2$ norm for matrices. It is defined by:

\(\norm{\bs{A}}_F=\sqrt{\sum_{i,j}A^2_{i,j}}\)

Square all of the elements, sum them, and take the square root of the result. This norm can also be calculated with:

\(\norm{\bs{A}}_F=\sqrt{\Tr({\bs{AA}^T})}\)

We can check this. The first way, computing the norm directly, can be done with the command np.linalg.norm():

np.linalg.norm(A)
17.549928774784245

The Frobenius norm of $\bs{A}$ is 17.549928774784245.

With the trace the result is identical:

np.sqrt(np.trace(A.dot(A.T)))
17.549928774784245
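
As an extra sketch, we can also let NumPy compare the two results up to floating-point precision:

# Compare the direct norm with the trace-based formula
np.isclose(np.linalg.norm(A), np.sqrt(np.trace(A.dot(A.T))))
True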

Since transposing a matrix doesn't change its diagonal, the trace of a matrix is equal to the trace of its transpose: \[\Tr(\bs{A})=\Tr(\bs{A}^T)\]
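
A minimal check of this property, reusing the matrix A from above:

# Transposition leaves the diagonal unchanged, so both traces are equal
np.trace(A) == np.trace(A.T)
True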

Trace of a product

\[\Tr(\bs{ABC}) = \Tr(\bs{CAB}) = \Tr(\bs{BCA})\]

Example 1.

Let’s see an example of this property.

$ \bs{A}= \begin{bmatrix} 4 & 12 \\ 7 & 6 \end{bmatrix} $

$ \bs{B}= \begin{bmatrix} 1 & -3 \\ 4 & 3 \end{bmatrix} $

$ \bs{C}= \begin{bmatrix} 6 & 6 \\ 2 & 5 \end{bmatrix} $

A = np.array([[4, 12], [7, 6]])
B = np.array([[1, -3], [4, 3]])
C = np.array([[6, 6], [2, 5]])

np.trace(A.dot(B).dot(C))
531
np.trace(C.dot(A).dot(B))
531
np.trace(B.dot(C).dot(A))
531

$ \bs{ABC}= \begin{bmatrix} 360 & 432 \\ 180 & 171 \end{bmatrix} $

$ \bs{CAB}= \begin{bmatrix} 498 & 126 \\ 259 & 33 \end{bmatrix} $

$ \bs{BCA}= \begin{bmatrix} -63 & -54 \\ 393 & 594 \end{bmatrix} $

\(\Tr(\bs{ABC}) = \Tr(\bs{CAB}) = \Tr(\bs{BCA}) = 531\)

References

Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press.

Feel free to drop me an email or a comment. The syllabus of this series can be found in the introduction post. All the notebooks can be found on Github.

This content is part of a series following Chapter 2 on linear algebra from the Deep Learning Book by Goodfellow, I., Bengio, Y., and Courville, A. (2016). It aims to provide intuitions/drawings/Python code on mathematical theories and is constructed as my understanding of these concepts.

