Last time, we learned about a special matrix called the diagonal matrix. We saw that if a matrix is diagonal, everything becomes extremely easy to solve. So, is it possible to turn every matrix into a diagonal one? If not, are there any lucky matrices that we can convert?
Now you might be confused by what I said. What does it mean to convert a matrix? Well, in section 6-4, we learned that the same transformation can have different matrix representations on different bases. Maybe we can use this fact to our advantage and see what we can do.
Our goal is to see if we can find a specific basis such that the transformation (matrix) in this basis becomes a diagonal matrix. Obviously, this is not easy. I mean, how would we even start? Well, we can start from the diagonal matrix itself, since that seems easy.
If you have a diagonal matrix, what is the transformation that corresponds to it? Let's look at an example:
\[ \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
If you take this matrix and transform a vector \((x,y)\), you will get:
\[ \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ 2y \end{pmatrix} \]
If you look at the result, it is just the original vector but with the y-component doubled. In general, if you have a diagonal matrix:
\[ \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \lambda_1 x \\ \lambda_2 y \end{pmatrix} \]
If you look carefully, you will see that the transformation is nothing but a stretch or compression along the directions of the basis vectors (in this case, the x and y directions). But hold on, haven't we learned about something similar? The eigenvalues and eigenvectors!
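If you want to see this stretching behavior numerically, here is a minimal sketch using NumPy (the library choice and the sample vector \((3,4)\) are my own, just for illustration):

```python
import numpy as np

# The diagonal matrix from the example above.
D = np.array([[1, 0],
              [0, 2]])

v = np.array([3, 4])  # an arbitrary vector (x, y)

# The x-component is untouched, the y-component is doubled.
print(D @ v)  # [3 8]
```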
Let's do a quick review of section 7-1. For some matrices, you can find that along certain directions, the transformation does not change the direction of the input vector:
\[ A\vec{v} = \lambda\vec{v} \]
The vectors that lie along those special directions are called the eigenvectors of the matrix, and the coefficients \(\lambda\) that tell you how the vectors are stretched or compressed are called the eigenvalues.
You can sense that the pieces are coming together. Let's use a \(2\times2\) matrix as an example, say:
\[ A = \begin{pmatrix} 1 & 1 \\ 0 & 2 \end{pmatrix} \]
Following the steps from section 7-1, we find that the eigenvectors are:
\[ \vec{v}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \vec{v}_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \]
and the corresponding eigenvalues are \(1\) and \(2\).
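Assuming you have NumPy at hand, here is a quick sketch to double-check these eigenpairs (note that `np.linalg.eig` normalizes its eigenvectors, so they may differ from ours by a scale factor):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 2]])

# Check A v = lambda v directly for our two eigenvectors.
v1, v2 = np.array([1, 0]), np.array([1, 1])
print(np.allclose(A @ v1, 1 * v1))  # True
print(np.allclose(A @ v2, 2 * v2))  # True

# np.linalg.eig finds the same eigenvalues (its eigenvectors come
# back normalized, i.e. as scalar multiples of v1 and v2).
eigenvalues, _ = np.linalg.eig(A)
print(eigenvalues)  # [1. 2.]
```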
Here comes the brilliant idea: the eigenvectors are linearly independent, so these two vectors are enough to span the entire space. If we change the basis from the original Cartesian basis to the basis formed by the eigenvectors, what would happen?
Now the transformation has to be described in our new basis. But this is nothing but another linear transformation, and we know that the columns of a linear transformation matrix are just the basis vectors after the transformation, written in that same basis.
So what will the basis vectors be after the transformation? Well, our new basis vectors are all eigenvectors, so the transformation does nothing but multiply each of them by its corresponding eigenvalue:
\[ A\vec{v}_1 = 1\,\vec{v}_1, \qquad A\vec{v}_2 = 2\,\vec{v}_2 \]
In the new basis, \(\vec{v}_1\) and \(\vec{v}_2\) themselves have coordinates \((1,0)\) and \((0,1)\), so after the transformation their coordinates become \((1,0)\) and \((0,2)\).
Then what will our transformation matrix be in the new basis? Its columns are exactly those transformed basis vectors:
\[ \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
And that, my friend, is a diagonal matrix!
So what does this mean? It means that if we can rewrite our matrix in the new “eigenbasis”, then the transformation has to be a diagonal matrix!
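Here is one way to watch this happen numerically: build the new matrix column by column, exactly as described above. This is a sketch under the assumption that expressing a vector in the eigenbasis amounts to solving \(P\vec{c} = \vec{w}\) for the coordinates \(\vec{c}\):

```python
import numpy as np

A = np.array([[1, 1],
              [0, 2]])
v1 = np.array([1, 0])  # eigenvector with eigenvalue 1
v2 = np.array([1, 1])  # eigenvector with eigenvalue 2

P = np.column_stack([v1, v2])  # new basis vectors as columns

# Column i of the new matrix = A @ v_i expressed in the eigenbasis,
# i.e. the solution c of P @ c = A @ v_i.
col1 = np.linalg.solve(P, A @ v1)
col2 = np.linalg.solve(P, A @ v2)

print(np.column_stack([col1, col2]))
# [[1. 0.]
#  [0. 2.]]   <- diagonal, with the eigenvalues on the diagonal
```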
Now how do we change a matrix to a new basis? We do:
\[ A_{\text{new}} = P^{-1}AP \]
where \(P\) is the change-of-basis matrix.
For our case, we start with matrix \(A\) and end up with matrix \(D\):
\[ D = P^{-1}AP \]
What will our matrix \(P\) be? Well, matrix \(P\) is the matrix whose columns are the new basis vectors:
\[ P = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \]
In this case, the basis vectors have to be the eigenvectors of matrix \(A\). The matrix \(D\) is the diagonal matrix whose diagonal entries are the corresponding eigenvalues:
\[ D = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \]
The process \(D = P^{-1}AP\), which we can also write as \(A = PDP^{-1}\), is called diagonalization. Obviously, not every matrix is diagonalizable. To be diagonalizable, a matrix has to have enough eigenvectors to form a basis. In our example, the two eigenvectors perfectly span the space. You can also see this by looking at the expression:
\[ D = P^{-1}AP \]
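The whole process is easy to verify numerically. A minimal sketch, using the example values from this section:

```python
import numpy as np

A = np.array([[1, 1],
              [0, 2]])
P = np.array([[1, 1],
              [0, 1]])  # eigenvectors of A as columns

# D = P^{-1} A P should be the diagonal matrix of eigenvalues.
D = np.linalg.inv(P) @ A @ P
print(D)
# [[1. 0.]
#  [0. 2.]]

# And the reverse direction: A = P D P^{-1}.
print(np.allclose(A, P @ D @ np.linalg.inv(P)))  # True
```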
If the matrix \(P\) is not invertible, then we are not able to write down \(P^{-1}\) at all. And for the inverse to exist, \(P\) needs a rank equal to its dimension, which means its columns, the eigenvectors, must be linearly independent.
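To make the failure mode concrete, here is a sketch with a shear matrix, a standard example of a non-diagonalizable matrix (my choice of example, not one from the earlier sections): it has the eigenvalue \(1\) twice but only one independent eigenvector, so no \(P\) built from its eigenvectors can be invertible.

```python
import numpy as np

# A shear: eigenvalue 1 with multiplicity 2, but only one
# independent eigenvector, so the eigenvectors cannot span the plane.
B = np.array([[1, 1],
              [0, 1]])

eigenvalues, eigenvectors = np.linalg.eig(B)
print(eigenvalues)  # [1. 1.]

# The two returned eigenvector columns are (numerically) parallel,
# so the matrix of eigenvectors has rank 1 and is not invertible.
print(np.linalg.matrix_rank(eigenvectors))  # 1
```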
Now we can see why eigenvectors are so useful: they form exactly the basis we are looking for to convert a matrix into a diagonal one. And the eigenvalues are the diagonal entries of that diagonal matrix.