# Linear algebra cheat-sheet

Ok, this one is about vectors and matrices, well, not the matrix from the picture. Not yet…

It is a cheat-sheet, not a course. It is intended to be a reference for who already know it or what to refresh his mind. It can be a good test. Do you remember these concepts ?

It starts from basic high school concepts and goes through master ones. I start by my notes from the DSTI classes and I will add new concepts over time. Feel free to propose things to be added in the comments.

Unless otherwise specified :

- We are in ℝⁿ
- Indexes are positive integers: 0,…,i,j,k,…,m,n
- Scalars are in ℝ: a, b, ɑ, ꞵ, ɣ, λ, λᵢ, λᵢⱼ,
- In bold:
**x**is a vector of coordinates xᵢ with i in [1,n] - In uppercase: X is a matrix of values xᵢⱼ with i,j in [1,n],[1,m]
- I is the identity matrix of order n
- ◰ = the matrix is square.
- ⬕ or ⬔ = the matrix is triangular.
- ⧅ = the matrix is diagonal.
- ▧ = the matrix is symmetric

# Vectors

*Tip: create a vector with code*

# R

c(1, 2, 3)# Python

import numpy as np

np.array([1,2,3])# Julia

[1 2 3]

## Linear vector space

*Additivity:*

*Homogeneity of degree 1:*

*Dot product:*

With θ the angle between the vectors: x.y = |x||y| cos(θ)

Properties of the dot product:

- Additive: (
**x**+**y**)**⋅ z**=**x ⋅ z**+**y ⋅ z** - Associative: (a
**x**)**⋅ y**= a (**x ⋅ y**) - Commutative:
**x ⋅ y**=**y ⋅ x**

*Tip: compute the dot product*

# R

v %*% w# Python

np.dot(v,w)# Julia

v * w

## Orthogonal vectors

## Euclidean norm of a vector

This is the “length” of the vector. We can calculate it as the square root of its dot product by itself. A norm is strictly positive when **x** ≠ 0 and zero else.

*Tip: compute the norm*

# R

norm(v)# Python

np.linalg.norm(v)# Julia

norm(v)

## Unit vector

Same direction, norm equal 1:

## Relations between the dot product and matrices

With Xᵀ the transpose of the one row matrix that represent the vector **x**. and Y the row matrix representing the vector **y**.

We can move a matrix at the other side:

Since, using: **x **. **y** = Xᵀ Y and (AB)ᵀ = Aᵀ Bᵀ:

(A**x**) **⋅ y** = (AX)ᵀ Y =(Xᵀ Aᵀ) Y =Xᵀ (Aᵀ Y) = **x** **⋅ (**Aᵀ **y**)

## Derivative of a dot product

if **x(t)** and **y(t)** are derivable functions of t:

## Linear independence

Vectors **x**₁ … **x**ₙ are linearly independent if the only scalars αᵢ that satisfy the above equation are 0.

α₁ **x**₁ + … + αₙ **x**ₙ = 0 ⇒ α₁ = . . . = αₙ = 0

*We will see later that an easy method to verify programatically that a set of vectors is linearly independent is to concat the vectors as a matrix and check if the determinant is equal to 0.*

# Matrix types

## General form

Indexes n and m being integers:

*Tip: the **Latex** formula to generate this figure is:*

`\begin{bmatrix}`

m_{1,1} & m_{1,2} & \cdots & m_{1,n} \\

m_{2,1} & m_{2,2} & \cdots & m_{2,n} \\

\vdots & \vdots & \ddots & \vdots \\

m_{m,1} & m_{m,2} & \cdots & m_{m,n}

\end{bmatrix}

*Tip: create a matrix with code*

# R

matrix(c(1,2,3,4,5,6), 3, 2)

dim(c(1,2,3,4,5,6)) <- c(3,2)# Python

import numpy as np

np.array([[1,4],[2,5],[3,6]])# Julia

[1 2 3; 4 5 6] # by columns

[1 4; 2 5; 3 6] # by rows

## Triangular matrix ◰⬔

The upper or the lower part after the diagonal is filled by zeros.

## Diagonal matrix ◰⧅

The upper and the lower part after the diagonal are filled by zeros. Diagonal matrices tends to make calculus way easier.

Diagonal matrices have interesting properties:

## Identity matrix ◰⧅

This is a diagonal matrix of ones.

## Symmetric matrix ◰▧

The upper and the lower part aside the diagonal are identical.

## Anti-symmetric matrix ◰

Same, with the signs inverted on each side.

## Idempotent and periodic matrices ◰

A periodic matrix has an integer k so that:

If k = 1, the matrix is said to be idempotent:

More special matrices on MathWorld.

# Matrices of derivatives

**The gradient**

The gradient (symbol “nabla”: ∇) is the vector of the first order derivatives of a multivariable function. The gradient is pretty useful, since it points in the direction of the highest increase.

*Tip: compute the gradient on one defined vector/point ***v*** of a function f*

# R

gradient(f, v)# Python

import numdifftools as nd

nd.Gradient(f)(v)# Julia

import ForwardDiff: gradient

gradient(f, v)

## The Hessian matrix

The Hessian matrix (symbole H or ∇²) is the matrix of the second order derivatives of a multivariable function.

What can also be noted:

*Tip: compute the Hessian matrix on one defined vector/point ***v*** of a function f*

# R

hessian(f, v)# Python

import numdifftools as nd

nd.Hessian(f)(v)# Julia

import ForwardDiff: hessian

hessian(f, v)

## The Jacobian matrix

If **f** is a vector of functions fⱼ(x₁,…,xₙ) with j in [1,…,m], The Jacobian matrix is the matrix of all 1st-order partial derivatives.

*Tip: compute the J*acobian *matrix on one defined matrix/point M of a set of multivariable functions F*

# R

jacobian(f, v)# Python

import numdifftools as nd

nd.Jacobian(f)(v)# Julia

import ForwardDiff: jacobian

jacobian(F, M)

# Matrix algebra

## Addition

Matrix addition can be done simply by adding each values at the same i,j index. The matrix have to have the same shape. This is also valid for the difference.

What implies that the following properties are valid:

## Multiplication by a scalar

In the same way, just multiply each value by the scalar λ.

## Matrices multiplication

The matrix can be multiplied if the shapes can chain:

For AB to be possible we need shape A = l,** m** & shape B =

**,n**

*m*The formula may sound a bit weird:

We actually make the dot product in each column and rows in regard.

A nice illustration from Wikimedia Commons:

The following properties are valid:

Beware, the multiplication is not commutative (it may not even be possible unless the matrices are square).

*Tip: compute the dot product*

# R

A %*% B# Python

np.dot(A,B)# Julia

A * B

## Matrix transposition

Mᵀ is the transpose of M such as row become columns and columns become rows.

The following properties are valid:

*Tip: compute the transpose*

# R

t(A)# Python

A.T# Julia

A'

**Property of the transpose and the dot product:**

# Trace of a matrix ◰

The trace is simply the sum of the values on the diagonal.

With the properties:

*Tip: compute the trace*

# R

sum(diag(A))# Python

np.linalg.trace(A)# Julia

trace(A)

# Determinant of a matrix ◰

## Matrix of order 2

## Matrix of order 3

We take a row or column (here the first row) and compute the determinants deleting the row and column of each value, then multiply by the value. We apply the sign of the matrix of signs.

Here is the 3x3 matrix of signs.

## Matrix of order n

At order n, we can use the same method, by calculating the cofactor deleting a row & column in the same way. The result of each operation is called a minor determinant Mᵢⱼ. When the sign is added, it is called a cofactor Cᵢⱼ. The process has to be repeated until we reach rank 2 Mᵢⱼ determinants that can be calculated. Since the method is extremely costly at high orders, numerical methods are usually prefered.

## Graphical interpretation of the determinant

For a 2x2 matrix, the determinant is the change in the area of the square made by the 2 basis’ vectors, when the transformation is applied. Similarly, for 3x3 matrices, it the change in the volume of the initial cube drawn by the 3 basis’s vectors.

## Special values of the determinant

- When det A = 0, we have a singular matrix. Its rank is < n. It has no inverse. It represents a linear system when there is an infinity of solutions.
- When det A ≠ 0, the matrix has an inverse, its columns/rows are linearly independants, its rank is n and its range is ℝⁿ.
- When det A = ±1, we have an unimodular matrix.

## Properties of determinants

**Calculus rules**

*Beware, the determinant is not linear.*

- Distributive: det AB = det A . det B
- n-Homogeneous: det aA = aⁿ det A
*As a consequence*: det -A = (-1)ⁿ det A**Not****additive**yet: det A+B > det A + det B

**Determinants after operations**

- det Aᵀ = det A
- det A⁻¹ = 1 / det A
- det PAP⁻¹ = det A
- det (PAP⁻¹ - λI) = det (A - λI)
- det(I + εA) = 1 + ε MatrixTrace(A) + O(ε²) if ε is small
- det Aᶜ = (det A)ᶜ
*with ᶜ the complex conjugate*

**Determinants of special matrices**

- det I = 1
- If A triangular, det A = ∏ aᵢᵢ
- If A has a column or a row of zeros, det A = 0

*Tip: code*

# R

det(A)# Python

np.linalg.det(A)# Julia

det(A)

# Inverse of a Matrix ◰

## 2x2 matrix

## 3x3 matrix

C is the matrix of cofactors. It’s transpose is also called the adjoint matrix. For more details.

## Properties of inverses

## Solving a system of equations with A⁻¹

A system of equation can be written in a matrix form:

In short:

Then, we can solve it using the inverse matrix:

This is more costly than solving the system once, so this is mostly useful when we need to solve the same system several time.

*Tip: code*

# R

library(matlib)

inv(A)# Python

numpy.linalg.inv(A)# Julia

inv(A)

# Matrix diagonalisation ▧ ⇨ ⧅

Diagonal matrices are way easier to work with, since many operations are made possible if the matrix is diagonal:

Eigenvalues and eigenvectors are used for many purposes, including matrix diagonalisation. There, eigenvectors form an orthonormal basis that is the new basis after the transformation.

## General process

Eigenvectors are vectors **q** so that there is eigenvalues λ that are solutions of:

Since this is equivalent to (A-λI)**q**=0 and q is not zero, the determinant of A-λI is 0. We can determine possible values of eigenvalues λ by solving the **characteristic equation**:

Then we come back to the original equation and we solve the system for each possible value of λ and take non-zero solutions. This gives us the eigenvectors. Note that there is an infinity of solutions. The norm and sign of each **q**ᵢ could be chosen freely. Another interesting property is that eigenvectors are normal to each other. So, now that we know the eigenvalues, lets solve:

By the end, we can normalize the eigenvectors **q** with **p** = **q/**|**q**|, to create an orthonormal basis and put them together in a matrix P as columns:

Since P is orthogonal, we have:

Also, we can put all the eigenvalues in a diagonal matrix:

The matrix P can be used to change the coordinates and get into a basis were the original matrix A is equivalent to the diagonal matrix Λ** **:

The matrix P apply a change of coordinates to transform A into a diagonal matrix. This is actually a simple rotation.

We can use this to solve linear systems. We can define a new variable Λ**y** that will work in this new basis.

It also appear as a kind of principal components analysis. We rotate the system and then we can keep only the high eigenvalues. The others being considered as near-zero, so basically representing noise.

## Calculus for order 2

What can be written:

This solves as:

*To ease the process in some “nice” situations, it is worth mentioning that:*

We can now substitute the eigenvalues in the equation to find the eigenvectors.

What mean:

We repeat for q₂, normalize, and we have the P = [p₁ p₂] matrix.

## Tip for order 3

At order 3 the zero of the characteristic equation has the form:

## Properties of eigenvalues and eigenvectors

- The vectors
**p**ᵢ of P are linearly independents. - A is singular ⇔ rank A < n ⇔ ∃i such as λᵢ = 0
- A is not singular ⇔ rank A =n ⇔ λᵢ ≠ 0 ∀i
- λ is an eigenvalue of A ⇒ ɑλ is an eigenvalue of ɑA
- λ is an eigenvalue of A ⇒ λᵏ is an eigenvalue of Aᵏ with k >0
- λ is an eigenvalue of A ⇒ λ⁻¹ is an eigenvalue of A⁻¹
- λ is an eigenvalue of A ⇒ λ is an eigenvalue of Aᵀ
- A is triangular ⇒ λᵢ = Aᵢᵢ

*Tip: computing eigenvalues & eigenvectors*

# R

eigen(A)# Python

import numpy as np

np.linalg.eig(A)# Julia

eigvals(A)

eigvecs(A)

# Quadratic forms

Quadratic form are degree 2 polynomial functions of n variables.

This can be written as a matrix multiplication, with A the matrix of values aᵢⱼ and **x** the vector of values xᵢ:

Since xᵢxⱼ = xⱼxᵢ, where i≠j, quadratic forms are usually noted ꞵᵢⱼxᵢxⱼ with ꞵᵢⱼ = (aᵢⱼ +aⱼᵢ). So that we don’t know the value of aᵢⱼ and aⱼᵢ when i ≠ j, to build the matrix. Actually, we can take what we want as long as the sum is ꞵᵢⱼ. For ease of calculus, we create this matrix so that it is symmetric, putting (aᵢⱼ +aⱼᵢ)/2 on each side of the symmetry axis.

This matrix can be diagonalised with the method of the above section.

## Types of quadratic forms

If λᵢ are the eigenvalues λᵢ of A:

**x**ᵀA**x**> 0 ∀x≠0 ⇒ A positive definite ⇔ all λᵢ > 0**x**ᵀA**x**≥ 0 ∀x≠0 ⇒ A positive semi-definite ⇔ all λᵢ ≥ 0**x**ᵀA**x**< 0 ∀x≠0 ⇒ A negative definite ⇔ all λᵢ < 0**x**ᵀA**x**≤ 0 ∀x≠0 ⇒ A negative semi-definite ⇔ all λᵢ ≤ 0- If
**x**ᵀA**x**is positive & negative depending on x, A is indefinite. What mean that there is both positives and negatives eigenvalues. - If A is definite, A⁻¹ exists, is definite and has the same sign.

**Nice ressources on the internet**

**Algebra basics**on Math is Fun.**Essence of linear algebra**. An excellent serie of videos showing graphical interpretations of linear algebra.**Linear Algebra cheat-sheets**on MathWorld wiki.**List of linear algebra topics**on Wikipedia.