Linear Algebra Review
August 1, 2017
notes study review
Matrix
Dimension of matrix: number of rows x number of columns
Notation for elements of a matrix, \(A\) (or other capital letter, e.g. \(B, C\)):
\[A_{ij} \: \text{refers to the} \: i, j \: \text{entry in the} \: i^{th} \text{row,} \: j^{th} \text{column.}\]
For example, \(A_{12}\) refers to the element in row 1, column 2, and \(A_{43}\) refers to the element in row 4, column 3.
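This indexing convention can be sketched in plain Python with nested lists; the `elem` helper is hypothetical, bridging the 1-indexed math notation and Python's 0-indexed lists:

```python
# A 4 x 3 matrix stored as a list of rows.
A = [[1,  2,  3],
     [4,  5,  6],
     [7,  8,  9],
     [10, 11, 12]]

def elem(M, i, j):
    """Return M_ij using the 1-indexed math convention."""
    return M[i - 1][j - 1]

print(elem(A, 1, 2))  # A_12: row 1, column 2 -> 2
print(elem(A, 4, 3))  # A_43: row 4, column 3 -> 12
```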
Vector
A vector is an \(n\) x 1 matrix (a single column).
For example, a vector where \(n\) is 4 is referred to as a 4-dimensional vector, which can also be written as \(\text{R}^4\). [NOTE: the \(\text{R}\) is supposed to represent the real coordinate space, often written as ℝ, but I couldn’t figure out how to get it in LaTeX without the amsfonts/amssymb package].
Notation for elements of a vector, \(y\) (or other lowercase letter, e.g. \(a, b\)):
\[y_i \: \text{refers to the} \: i^{th} \: \text{element.}\]
Because no one likes to keep things straightforward, the elements of a vector can be either \(1\)-indexed or \(0\)-indexed.
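Python, for instance, is 0-indexed, so the math notation \(y_1\) corresponds to `y[0]` (a sketch with made-up values):

```python
# A 4-dimensional vector; Python lists are 0-indexed,
# while the math notation in these notes is usually 1-indexed.
y = [460, 232, 315, 178]

y_1 = y[0]  # math notation y_1 -> Python index 0
print(y_1)  # 460
```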
Matrix Addition/Subtraction
In order for matrix addition (or subtraction) to work, the matrices must have the same dimensions, e.g. a 3x3 matrix plus a 3x3 matrix. Trying to add matrices of different dimensions will generate an error.
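A minimal sketch of element-wise addition in plain Python (the `mat_add` helper is hypothetical):

```python
def mat_add(A, B):
    # Matrices must have the same dimensions.
    if len(A) != len(B) or len(A[0]) != len(B[0]):
        raise ValueError("matrices must have the same dimensions")
    # Add corresponding elements row by row.
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(A, B)]

A = [[1, 0], [2, 5], [3, 1]]
B = [[4, 2], [2, 5], [0, 1]]
print(mat_add(A, B))  # [[5, 2], [4, 10], [3, 2]]
```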
Matrix Multiplication
Scalar Multiplication/Division
Each element of the matrix is multiplied by the scalar, e.g. for a scalar \(c\) the result contains \(c \cdot a_1, c \cdot a_2, ..., c \cdot a_n\). Division functions the same way, i.e. each element of the matrix is divided by the scalar – \(a_1/c, ..., a_n/c\).
When dealing with scalars, the order – e.g. \(c \cdot A\) or \(A \cdot c\) – gives the same output.
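A quick sketch of both operations in plain Python (the `scalar_mul` and `scalar_div` helpers are hypothetical):

```python
def scalar_mul(c, A):
    # Multiply every element of A by the scalar c;
    # the order (c*A vs A*c) gives the same result.
    return [[c * x for x in row] for row in A]

def scalar_div(A, c):
    # Divide every element of A by the scalar c.
    return [[x / c for x in row] for row in A]

A = [[1, 0], [2, 5]]
print(scalar_mul(3, A))  # [[3, 0], [6, 15]]
print(scalar_div(A, 2))  # [[0.5, 0.0], [1.0, 2.5]]
```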
Matrix Multiplication
When multiplying a matrix of dimension \(m \: \text{x} \: n\) by another matrix of dimension \(n \: \text{x} \: p\), the inner dimension \(n\) must be the same value. In other words, in order to perform matrix multiplication, the number of columns (\(n\)) of the Left matrix (\(m \: \text{x} \: n\)) must equal the number of rows (\(n\)) of the Right matrix (\(n \: \text{x} \: p\)). For example, a 2 x 3 matrix times a 3 x 2 matrix gives a 2 x 2 matrix (\(m \: \text{x} \: p\)).
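The dimension rule can be sketched in plain Python with a hypothetical `mat_mul` helper:

```python
def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p); inner dimensions must match.
    n = len(A[0])
    if n != len(B):
        raise ValueError("columns of A must equal rows of B")
    # Entry (i, j) is the dot product of row i of A and column j of B.
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 3, 2],
     [4, 0, 1]]       # 2 x 3
B = [[1, 3],
     [0, 1],
     [5, 2]]          # 3 x 2
print(mat_mul(A, B))  # 2 x 2 result: [[11, 10], [9, 14]]
```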
Neat trick to apply the hypothesis function – \(h_\theta(x) = \theta_0 + \theta_1x^{(i)}\) – to all \(m\) available training examples:
- Convert the training examples to a matrix
- Convert the hypothesis to a vector
- Multiply the matrix of training examples by the vector of your hypothesis: \(\text{prediction} = \text{Data Matrix} * \text{Parameters}\)
- This provides a very computationally efficient method when making predictions based on our training data and hypothesis.
Example:
House sizes: 2104, 1416, 1534, 852
3 competing hypotheses:
- \(h_\theta(x) = -40 + 0.25 \cdot x\)
- \(h_\theta(x) = 200 + 0.1 \cdot x\)
- \(h_\theta(x) = -150 + 0.4 \cdot x\)
House sizes matrix:
\[ \begin{bmatrix} 1 & 2104 \\ 1 & 1416 \\ 1 & 1534 \\ 1 & 852 \end{bmatrix} \begin{bmatrix} -40 & 200 & -150 \\ 0.25 & 0.1 & 0.4 \end{bmatrix} = \begin{bmatrix} 486 & 410 & 692 \\ 314 & 342 & 416 \\ 344 & 353 & 464 \\ 173 & 285 & 191 \end{bmatrix} \]
# Can simply go through the training set with a few lines of code
# (assuming x holds the inputs and theta the hypothesis parameters):
for i = 1:m,
  prediction(i) = theta(1) + theta(2) * x(i);
end
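The same computation can be sketched in Python, reproducing the worked house-size example above (plain lists; the `mat_mul` helper is hypothetical):

```python
def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p) matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

sizes = [2104, 1416, 1534, 852]
X = [[1, s] for s in sizes]         # 4 x 2 data matrix with an intercept column
Theta = [[-40, 200, -150],
         [0.25, 0.1, 0.4]]          # 2 x 3: one column per competing hypothesis

predictions = mat_mul(X, Theta)     # 4 x 3 matrix of predictions
print(predictions[0])               # first house: roughly [486, 410.4, 691.6]
```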
Properties of Matrix Multiplication
- Commutative property – :thumbsdown:
- \(a \times b = b \times a\)
- Associative property – :thumbsup:
- \(a \times (b \times c) = (a \times b) \times c\)
Commutative Property :weary:
The commutative property is (generally) not upheld with multiplication of matrices. Where \(A\) and \(B\) are matrices, \(A \times B \neq B \times A\). One exception is when \(B\) is the identity matrix, \(I\) (discussed further below).
Examples:
\[ \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix} \quad = \quad \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix} \]
\[ \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix} \quad \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \quad = \quad \begin{pmatrix} 0 & 0 \\ 2 & 2 \end{pmatrix} \]
Additionally, let's assume \(A\) is a matrix of \(m \times n\) dimensions and \(B\) is a matrix of \(n \times m\) dimensions. The product \(A \times B\) will produce a matrix of \(m \times m\) dimensions, whereas \(B \times A\) will give a matrix of \(n \times n\) dimensions.
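The non-commutativity example above can be checked in plain Python (hypothetical `mat_mul` helper):

```python
def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p) matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 1], [0, 0]]
B = [[0, 0], [2, 0]]
print(mat_mul(A, B))  # [[2, 0], [0, 0]]
print(mat_mul(B, A))  # [[0, 0], [2, 2]]  -- so A*B != B*A
```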
Associative Property :ok_hand:
The associative property is upheld in matrix multiplication. In other words, for matrices \(A, B, C\), \(A \times B \times C = A \times (B \times C) = (A \times B) \times C\).
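A quick numeric check of associativity, with made-up matrices (hypothetical `mat_mul` helper):

```python
def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p) matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [0, 2]]

left  = mat_mul(mat_mul(A, B), C)  # (A*B)*C
right = mat_mul(A, mat_mul(B, C))  # A*(B*C)
print(left == right)               # True
```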
Identity Matrix :point_up:
Denoted as \(I\) or \(I_{n \times n}\).
Has the property that for any matrix, \(A\): \(A \cdot I = I \cdot A = A\)
However, it is important to note that in the above property, the dimensions of \(I\) will vary based on whether \(A \cdot I\) or \(I \cdot A\) is being computed. At first glance that seems strange, but remember our previously established principle with matrix multiplication dimensions – \(m \times n \: \cdot \: n \times p \: = \: m \times p \: \text{dimensions}\).
\(A \cdot I\)
- \(A\) has \(m \times n\) dimensions
- \(I\) then takes \(n \times n\) dimensions to satisfy “\(m \times p\)” as above

\(I \cdot A\)
- \(A\) again has \(m \times n\) dimensions
- \(I\) will have to take \(m \times m\) dimensions to maintain the “\(m \times p\)” principle of our matrix multiplication output
Examples:
\[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \]
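Verifying \(A \cdot I = I \cdot A = A\) for a non-square \(A\), with the two different identity sizes (hypothetical helpers):

```python
def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p) matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def identity(n):
    # n x n identity matrix: 1s on the diagonal, 0s elsewhere.
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

A = [[1, 2, 0],
     [3, 5, 9]]                      # 2 x 3

print(mat_mul(A, identity(3)) == A)  # True: A * I, with I taken as 3 x 3
print(mat_mul(identity(2), A) == A)  # True: I * A, with I taken as 2 x 2
```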
Inverse Matrix
The inverse of a matrix \(A\) is the matrix which, when multiplied by \(A\), gives the identity matrix, \(I\), as output.
\(A \cdot A^{-1} = A^{-1} \cdot A = I\), where \(A\) is a square matrix, i.e. has dimensions \(m \times m\), and \(A\) has an inverse, \(A^{-1}\).
Do all numbers have an inverse? No, e.g. \(0\) does not have an inverse. \(0^{-1}\) is undefined.
Calculating the inverse matrix can be done easily in R, Octave, etc. using a function such as pinv(A) (Octave), ginv(A) (R's MASS package), or other inverse calculation functions.
“Singular”/“Degenerate” matrices are those matrices which do NOT have an inverse. An example would be a matrix of all \(0\)s.
\[ \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \]
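For a 2 x 2 matrix, the inverse can be sketched directly with the \(ad - bc\) determinant formula; a determinant of zero flags a singular/degenerate matrix (the `inv2` helper is hypothetical):

```python
def inv2(M):
    # Inverse of a 2 x 2 matrix [[a, b], [c, d]] via the determinant ad - bc.
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular/degenerate (no inverse)")
    return [[d / det, -b / det],
            [-c / det, a / det]]

def mat_mul(A, B):
    # (m x n) times (n x p) -> (m x p) matrix product.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[3, 4], [2, 16]]
print(mat_mul(A, inv2(A)))  # approximately the identity [[1, 0], [0, 1]]

try:
    inv2([[0, 0], [0, 0]])  # the all-zeros matrix has no inverse
except ValueError as e:
    print(e)
```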
Transpose Matrix
The transpose matrix, denoted \(A^T\), is the matrix where the rows of your matrix, \(A\), become its columns.
More formally :point_right: let \(A\) be an \(m \times n\) matrix, and let \(B = A^T\). Then \(B\) is an \(n \times m\) matrix, and \(B_{ij} = A_{ji}\).
Example:
\[ A = \begin{pmatrix} 1 & 2 & 0 \\ 3 & 5 & 9 \end{pmatrix} \quad \quad A^T = \begin{pmatrix} 1 & 3 \\ 2 & 5 \\ 0 & 9 \end{pmatrix} \]
To verify that \(B_{ij} = A_{ji}\), you can see that \(B_{12} = A_{21}\).
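Using the example matrix above, the transpose fits in one line of Python with the built-in `zip`:

```python
A = [[1, 2, 0],
     [3, 5, 9]]

# zip(*A) walks the columns of A, turning rows into columns.
AT = [list(col) for col in zip(*A)]
print(AT)  # [[1, 3], [2, 5], [0, 9]]
```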