linear-algebra
Basic Idea
Basic Operators
Vector
Inner product
\[\begin{align} \vec{x} \cdot \vec{y} = \vec{x}^{\intercal} \vec{y} = \sum_{i=1}^{d} x_{i}y_{i} & & \vec{x}, \vec{y} \in \mathbb{R}^{d} \end{align}\]
Outer product
\[\begin{align} \vec{x} \otimes \vec{y} = \vec{x} \vec{y}^{\intercal} = \begin{bmatrix} x_{1}y_{1} & \cdots & x_{1}y_{d} \\ \vdots & \ddots & \vdots \\ x_{d}y_{1} & \cdots & x_{d}y_{d} \end{bmatrix} & & \vec{x}, \vec{y} \in \mathbb{R}^{d} \end{align}\]
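As a quick numerical sanity check, a minimal sketch using numpy (the library and the specific vectors are assumptions of the example, not part of the definitions):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])

# Inner product: a scalar, the sum of element-wise products.
inner = x @ y                             # equivalent to np.dot(x, y)
assert np.isclose(inner, np.sum(x * y))   # 32.0

# Outer product: a d x d matrix with entries x_i * y_j.
outer = np.outer(x, y)
assert outer.shape == (3, 3)
assert outer[0, 2] == x[0] * y[2]
```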
\(L_p\) Norm
\[\begin{align} \lVert \vec{x} \rVert_p = \left( \sum_{i=1}^{d} \lvert x_i \rvert ^p \right)^{\frac{1}{p}} & & \vec{x} \in \mathbb{R}^d \end{align}\]
For \(p = 1\), it reduces to the Manhattan norm
\[\lVert \vec{x} \rVert_1 = \sum_{i=1}^{d} \lvert x_i \rvert\]
For \(p = 2\), it reduces to the Euclidean norm
\[\lVert \vec{x} \rVert = \sqrt{ \sum_{i=1}^{d} x_i^2}\]
The Euclidean norm can also be expressed through the inner product
\[\lVert \vec{x} \rVert^2 = \vec{x} \cdot \vec{x}\]
For \(p = 0\), it counts the number of non-zero entries: \(\lVert \vec{x} \rVert_0 = |\{i \mid x_i \ne 0\}|\). Be careful: it is not a true norm!
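The three cases above can be checked with numpy's norm routines; a minimal sketch with an arbitrary example vector:

```python
import numpy as np

x = np.array([3.0, -4.0, 0.0])

l1 = np.linalg.norm(x, ord=1)    # Manhattan norm: 7.0
l2 = np.linalg.norm(x)           # Euclidean norm (default ord=2): 5.0
l0 = np.count_nonzero(x)         # "L0 norm": 2 non-zero entries (not a true norm)

# Euclidean norm via the inner product: ||x||^2 = x . x
assert np.isclose(l2 ** 2, x @ x)
```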
Orthogonal vector
\[\begin{cases} \vec{x} \cdot \vec{y} = 0 \\ \cos(\vec{x}, \vec{y}) = 0 \end{cases} \Rightarrow \vec{x}, \vec{y} \text{ are a pair of orthogonal vectors}\]
Orthonormal vector
\[\begin{cases} \vec{x} \cdot \vec{y} = 0 \\ \cos(\vec{x}, \vec{y}) = 0 \\ \lVert \vec{x} \rVert = \lVert \vec{y} \rVert = 1 \end{cases} \Rightarrow \vec{x}, \vec{y} \text{ are a pair of orthonormal vectors}\]
Basic inequalities
Cauchy-Schwarz inequality
\[\lvert \vec{x} \cdot \vec{y} \rvert \le \lVert \vec{x} \rVert \lVert \vec{y} \rVert\]
Triangle Inequality
\[\lVert \vec{x} + \vec{y} \rVert \le \lVert \vec{x} \rVert + \lVert \vec{y} \rVert\]
Angle
\[\cos(\vec{x}, \vec{y}) = \frac{\vec{x} \cdot \vec{y}}{\lVert \vec{x} \rVert \lVert \vec{y} \rVert}\]
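The angle formula ties the previous definitions together: orthogonality, Cauchy-Schwarz, and the triangle inequality can all be verified numerically. A minimal sketch with arbitrary example vectors:

```python
import numpy as np

x = np.array([1.0, 0.0, 2.0])
y = np.array([0.0, 3.0, 0.0])

# Cosine of the angle between x and y.
cos_xy = (x @ y) / (np.linalg.norm(x) * np.linalg.norm(y))

# x . y == 0, so x and y are orthogonal and the cosine is 0.
assert x @ y == 0 and cos_xy == 0

# Cauchy-Schwarz: |x . y| <= ||x|| ||y||
assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y)

# Triangle inequality: ||x + y|| <= ||x|| + ||y||
assert np.linalg.norm(x + y) <= np.linalg.norm(x) + np.linalg.norm(y)
```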
Matrix
Matrix product
Numerical view
\[\begin{align} [AB]_{ij} = \sum_{l = 1}^{k} a_{il}b_{lj} & & A \in \mathbb{R}^{n \times k}, B \in \mathbb{R}^{k \times d} \end{align}\]
Row vector view
\[\begin{align} [AB]_{i\cdot} = \sum_{l=1}^{k} a_{il} B_{l \cdot} & & A \in \mathbb{R}^{n \times k}, B \in \mathbb{R}^{k \times d} \end{align}\]
This means each row \(i\) of the product is a linear combination of the rows of \(B\), with weights given by \(A_{i \cdot}\)
Column vector view
\[\begin{align} [AB]_{\cdot j} = \sum_{l=1}^{k} A_{\cdot l} b_{lj} & & A \in \mathbb{R}^{n \times k}, B \in \mathbb{R}^{k \times d} \end{align}\]
This means each column \(j\) of the product is a linear combination of the columns of \(A\), with weights given by \(B_{\cdot j}\)
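Both views say the same thing as the numerical view, just grouped differently; a minimal numpy sketch (random matrices, fixed seed) that checks one row and one column:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))   # n x k
B = rng.standard_normal((3, 5))   # k x d
AB = A @ B

# Row view: row i of AB is a combination of the rows of B, weighted by A[i, :].
i = 2
assert np.allclose(AB[i, :], sum(A[i, l] * B[l, :] for l in range(3)))

# Column view: column j of AB is a combination of the columns of A, weighted by B[:, j].
j = 1
assert np.allclose(AB[:, j], sum(A[:, l] * B[l, j] for l in range(3)))
```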
Frobenius norm
\[\lVert A \rVert_F = \lVert A^\intercal \rVert_F = \sqrt{ \sum_{i = 1}^{n} \sum_{j=1}^{d} a_{ij}^2 }\]\[\lVert \vec{x} \otimes \vec{y} \rVert_F = \lVert \vec{x} \rVert \lVert \vec{y} \rVert\]
It is a special case of the entry-wise matrix norms, and also a special case of the \(L_{p, q}\) norm (with \(p = q = 2\)).
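A minimal sketch of both identities with numpy (random data, fixed seed; numpy's `'fro'` order is the Frobenius norm):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# Frobenius norm: square root of the sum of squared entries; invariant under transpose.
fro = np.linalg.norm(A, ord='fro')
assert np.isclose(fro, np.sqrt(np.sum(A ** 2)))
assert np.isclose(fro, np.linalg.norm(A.T, ord='fro'))

# ||x (outer) y||_F = ||x|| ||y||
x, y = rng.standard_normal(4), rng.standard_normal(3)
assert np.isclose(np.linalg.norm(np.outer(x, y), ord='fro'),
                  np.linalg.norm(x) * np.linalg.norm(y))
```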
Trace
\[\begin{align} tr(A) = \sum_{i=1}^{n} a_{ii} & & A \in \mathbb{R}^{n \times n} \end{align}\]
Trace identities
\[\lVert A \rVert_{F}^{2} = \text{Energy}(A) = tr(AA^\intercal) = tr(A^\intercal A)\] \[\begin{align} tr(CD^\intercal) = tr(DC^\intercal) = \sum_{i=1}^n \sum_{j=1}^{d} c_{ij}d_{ij} & & C, D \in \mathbb{R}^{n \times d} \\ tr(AB) = tr(BA) = \sum_{i=1}^n \sum_{j=1}^{d} a_{ij}b_{ji} & & A \in \mathbb{R}^{n \times d}, B \in \mathbb{R}^{d \times n} \end{align}\]
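A minimal numpy sketch of the identities above (random matrices with matching shapes, fixed seed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
B = rng.standard_normal((3, 4))
C, D = rng.standard_normal((4, 3)), rng.standard_normal((4, 3))

# ||A||_F^2 = tr(A A^T) = tr(A^T A)
assert np.isclose(np.linalg.norm(A, ord='fro') ** 2, np.trace(A @ A.T))
assert np.isclose(np.trace(A @ A.T), np.trace(A.T @ A))

# Cyclic property: tr(AB) = tr(BA), even though AB is 4x4 and BA is 3x3.
assert np.isclose(np.trace(A @ B), np.trace(B @ A))

# tr(C D^T) = sum of element-wise products c_ij * d_ij
assert np.isclose(np.trace(C @ D.T), np.sum(C * D))
```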
Rank
- TODO()
Matrix Decomposition
Decompose a complex matrix into a product of several simpler matrices.
Elementary matrix operations
- Interchange
- Addition
- Scaling
Using the row and column views of the matrix product, each of these operations can be carried out by multiplying with an elementary matrix: left-multiplication acts on the rows, right-multiplication on the columns.
Examples (row operations), with a numerical check after the table:
| Interchange | Addition | Scaling |
| --- | --- | --- |
| \(\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}\) | \(\begin{bmatrix} 1 & c & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\) | \(\begin{bmatrix} 1 & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & 1 \end{bmatrix}\) |
| Interchange rows 1, 2 | Add \(c \times\) (row 2) to row 1 | Multiply row 2 by \(c\) |
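A minimal numpy sketch of the three row operations, each applied by left-multiplying the corresponding elementary matrix (the matrix `A` and the scalar `c` are arbitrary examples):

```python
import numpy as np

A = np.arange(9, dtype=float).reshape(3, 3)
c = 2.0

# Interchange rows 1 and 2 (0-indexed rows 0 and 1).
P = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 1.]])
assert np.allclose(P @ A, A[[1, 0, 2], :])

# Add c * (row 2) to row 1.
E = np.array([[1., c, 0.],
              [0., 1., 0.],
              [0., 0., 1.]])
expected = A.copy()
expected[0, :] += c * A[1, :]
assert np.allclose(E @ A, expected)

# Multiply row 2 by c.
S = np.diag([1., c, 1.])
expected = A.copy()
expected[1, :] *= c
assert np.allclose(S @ A, expected)
```

Right-multiplication by the same matrices acts on columns instead (e.g. \(AP\) swaps columns 1 and 2).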
Geometrically decomposable operators
These are designed to transform a column vector \(\vec{x}\):
- Rotation
- Reflection
- Scaling
Examples (2-dimensional), with a numerical check after the table:
| Rotation | Reflection | Scaling |
| --- | --- | --- |
| \(\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}\) | \(\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\) | \(\begin{bmatrix} c_1 & 0 \\ 0 & c_2 \end{bmatrix}\) |
| Rotate counter-clockwise by \(\theta\) | Reflect across the X-axis | Scale \(x\) and \(y\) by factors \(c_1\) and \(c_2\) |
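A minimal numpy sketch applying each operator to a 2-d vector (\(\theta\), \(c_1\), \(c_2\) are arbitrary example values):

```python
import numpy as np

theta = np.pi / 2   # rotate by 90 degrees

# Rotation: counter-clockwise by theta.
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
assert np.allclose(R @ np.array([1., 0.]), [0., 1.])   # x-axis maps onto y-axis

# Reflection across the X-axis.
F = np.array([[1., 0.],
              [0., -1.]])
assert np.allclose(F @ np.array([1., 2.]), [1., -2.])

# Scaling: x by c1, y by c2.
c1, c2 = 2.0, 3.0
S = np.diag([c1, c2])
assert np.allclose(S @ np.array([1., 1.]), [c1, c2])
```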
Matrix factorization (Optional)
Linear Transformation
Linear Transform
A vector-to-vector function \(f(\vec{x})\) defines a linear transform of \(\vec{x}\) if it satisfies:
\[\begin{align} f(c \vec{x}) & = c \cdot f(\vec{x}) \\ f(\vec{x} + \vec{y})& = f(\vec{x}) + f(\vec{y}) \end{align}\]
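Any map of the form \(f(\vec{x}) = A\vec{x}\) satisfies both properties, while a pure translation does not; a minimal numpy sketch (random data, fixed seed):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
f = lambda v: A @ v      # a linear transform

x, y = rng.standard_normal(4), rng.standard_normal(4)
c = 2.5

assert np.allclose(f(c * x), c * f(x))        # homogeneity
assert np.allclose(f(x + y), f(x) + f(y))     # additivity

# A translation g(x) = x + b is NOT linear (unless b = 0).
b = np.ones(4)
g = lambda v: v + b
assert not np.allclose(g(2 * x), 2 * g(x))
```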
Translation
\[f(\vec{x}) = \vec{x} + \vec{b}\]
Affine Transform
An affine transform is the composition of a linear transform and a translation.
A vector-to-vector function \(f(\vec{x})\) defines an affine transform of \(\vec{x}\) if, for any scalar \(\lambda\), it satisfies:
\[f(\lambda \vec{x} + (1-\lambda)\vec{y}) = \lambda f(\vec{x}) + (1-\lambda)f(\vec{y})\]
All linear transforms are special cases of affine transforms, but not vice versa.
Any linear mapping \(f(\vec{x})\) from \(d\)-dimensional vectors to \(n\)-dimensional vectors can be represented as the matrix-vector product \(A\vec{x}\) for some matrix \(A \in \mathbb{R}^{n \times d}\).
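A minimal numpy sketch of this representation: the columns of \(A\) are the images of the standard basis vectors, \(A_{\cdot j} = f(\vec{e}_j)\). The function `f` here is a hypothetical example built from a fixed matrix:

```python
import numpy as np

M = np.array([[1., 2., 0.],
              [0., 1., 3.]])
f = lambda v: M @ v       # some linear map from R^3 to R^2

# Recover the representing matrix column by column: column j is f(e_j).
A = np.column_stack([f(e) for e in np.eye(3)])
assert np.allclose(A, M)

# Then f(x) == A x for every x.
x = np.array([4., 5., 6.])
assert np.allclose(f(x), A @ x)
```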