# Numerical Linear Algebra

## 1 Fundamental Linear Algebra

### 1.1 Vector Spaces

**Definition 1.1.1 **  A **Vector Space** is a set $V$ equipped with addition and scalar multiplication with the following properties:
1. *Commutativity*: $\mathbf{v}+\mathbf{w}=\mathbf{w}+\mathbf{v}$, $\forall\ \mathbf{v},\mathbf{w}\in V$
2. *Associativity*: $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u}+(\mathbf{v}+\mathbf{w})$, $\forall\ \mathbf{u},\mathbf{v},\mathbf{w}\in V$
3. *Zero Vector*: $\exists\ \mathbf{0}\in V$ such that $\mathbf{v}+\mathbf{0}=\mathbf{v}$, $\forall\ \mathbf{v}\in V$
4. *Additive inverse*: For every $\mathbf{v}\in V$, there exists $-\mathbf{v}\in V$ such that $\mathbf{v}+(-\mathbf{v})=\mathbf{0}$
5. *Multiplicative Identity*: $1\mathbf{v} = \mathbf{v}$ for all $\mathbf{v}\in V$
6. *Multiplicative Associativity*: $(\alpha\beta)\mathbf{v} = \alpha(\beta\mathbf{v})$ for all $\mathbf{v}\in V$ and scalars $\alpha, \beta$
7. *Additive Closure*: $\mathbf{v}+\mathbf{w}\in V$ for all $\mathbf{v},\mathbf{w}\in V$
8. *Multiplicative Closure*: $\alpha\mathbf{v}\in V$ for all $\mathbf{v}\in V$ and scalars $\alpha$
9. $\alpha(\mathbf{u}+\mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}$ for all $\mathbf{u},\mathbf{v}\in V$ and scalars $\alpha$
10. $(\alpha + \beta)\mathbf{v}=\alpha\mathbf{v}+\beta\mathbf{v}$ for all $\mathbf{v}\in V$ and scalars $\alpha, \beta$

**Proposition 1.1.2 **  $\mathbb{C}^n$ is a vector space with scalars in $\mathbb{C}$.

**Definition 1.1.3 **  A **subspace** of a vector space $V$ is a subset $S\subset V$ that has additive closure and multiplicative closure.

**Definition 1.1.4 **  A **linear combination** of a subset of vectors $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_n\}\subset V$ for scalars $\{a_1,a_2,\dots,a_n\}$ is the sum \[a_1\mathbf{v}_1+a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \sum_{i=1}^na_i\mathbf{v}_i\]

**Definition 1.1.5 **  A **linearly independent** set of vectors $\{\mathbf{v}_i\}$ is a set of vectors such that only the trivial linear combination produces the zero vector $\mathbf{0}$.

**Definition 1.1.6 **  A **linearly dependent** set of vectors $\{\mathbf{v}_i\}$ is a set of vectors such that some non-trivial linear combination produces the zero vector $\mathbf{0}$.

**Definition 1.1.7 **  The **span** of a set of vectors is the set of all possible linear combinations of those vectors.

**Definition 1.1.8 **  A **basis** of a vector space $V$ is a linearly independent set of vectors whose span is $V$.

**Definition 1.1.9 **  The **dimension**, denoted $\text{Dim}(V)$, of a vector space $V$ is the number of vectors in any basis of $V$.

**Definition 1.1.10 **  The **inner product**, denoted $\mathbf{v}\cdot\mathbf{w}\in\mathbb{C}$, of two vectors $\mathbf{v},\mathbf{w}\in V$ is the sum of the products of their elements. \[\mathbf{v}\cdot\mathbf{w} = \langle \mathbf{v},\mathbf{w} \rangle = \mathbf{v}^*\mathbf{w} = \sum_i{\bar{v}_iw_i} \in \mathbb{C}\]

**Definition 1.1.11 **  The **outer product**, denoted $\mathbf{v}\mathbf{w}^*\in\mathbb{C}^{n\times n}$, of two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^{n}$ is the matrix of products of their elements. \[\mathbf{v}\mathbf{w}^* = (v_i\bar{w}_j) \in \mathbb{C}^{n\times n}\]

**Definition 1.1.12 **  The **elementary vectors**, denoted $\mathbf{e}_i\in\mathbb{C}^n$, are the vectors whose $i$-th element is $1$ and whose other elements are all zero.

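A quick numerical check of Definitions 1.1.10–1.1.12 can be done in NumPy; the vectors `v` and `w` below are arbitrary illustrative choices, not part of the notes.

```python
import numpy as np

v = np.array([1 + 2j, 3 - 1j, 0.5j])
w = np.array([2 - 1j, 1 + 1j, 4 + 0j])

# Inner product <v, w> = sum_i conj(v_i) w_i; np.vdot conjugates its first argument.
assert np.isclose(np.vdot(v, w), np.sum(np.conj(v) * w))

# Outer product v w^* has entries v_i conj(w_j).
outer = np.outer(v, np.conj(w))
assert outer.shape == (3, 3) and np.isclose(outer[0, 1], v[0] * np.conj(w[1]))

# Elementary vector e_2 in C^3: <e_2, v> picks out the second element of v.
e2 = np.zeros(3, dtype=complex)
e2[1] = 1.0
assert np.isclose(np.vdot(e2, v), v[1])
```
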
### 1.2 Matrices

**Definition 1.2.1 **  A **linear transformation** is a function $T:V\to W$ for vector spaces $V,W$ such that
1. $T(\mathbf{v}+\mathbf{w}) = T(\mathbf{v})+ T(\mathbf{w})$ for all $\mathbf{v},\mathbf{w}\in V$
2. $T(\alpha\mathbf{v}) = \alpha T(\mathbf{v})$ for all $\mathbf{v}\in V$ and scalars $\alpha$

**Definition 1.2.2 **  A **matrix** is a grid of entries, written as a rectangular array, as a list of column vectors, or as a list of row vectors.

**Definition 1.2.3 **  **Matrix-vector multiplication**, denoted $M\mathbf{v}\in V_m$, for a matrix $M=(\mathbf{m}_1,\mathbf{m}_2,\dots,\mathbf{m}_n)\in M_{m\times n}$ and a vector $\mathbf{v}\in V_n$ is \[M\mathbf{v} = \sum_{i=1}^{n}{v_i \mathbf{m}_i}\quad\text{where } \mathbf{m}_i \text{ are the column vectors of }M\]

**Definition 1.2.4 **  **Matrix multiplication**, denoted $AB\in M_{m\times n}$, for two matrices $A \in M_{m\times k}$ and $B\in M_{k\times n}$ is the matrix whose elements are the dot products of the row vectors of $A$ with the column vectors of $B$. Let $\mathbf{a}_i$ be the row vectors of $A$ and $\mathbf{b}_j$ be the column vectors of $B$; then the elements of $AB$ are \[(AB)_{i,j} = \mathbf{a}_i\cdot\mathbf{b}_j = \sum_{l=1}^{k}a_{i,l}b_{l,j}\]

**Proposition 1.2.5 **  Any matrix $M\in \mathbb{C}^{m\times n}$ defines a linear transformation $M:\mathbb{C}^n\to\mathbb{C}^m$ given by matrix-vector multiplication.

**Proposition 1.2.6 **  Matrix multiplication is the same as function composition of the corresponding linear transformations.

**Definition 1.2.7 **  The **conjugate**, denoted $\overline{M}$, of a matrix $M=(m_{i,j})\in\mathbb{C}^{m\times n}$ is the matrix whose elements are the complex conjugates of the elements of $M$. \[\overline{M} = (\bar{m}_{i,j})\]

**Definition 1.2.8 **  The **transpose**, denoted $M^T$, of a matrix $M=(m_{i,j})\in\mathbb{C}^{m\times n}$ is the matrix with the rows and columns of $M$ swapped. \[M^T = (m_{j,i})\]

**Definition 1.2.9 **  The **adjoint**, denoted $M^*$, of a matrix $M=(m_{i,j})\in\mathbb{C}^{m\times n}$ is the conjugate transpose of $M$. \[M^* = \overline{M}^T = (\bar{m}_{j,i})\]

**Definition 1.2.10 **  The **inner product**, denoted $\mathbf{v}\cdot\mathbf{w}\in\mathbb{C}$, of two vectors $\mathbf{v},\mathbf{w}\in V$ is the sum of the products of their elements. \[\mathbf{v}\cdot\mathbf{w} = \langle \mathbf{v},\mathbf{w} \rangle = \mathbf{v}^*\mathbf{w} = \sum_i{\bar{v}_iw_i} \in \mathbb{C}\]

**Definition 1.2.11 **  The **outer product**, denoted $\mathbf{v}\mathbf{w}^*\in\mathbb{C}^{n\times n}$, of two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^{n}$ is the matrix of products of their elements. \[\mathbf{v}\mathbf{w}^* = (v_i\bar{w}_j) \in \mathbb{C}^{n\times n}\]

**Definition 1.2.12 **  The **elementary vectors**, denoted $\mathbf{e}_i\in\mathbb{C}^n$, are the vectors whose $i$-th element is $1$ and whose other elements are all zero.

**Definition 1.2.13 **  The **identity matrix** is the matrix $I\in\mathbb{C}^{n\times n}$ such that $\mathbf{v} = I\mathbf{v}$ for all $\mathbf{v}\in \mathbb{C}^n$. \[I = (\mathbf{e}_1,\mathbf{e}_2,\dots,\mathbf{e}_n)\]

**Definition 1.2.14 **  The **Kronecker delta** is $\delta_{ij} = \begin{cases} 1 & \text{if } i = j\\ 0 & \text{if } i\neq j \end{cases}$.

**Definition 1.2.15 **  The **kernel** or **null space**, denoted $\text{Ker}(M)$, of a matrix $M$ is the set of vectors $\{ \mathbf{v}\in V : M\mathbf{v} = \mathbf{0} \}$ that map to zero.

**Definition 1.2.16 **  The **nullity**, denoted $\text{Nullity}(M)$, of a matrix $M$ is the dimension of the kernel.

**Definition 1.2.17 **  The **range**, denoted $\text{Ran}(M)$, of a matrix $M$ is the set of vectors $\{ \mathbf{w}\in W : \exists\, \mathbf{v}\in V\text{ s.t. }M\mathbf{v}=\mathbf{w} \}$, i.e. the span of the columns of $M$.

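As a sanity check of Definitions 1.2.3 and 1.2.4, here is a small NumPy sketch; the matrices `M` and `B` and the vector `v` are arbitrary examples chosen for illustration.

```python
import numpy as np

M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0]])        # M has columns m_1, m_2, m_3
v = np.array([2.0, -1.0, 0.5])

# Matrix-vector multiplication as a linear combination of the columns of M.
combo = sum(v[i] * M[:, i] for i in range(M.shape[1]))
assert np.allclose(M @ v, combo)

# Matrix multiplication: entry (i, j) of MB is the dot product of row i of M
# with column j of B.
B = np.array([[1.0, 0.0],
              [2.0, 1.0],
              [0.0, -1.0]])
MB = M @ B
assert np.isclose(MB[1, 0], np.dot(M[1, :], B[:, 0]))
```
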
**Definition 1.2.18 **  The **rank**, denoted $\text{Rank}(M)$, of a matrix $M$ is the dimension of the range.

**Proposition 1.2.19 **  The kernel and range of a matrix are subspaces of their respective vector spaces.

**Theorem 1.2.20 **  The **rank-nullity theorem** states that the sum of the rank and the nullity is equal to the number of columns. \[\text{Rank}(M)+\text{Nullity}(M) = \text{number of columns of } M\]

**Definition 1.2.21 **  A matrix $M$ is **invertible** iff there exists a matrix $M^{-1}$ such that $MM^{-1} = M^{-1}M = I$.

**Definition 1.2.22 **  The **determinant**, denoted $\text{det}(A)\in\mathbb{C}$, of a matrix $A\in\mathbb{C}^{n\times n}$ with elements $a_{ij}\in\mathbb{C}$ is \[\text{det}(A) = \sum_{\sigma\in\text{Perm}(\{1,2,\dots, n\})} (-1)^{K(\sigma)}\, a_{\sigma(1),1}a_{\sigma(2),2}\cdots a_{\sigma(n),n}\] where $K(\sigma)$ is the number of transpositions needed to produce the permutation $\sigma$.

**Theorem 1.2.23 **  For any $A\in\mathbb{C}^{n\times n}$, the following are equivalent:
1. $A$ is invertible.
2. $\text{Rank}(A) = n$
3. $\text{Ran}(A) = \mathbb{C}^n$
4. $\text{Ker}(A) = \{\mathbf{0}\}$
5. $\text{det}(A)\neq 0$

### 1.3 Linear Transformations

**Definition 1.3 **  A **linear transformation** is a function $T:V\to W$ for vector spaces $V,W$ such that
1. $T(\mathbf{v}+\mathbf{w}) = T(\mathbf{v})+ T(\mathbf{w})$ for all $\mathbf{v},\mathbf{w}\in V$
2. $T(\alpha\mathbf{v}) = \alpha T(\mathbf{v})$ for all $\mathbf{v}\in V$ and scalars $\alpha$

### 1.4 Orthogonal and Orthonormal Vectors

**Definition 1.4.1 **  The **2-norm**, denoted $||\mathbf{v}||$, of a vector $\mathbf{v}\in\mathbb{C}^n$ is the real number $||\mathbf{v}|| = \sqrt{\langle \mathbf{v}, \mathbf{v}\rangle}$.

**Definition 1.4.2 **  Two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ are **orthogonal** iff $\langle \mathbf{v},\mathbf{w}\rangle = 0$.

**Definition 1.4.3 **  Two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ are **orthonormal** iff they are orthogonal and $||\mathbf{v}||=||\mathbf{w}||=1$.

**Definition 1.4.4 **  The **distance** between two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ is the 2-norm of their difference $||\mathbf{v}-\mathbf{w}||$.

**Definition 1.4.5 **  The **angle** between two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ is $\arccos\frac{\langle \mathbf{v},\mathbf{w} \rangle}{||\mathbf{v}||\cdot||\mathbf{w}||}$.

**Definition 1.4.6 **  The **orthogonal subspace**, denoted $W^{\perp}\subset \mathbb{C}^n$, of a subset $W\subset\mathbb{C}^n$ is the subspace of vectors that are orthogonal to every vector in $W$.

**Proposition 1.4.7 **  A set of nonzero, pairwise orthogonal vectors is linearly independent.

**Proposition 1.4.8 **  If $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_n\}\subset\mathbb{C}^n$ is orthonormal then any vector $\mathbf{v}\in\mathbb{C}^n$ can be written as $\mathbf{v} = \langle \mathbf{v}_1, \mathbf{v}\rangle\mathbf{v}_1 + \langle \mathbf{v}_2, \mathbf{v}\rangle\mathbf{v}_2 + \dots + \langle \mathbf{v}_n, \mathbf{v}\rangle\mathbf{v}_n$.

**Proposition 1.4.9 **  If $\{\mathbf{v}_1,\mathbf{v}_2,\dots\}$ is a basis of a subspace $W\subset \mathbb{C}^n$ and $\{\mathbf{w}_1,\mathbf{w}_2,\dots\}$ is a basis of $W^\perp\subset \mathbb{C}^n$, then $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{w}_1,\mathbf{w}_2,\dots\}$ is a basis of $\mathbb{C}^n$ and $\dim(W)+\dim(W^{\perp}) = n$.

**Proposition 1.4.10 **  For any matrix $A\in\mathbb{C}^{m\times n}$,
1. $\text{Ker}(A)^\perp = \text{Ran}(A^*)$
2. $\text{Ran}(A)^\perp = \text{Ker}(A^*)$

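The expansion in Proposition 1.4.8 is easy to verify numerically. This sketch builds an (arbitrary) orthonormal basis from the QR factorization of a random complex matrix; the random data is purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
Q, _ = np.linalg.qr(X)          # columns q_1,...,q_n form an orthonormal basis of C^n
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# v = sum_i <q_i, v> q_i, where <q_i, v> = conj(q_i)^T v.
coeffs = Q.conj().T @ v
assert np.allclose(Q @ coeffs, v)
```
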
### 1.5 Hermitian and Unitary Matrices

**Definition 1.5.1 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **Hermitian** iff $A^*=A$.

**Definition 1.5.2 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **unitary** iff $A^*A=AA^*=I$.

**Proposition 1.5.3 **  If $Q\in\mathbb{C}^{n\times n}$ is unitary, then
1. $||Q\mathbf{u}||=||\mathbf{u}||, \quad\forall \mathbf{u}\in\mathbb{C}^n$.
2. The rows and columns of $Q$ are an orthonormal basis of $\mathbb{C}^n$.

**Theorem 1.5.4 **  The **Eigenvalue Decomposition of Hermitian Matrices Theorem** states that any Hermitian matrix $A\in\mathbb{C}^{n\times n}$ can be diagonalized by a unitary matrix $Q\in\mathbb{C}^{n\times n}$, that is \[A = Q\begin{pmatrix}\lambda_1 & & \\ & \ddots & \\ & & \lambda_n\end{pmatrix}Q^*\] where $\lambda_1,\dots,\lambda_n$ are the (real) eigenvalues of $A$ and the columns of $Q$ are eigenvectors of $A$.

### 1.6 Singular Value Decomposition

**Definition 1.6.1 **  A **singular value** of a matrix $A\in\mathbb{C}^{n\times m}$ is a real number $\sigma > 0$ such that $\sigma^2$ is an eigenvalue of $A^*A$ or $AA^*$.

**Theorem 1.6.2 **  The **Full Singular Value Decomposition Theorem** states that for any $A\in\mathbb{C}^{n\times m}$, there exist a unitary $U\in\mathbb{C}^{n\times n}$, a unitary $V\in\mathbb{C}^{m\times m}$ and a semi-diagonal $\Sigma\in\mathbb{C}^{n\times m}$ such that \[A = U\Sigma V^*\] where the first $r$ diagonal elements of $\Sigma$ are the singular values $\sigma_1,\dots,\sigma_r$ of $A$ and the remaining elements of $\Sigma$ are zero.

**Theorem 1.6.3 **  The **Reduced Singular Value Decomposition Theorem** states that for any $A\in\mathbb{C}^{n\times m}$, there exist $U\in\mathbb{C}^{n\times r}$ and $V\in\mathbb{C}^{m\times r}$ with orthonormal columns and a diagonal $\Sigma\in\mathbb{C}^{r\times r}$ such that \[A = U\Sigma V^*\] where the diagonal elements of $\Sigma$ are the singular values $\sigma_1,\dots,\sigma_r$ of $A$.

**Proposition 1.6.4 **  For any matrix $A\in\mathbb{C}^{n\times m}$, the eigenvalue decomposition $AA^* = U(\Sigma\Sigma^*)U^*$ and the eigenvalue decomposition $A^*A=V(\Sigma^*\Sigma)V^*$ determine the singular value decomposition $A=U\Sigma V^*$.

**Proposition 1.6.5 **  The number of (nonzero) singular values of a matrix is its rank.

**Proposition 1.6.6 **  For any matrix $A\in\mathbb{C}^{n\times m}$ with full singular value decomposition $A=U\Sigma V^*$ and any vector $x\in\mathbb{C}^m$, \[Ax = \sum_{j=1}^r{\sigma_j\langle v_j,x\rangle}u_j\] where $\sigma_1,\dots,\sigma_r$ are the singular values of $A$, $v_j$ are the columns of $V$ and $u_j$ are the columns of $U$.

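A short NumPy sketch of the full SVD (Theorem 1.6.2) and of the expansion in Proposition 1.6.6; the matrix `A` and vector `x` below are arbitrary test data.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
U, s, Vh = np.linalg.svd(A)      # full SVD: U is 5x5, Vh = V^*, s = (sigma_1, sigma_2, sigma_3)

# Rebuild the semi-diagonal Sigma and check A = U Sigma V^*.
Sigma = np.zeros((5, 3))
Sigma[:3, :3] = np.diag(s)
assert np.allclose(A, U @ Sigma @ Vh)

# A x = sum_j sigma_j <v_j, x> u_j over the nonzero singular values.
x = rng.standard_normal(3)
Ax = sum(s[j] * np.dot(Vh[j, :], x) * U[:, j] for j in range(len(s)))
assert np.allclose(A @ x, Ax)
```
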
### 1.7 Norms

**Definition 1.7.1 **  A **norm** is a function $||\cdot||:V\to[0,\infty)$ with the following properties:
1. *Positive definiteness*: $||v||\geq0$, $\forall v\in V$, with $||v|| = 0$ iff $v=\mathbf{0}$
2. *Scaling*: $||cv||=|c|\,||v||$, $\forall v\in V$ and scalars $c$
3. *Triangle inequality*: $||u+v||\leq ||u||+||v||$, $\forall u,v\in V$

**Definition 1.7.2 **  The **p-norm**, denoted $||\mathbf{v}||_p$, of a vector $\mathbf{v}\in\mathbb{C}^n$ for $1\leq p <\infty$ is the real number $||\mathbf{v}||_p = \left(\sum_{i=1}^n{|v_i|^p}\right)^{\frac{1}{p}}$.

**Definition 1.7.3 **  The **$\infty$-norm**, denoted $||\mathbf{v}||_\infty$, of a vector $\mathbf{v}\in\mathbb{C}^n$ is the real number $||\mathbf{v}||_\infty = \max_{1\leq j \leq n}|v_j|$.

**Proposition 1.7.5 **  **Hölder's inequality** states that for $1\leq p, q \leq \infty$ such that $\frac{1}{p}+\frac{1}{q}=1$, and for any $x,y\in\mathbb{C}^n$, \[|x\cdot y|\leq||x||_p||y||_q\]

### 1.8 Matrix Norms

**Definition 1.8.1 **  The **matrix norm**, denoted $||A||$, of a matrix $A\in\mathbb{C}^{m\times n}$ induced by a vector norm $||\cdot||$ is the real number $||A|| = \max_{x\in \mathbb{C}^n\setminus\{0\}}\frac{||Ax||}{||x||}=\max_{||x||=1}||Ax||$.

**Definition 1.8.2 **  The **matrix p-q-norm**, denoted $||A||_{p,q}$, of a matrix $A\in\mathbb{C}^{m\times n}$ for $1\leq p,q \leq \infty$ is the real number $||A||_{p,q} = \max_{x\in \mathbb{C}^n\setminus\{0\}}\frac{||Ax||_p}{||x||_q}=\max_{||x||_q=1}||Ax||_p$.

**Proposition 1.8.3 **  For two matrices $A\in\mathbb{C}^{m\times k}$ and $B\in\mathbb{C}^{k\times n}$, the following inequality holds for any $1\leq p,q,r\leq \infty$. \[||AB||_{p,r}\leq ||A||_{p,q}||B||_{q,r}\]

**Proposition 1.8.4 **  The matrix $1$-norm is the maximum of the column $1$-norms. For any matrix $A\in\mathbb{C}^{m\times n}$ with column vectors $\{\mathbf{c}_1,\dots,\mathbf{c}_n\}$, $||A||_1 = \max_{j\in \{1,\dots,n\}}||\mathbf{c}_j||_1$.

**Proposition 1.8.5 **  Multiplication by unitary matrices preserves the matrix $2$-norm. For any $A\in\mathbb{C}^{m\times n}$ and unitary $U\in\mathbb{C}^{m\times m}$, $V\in\mathbb{C}^{n\times n}$, \[||UA||_2 = ||AV||_2 = ||A||_2\]

**Definition 1.8.6 **  The **Frobenius norm**, denoted $||A||_F$, of a matrix $A\in\mathbb{C}^{m\times n}$ is the real number $||A||_F = \sqrt{\text{Tr}(A^*A)} = \sqrt{\sum_{i=1}^m{\sum_{j=1}^n{|A_{i,j}|^2}}}$.

**Proposition 1.8.7 **  Multiplication by unitary matrices preserves the Frobenius norm. For any $A\in\mathbb{C}^{m\times n}$ and unitary $U\in\mathbb{C}^{m\times m}$, $V\in\mathbb{C}^{n\times n}$, \[||UA||_F = ||AV||_F = ||A||_F\]

**Corollary 1.8.8 **  For $A\in\mathbb{C}^{m\times n}$ with singular values $\sigma_1,\dots,\sigma_r$, $||A||_F = \sqrt{\sigma_1^2+\dots+\sigma_r^2}$.

### 1.9 Orthogonal Projectors

**Definition 1.9.1 **  The **projection**, denoted $\text{Proj}_W(\mathbf{v})\in W$, of a vector $\mathbf{v}\in\mathbb{C}^n$ onto a subspace $W\subset \mathbb{C}^n$ is defined for an orthonormal basis $\{\mathbf{u}_1,\dots,\mathbf{u}_r\}$ of $W$ by \[\text{Proj}_W(\mathbf{v}) = \langle \mathbf{u}_1,\mathbf{v} \rangle \mathbf{u}_1 + \dots + \langle \mathbf{u}_r,\mathbf{v} \rangle \mathbf{u}_r \]

**Proposition 1.9.2 **  The projections $\text{Proj}_{W}(\mathbf{v}),\text{Proj}_{W^\perp}(\mathbf{v})$ of $\mathbf{v}\in \mathbb{C}^n$ onto $W,W^\perp\subset\mathbb{C}^n$ are orthogonal.

**Definition 1.9.3 **  An **orthogonal projector matrix** is a matrix $P\in\mathbb{C}^{n\times n}$ such that $P^2 = P^* = P$.

**Proposition 1.9.4 **  If $P$ is an orthogonal projector, then $I-P$ is also an orthogonal projector.

**Proposition 1.9.5 **  The product $\mathbf{u}\mathbf{u}^*$ for $\mathbf{u}\in\mathbb{C}^n$ such that $||\mathbf{u}||_2=1$ is an orthogonal projector that projects onto the span of $\mathbf{u}$.

**Proposition 1.9.6 **  If $W\subset \mathbb{C}^n$ is a subspace with an orthonormal basis $\{\mathbf{u}_1,\dots,\mathbf{u}_k\}$, then the matrix $P=\mathbf{u}_1\mathbf{u}_1^*+\dots+\mathbf{u}_k\mathbf{u}_k^*$ is the orthogonal projector $\text{Proj}_W$ and $\text{Ran}(P)=W$.

**Proposition 1.9.7 **  If $P\mathbf{v}=\text{Proj}_W(\mathbf{v})\ \forall \mathbf{v}\in\mathbb{C}^n$, then $(I-P)\mathbf{v} = \text{Proj}_{W^\perp}(\mathbf{v})$.

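Propositions 1.9.2–1.9.6 can be checked directly. The sketch below forms $P = \mathbf{u}_1\mathbf{u}_1^*+\mathbf{u}_2\mathbf{u}_2^*$ from an arbitrary orthonormal pair (obtained via QR of random data) and verifies the projector identities.

```python
import numpy as np

rng = np.random.default_rng(2)
U, _ = np.linalg.qr(rng.standard_normal((5, 2)))   # orthonormal basis u_1, u_2 of a subspace W
P = U @ U.conj().T                                 # P = u_1 u_1^* + u_2 u_2^*

assert np.allclose(P @ P, P)                       # P^2 = P
assert np.allclose(P, P.conj().T)                  # P^* = P

# Proj_W(v) and Proj_{W^perp}(v) = (I - P) v are orthogonal (Proposition 1.9.2).
v = rng.standard_normal(5)
assert np.isclose(np.vdot(P @ v, (np.eye(5) - P) @ v), 0.0)
```
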
## 2 Matrix Decomposition

### 2.1 QR Decomposition

**Definition 2.1.1 **  The **full QR decomposition** of a matrix $A\in\mathbb{C}^{m\times n}$ is a unitary matrix $Q\in\mathbb{C}^{m\times m}$ and an upper triangular matrix $R\in\mathbb{C}^{m\times n}$ such that \[A = QR\]

**Definition 2.1.2 **  The **reduced QR decomposition** of a matrix $A\in\mathbb{C}^{m\times n}$ is a matrix $Q\in\mathbb{C}^{m\times n}$ with orthonormal columns and an upper triangular matrix $R\in\mathbb{C}^{n\times n}$ such that \[A = QR\]

**Theorem 2.1.3 **  For any $A\in\mathbb{C}^{m\times n}$ with $\text{Rank}(A)=n$, there exists a unique reduced QR decomposition with $r_{i,i}>0$ for $i\in\{1,\dots,n\}$. Let $\{a_1,\dots,a_n\}$ be the linearly independent columns of $A$ and let $\{q_1,\dots,q_n\}$ be an orthonormal set such that $\text{span}\{q_1,\dots,q_k\} = \text{span}\{a_1,\dots,a_k\}$ for all $k\in\{1,\dots,n\}$. Then the matrix $R$ with elements \[r_{i,j} = \begin{cases}\langle q_i, a_j \rangle & i \leq j \\ 0 & i>j\end{cases}\] and the matrix $Q$ with columns $\{q_1,\dots,q_n\}$ form this unique QR decomposition of $A$.

**Proposition 2.1.4 **  For any orthonormal set of vectors $\{q_1,\dots,q_n\}$, \[\text{Proj}_{\text{span}^\perp\{q_1,\dots,q_n\}} = \text{Proj}_{\text{span}^\perp\{q_1\}}\circ\text{Proj}_{\text{span}^\perp\{q_2\}}\circ\dots\circ\text{Proj}_{\text{span}^\perp\{q_n\}}\]

**Algorithm 2.1.5 **  The **classical Gram-Schmidt algorithm** can be used to calculate the $QR$ decomposition of a matrix $A$.

**Algorithm 2.1.6 **  The **modified Gram-Schmidt algorithm** can be used to calculate an orthonormal basis $\{q_1,\dots,q_n\}$ from a set of vectors $\{a_1,\dots,a_n\}$, as well as the unnormalized vectors $\{v_1,\dots,v_n\}$. The algorithm is a slight modification of classical Gram-Schmidt that reduces numerical error.

**Proposition 2.1.7 **  For any orthonormal set of vectors $\{q_1,\dots,q_n\}$, \[\text{Proj}_{\text{span}^\perp\{q_1,\dots,q_n\}} = \text{Proj}_{\text{span}^\perp\{q_1\}}\circ\text{Proj}_{\text{span}^\perp\{q_2\}}\circ\dots\circ\text{Proj}_{\text{span}^\perp\{q_n\}}\]

### 2.2 Householder QR Decomposition

**Definition 2.2.1 **  The **Householder QR decomposition** of a matrix $A\in\mathbb{C}^{m\times n}$ is a set of unitary matrices $\{Q_n,\dots,Q_1\}\subset\mathbb{C}^{m\times m}$ such that the matrix $R\in\mathbb{C}^{m\times n}$ defined below is upper triangular. \[R = Q_n\dots Q_1A\]

**Definition 2.2.2 **  The **Householder reflector** $H_v\in\mathbb{C}^{n\times n}$ for a unit vector $v\in\mathbb{C}^n$ is the following matrix. \[H_v = I-2vv^*\]

**Proposition 2.2.3 **  For $x,y\in\mathbb{R}^n$ with $||x||=||y||$ and $x\neq y$, the Householder reflector $H_v$ for $v=\frac{x-y}{||x-y||}$ maps $x$ to $y$. \[H_vx=y\]

**Proposition 2.2.4 **  Every $A\in\mathbb{R}^{m\times n}$ has a QR decomposition.

**Algorithm 2.2.5 **  The **Householder QR factorization algorithm** can be used to calculate the QR decomposition of a matrix $A=QR$. The algorithm computes $R$ by leaving the result in place of $A$, and then computes the columns of $Q$ by applying the Householder reflectors to each of the standard basis vectors.

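Since the pseudocode for Algorithm 2.2.5 is not reproduced in these notes, the following is a minimal NumPy sketch of Householder QR for real matrices, written from the description above; the function name `householder_qr`, the sign convention, and the test matrix are illustrative assumptions, not the notes' own implementation.

```python
import numpy as np

def householder_qr(A):
    """Compute A = Q R with Householder reflectors (real matrices only)."""
    m, n = A.shape
    R = A.astype(float).copy()
    vs = []                                        # one reflector vector per column
    for k in range(n):
        x = R[k:, k].copy()
        sign = np.sign(x[0]) if x[0] != 0 else 1.0
        y = np.zeros_like(x)
        y[0] = -sign * np.linalg.norm(x)           # x is reflected onto y (Proposition 2.2.3)
        v = x - y
        if np.linalg.norm(v) > 0:
            v = v / np.linalg.norm(v)
            R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])   # apply H_v to the trailing block
        vs.append(v)
    Q = np.eye(m)
    for k in reversed(range(n)):                   # Q = H_1 H_2 ... H_n applied to the identity,
        v = vs[k]                                  # i.e. to the standard basis vectors
        if np.linalg.norm(v) > 0:
            Q[k:, :] -= 2.0 * np.outer(v, v @ Q[k:, :])
    return Q, R

A = np.array([[2.0, 1.0],
              [1.0, 3.0],
              [0.0, 1.0]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A) and np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(np.tril(R, -1), 0.0)            # R is upper triangular
```

The sign of $y$ is chosen opposite to the leading entry of $x$, the usual choice for avoiding cancellation; note that this sketch does not enforce positive diagonal entries in $R$.
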
### 2.3 LU Decomposition

**Definition 2.3.1 **  The **LU decomposition** of a square matrix $A\in\mathbb{C}^{n\times n}$ is a lower triangular matrix $L\in\mathbb{C}^{n\times n}$ and an upper triangular matrix $U\in\mathbb{C}^{n\times n}$ such that \[A=LU\]

**Proposition 2.3.2 **  *Gaussian elimination* can be used to construct the matrices $L\in\mathbb{C}^{n\times n}$ and $U\in\mathbb{C}^{n\times n}$ in terms of a set of lower triangular matrices $\{L_1^{-1},\dots,L_{n-1}^{-1}\}\subset\mathbb{C}^{n\times n}$ such that \[A = LU = (L_1^{-1}L_2^{-1}\cdots L_{n-1}^{-1})(L_{n-1}\cdots L_2L_1A)\] where $U = L_{n-1}\cdots L_2L_1A$ is upper triangular and $L = L_1^{-1}L_2^{-1}\cdots L_{n-1}^{-1}$ is lower triangular.

**Proposition 2.3.3 **  Let $L_k\in\mathbb{C}^{n\times n}$ be the row operation matrix defined by \[L_{k} = \begin{pmatrix} 1& & & & & \\ &\ddots& & & & \\ & &1& & & \\ & &-\ell_{k+1,k}&\ddots & & \\ & &\vdots& &\ddots & \\ & &-\ell_{n,k}& & &1 \end{pmatrix}\] where $\ell_{j,k} = \frac{a_{j,k}}{a_{k,k}}$ for $j = k+1,\dots,n$. The inverse $L_k^{-1}$ is the matrix \[L_{k}^{-1} = \begin{pmatrix} 1& & & & & \\ &\ddots& & & & \\ & &1& & & \\ & &\ell_{k+1,k}&\ddots & & \\ & &\vdots& &\ddots & \\ & &\ell_{n,k}& & &1 \end{pmatrix}\]

**Algorithm 2.3.4 **  The **LU decomposition algorithm without pivoting** can be used to compute the LU decomposition of a matrix $A$ by using Gaussian elimination.

**Algorithm 2.3.5 **  The **LU decomposition algorithm with partial pivoting** can be used to compute the LU decomposition of a matrix $A$ by using Gaussian elimination in a more numerically stable way than without pivoting.

### 2.4 Cholesky Decomposition

**Definition 2.4.1 **  A **positive definite matrix** $A\in\mathbb{C}^{n\times n}$ is a Hermitian matrix such that $x^*Ax\geq 0,\ \forall x\in\mathbb{C}^n$, with equality only when $x=\mathbf{0}$.

**Proposition 2.4.2 **  If a matrix $A\in\mathbb{C}^{n\times n}$ is positive definite, then its block diagonal sub-matrices are also positive definite.

**Definition 2.4.3 **  The **Cholesky decomposition** of a positive definite matrix $A\in\mathbb{C}^{n\times n}$ is an upper triangular matrix $R\in\mathbb{C}^{n\times n}$ with positive diagonal elements such that \[ A = R^*R\]

**Theorem 2.4.4 **  Any positive definite matrix has a Cholesky decomposition.

**Algorithm 2.4.5 **  The **Cholesky decomposition algorithm** can be used to compute the Cholesky decomposition of a matrix $A\in\mathbb{C}^{n\times n}$.

## 3 Eigenvalue Problems

### 3.1 Rayleigh Quotient and Inverse Iteration

**Proposition 3.1.1 **  Let $|\lambda_1| > |\lambda_2| > \dots > |\lambda_n|$ be the eigenvalues of a real matrix $A\in\mathbb{R}^{n\times n}$ with corresponding eigenbasis $q_1,q_2,\dots,q_n\in\mathbb{R}^n$. If $v^{(0)}\in\mathbb{R}^n$ is some vector such that $\langle q_1, v^{(0)}\rangle \neq 0$, then \[v^{(k)} = \frac{A^kv^{(0)}}{||A^kv^{(0)}||} \to \pm q_1\quad \text{ as } k\to \infty\]

**Definition 3.1.2 **  The **Rayleigh quotient** of a symmetric matrix $A\in\mathbb{R}^{n\times n}$ is the function $r_A:\mathbb{R}^{n} \setminus \{0\} \to \mathbb{R}$ defined by $r_A(x) = \frac{x^T Ax}{||x||^2}$.

**Proposition 3.1.3 **  A nonzero vector $x\in\mathbb{R}^n$ is an eigenvector of a symmetric matrix $A\in\mathbb{R}^{n\times n}$ if and only if $x$ is a critical point of $r_A(x)$.

**Algorithm 3.1.4 **  The **power iteration algorithm** can be used to approximate the eigenvalue of largest absolute value of a matrix $A$ and a corresponding eigenvector.

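A minimal sketch of power iteration (Algorithm 3.1.4) under the assumptions of Proposition 3.1.1; the helper `power_iteration`, the iteration count, and the symmetric test matrix are illustrative choices, not taken from the notes.

```python
import numpy as np

def power_iteration(A, v0, iters=200):
    v = v0 / np.linalg.norm(v0)
    for _ in range(iters):
        w = A @ v                         # v^(k) = A^k v^(0) / ||A^k v^(0)||
        v = w / np.linalg.norm(w)
    lam = (v @ A @ v) / (v @ v)           # Rayleigh quotient r_A(v) as eigenvalue estimate
    return lam, v

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])           # symmetric, eigenvalues with distinct absolute values
lam, v = power_iteration(A, np.ones(3))
assert np.allclose(A @ v, lam * v, atol=1e-8)
```
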
**Algorithm 3.1.5 **  The **inverse iteration algorithm** can be used to approximate the eigenvalue of a matrix $A$ closest to a given shift $\mu$ and a corresponding eigenvector, by applying power iteration to $(A-\mu I)^{-1}$.

**Algorithm 3.1.6 **  The **Rayleigh quotient iteration algorithm** can be used to approximate eigenvalues and corresponding eigenvectors of a matrix $A$, updating the shift of inverse iteration with the Rayleigh quotient at each step.

### 3.2 Schur Decomposition

**Definition 3.2.1 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **similar** to another matrix $B\in\mathbb{C}^{n\times n}$ iff there exists a non-singular matrix $S\in\mathbb{C}^{n\times n}$ such that $A = S^{-1}BS$.

**Definition 3.2.2 **  The **similarity transformation** for a non-singular matrix $S\in\mathbb{C}^{n\times n}$ is the mapping $\text{Sim}_S : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ defined by \[\text{Sim}_S(A) = S^{-1}AS\]

**Definition 3.2.3 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **diagonalizable** iff there exists a non-singular $S\in\mathbb{C}^{n\times n}$ such that $S^{-1}AS$ is a diagonal matrix.

**Definition 3.2.4 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **unitarily diagonalizable** iff there exists a unitary $U\in\mathbb{C}^{n\times n}$ such that $U^* AU$ is a diagonal matrix.

**Definition 3.2.5 **  A matrix $A\in\mathbb{C}^{n\times n}$ is **unitarily triangularizable** iff there exists a unitary $U\in\mathbb{C}^{n\times n}$ such that $U^* AU$ is an upper triangular matrix.

**Definition 3.2.6 **  The **Schur decomposition** of $A\in\mathbb{C}^{n\times n}$ is a unitary matrix $Q$ and an upper triangular matrix $T$ such that \[A = QTQ^*\]

**Proposition 3.2.7 **  A matrix $A$ is diagonalizable iff $A$ has $n$ eigenvectors that form a basis of $\mathbb{C}^n$.

**Proposition 3.2.8 **  A matrix $A$ is unitarily diagonalizable iff $A^*A = AA^*$ (i.e. iff $A$ is normal).

**Proposition 3.2.9 **  Any matrix $A\in\mathbb{C}^{n\times n}$ has a Schur decomposition $A = QTQ^*$.

**Proposition 3.2.10 **  For $A\in\mathbb{C}^{n\times n}$ let $\{A_k\}$ be the sequence of matrices defined by $A_0= A$, $A_{k-1} = Q_kR_k$, and $A_k = R_k Q_k$. Then
1. $A_k$ has the same eigenvalues as $A$.
2. $A^{k} = (Q_1\dots Q_k)(R_k\dots R_1)$.

**Theorem 3.2.11 **  If $A\in\mathbb{C}^{n\times n}$ is a matrix with eigenvalues $\lambda_1,\dots,\lambda_n$ such that $|\lambda_1| > \dots > |\lambda_n| > 0$, then the sequence of matrices $\{T_k\}$ defined by $T_0= A$, $T_{k-1} = Q_kR_k$, and $T_k = R_k Q_k$ converges to $T$, and $Q_1Q_2\dots$ converges to $Q$, in the Schur decomposition $A = QTQ^*$.

**Algorithm 3.2.12 **  The **pure QR Schur decomposition algorithm** can be used to iteratively compute the Schur decomposition of a matrix $A=QTQ^*$.

**Definition 3.2.13 **  A matrix $H\in\mathbb{C}^{n\times n}$ is **upper Hessenberg** iff $H_{i,j}=0$ for $i>j+1$.

**Proposition 3.2.14 **  All square matrices are unitarily similar to an upper Hessenberg matrix.

**Algorithm 3.2.15 **  The **practical QR Schur decomposition algorithm** can be used to iteratively compute the Schur decomposition of a matrix $A=QTQ^*$ more efficiently, by first reducing $A$ to upper Hessenberg form.

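A minimal sketch of the QR iteration of Proposition 3.2.10 / Algorithm 3.2.12; applied to a symmetric test matrix with distinct eigenvalues, the iterates converge to a diagonal $T$. The helper name `pure_qr`, the iteration count, and the test data are illustrative assumptions.

```python
import numpy as np

def pure_qr(A, iters=500):
    T = A.astype(float).copy()
    Q_total = np.eye(A.shape[0])
    for _ in range(iters):
        Q, R = np.linalg.qr(T)            # factor A_{k-1} = Q_k R_k
        T = R @ Q                          # A_k = R_k Q_k, similar to A_{k-1}
        Q_total = Q_total @ Q              # accumulate Q_1 Q_2 ... Q_k
    return T, Q_total

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
T, Q = pure_qr(A)
assert np.allclose(Q @ T @ Q.T, A, atol=1e-8)       # A = Q T Q^*
assert np.allclose(np.sort(np.diag(T)),              # diagonal of T holds the eigenvalues
                   np.sort(np.linalg.eigvalsh(A)), atol=1e-6)
```
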
### 3.3 Arnoldi and Lanczos Iteration

**Definition 3.3.1 **  The **Krylov subspace of order r** generated by $A\in\mathbb{C}^{n\times n}$ and $b\in\mathbb{C}^n$ is the subspace $\mathcal{K}_r(A,b)$ defined by \[\mathcal{K}_r(A,b) = \text{span}\{b,Ab,A^2b,\dots,A^{r-1}b\}\]

**Definition 3.3.2 **  The **Krylov matrix of order r** generated by $A\in\mathbb{C}^{n\times n}$ and $b\in\mathbb{C}^n$ is the matrix $K_r(A,b)$ defined by \[K_r(A,b) = \begin{pmatrix} b & Ab & A^2b & \cdots & A^{r-1}b \end{pmatrix}\]

**Definition 3.3.3 **  A **monic polynomial of degree d** is a function $p:\mathbb{C}\to\mathbb{C}$ of the form \[p(t) = t^d + c_{d-1}t^{d-1} + \dots + c_{1}t + c_0\]

**Definition 3.3.4 **  The **minimal polynomial of A with respect to b** is the non-zero monic polynomial $p$ of lowest degree such that $p(A)b=0$.

**Proposition 3.3.5 **  Let $m$ be the degree of the minimal polynomial of $A$ with respect to $b$, then
- $\{b,Ab,\dots,A^{r-1}b\}$ is linearly independent for $r\leq m$.
- $A^mb,A^{m+1}b,\dots \in \mathcal{K}_m(A,b)$.

**Algorithm 3.3.6 **  The **Arnoldi iteration algorithm** can be used to build an orthonormal basis of the Krylov subspace $\mathcal{K}_r(A,b)$ of a matrix $A\in\mathbb{C}^{n\times n}$, together with a small upper Hessenberg matrix whose eigenvalues approximate eigenvalues of $A$.

**Algorithm 3.3.7 **  The **Lanczos iteration algorithm** is the specialization of Arnoldi iteration to Hermitian $A\in\mathbb{C}^{n\times n}$, in which the small upper Hessenberg matrix becomes tridiagonal.

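A minimal sketch of Arnoldi iteration (Algorithm 3.3.6), written from the Krylov-subspace definitions above; the helper `arnoldi`, the breakdown tolerance, and the random test data are illustrative assumptions.

```python
import numpy as np

def arnoldi(A, b, r):
    """Orthonormal basis Q of K_{r+1}(A, b) and (r+1) x r Hessenberg H with A Q[:, :r] = Q H."""
    n = A.shape[0]
    Q = np.zeros((n, r + 1))
    H = np.zeros((r + 1, r))
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(r):
        w = A @ Q[:, k]                    # expand the Krylov subspace
        for j in range(k + 1):             # Gram-Schmidt against q_1,...,q_{k+1}
            H[j, k] = Q[:, j] @ w
            w -= H[j, k] * Q[:, j]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] < 1e-12:            # breakdown: the Krylov subspace is A-invariant
            raise ValueError("Krylov subspace became invariant")
        Q[:, k + 1] = w / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(3)
A = rng.standard_normal((8, 8))
A = A + A.T                               # symmetric, so H is (nearly) tridiagonal (Lanczos setting)
b = rng.standard_normal(8)
Q, H = arnoldi(A, b, 5)
assert np.allclose(A @ Q[:, :5], Q @ H, atol=1e-10)   # Arnoldi relation A Q_r = Q_{r+1} H
ritz = np.linalg.eigvals(H[:5, :5])       # Ritz values: approximations to eigenvalues of A
```
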