Numerical Linear Algebra

1 Fundamental Linear Algebra

1.1 Vector Spaces

Definition 1.1.1  A Vector Space is a set $V$ equipped with addition and scalar multiplication with the following properties

  1. Commutativity: $\mathbf{v}+\mathbf{w}=\mathbf{w}+\mathbf{v}$, $\forall\ \mathbf{v},\mathbf{w}\in V$
  2. Associativity: $(\mathbf{u} + \mathbf{v}) + \mathbf{w} = \mathbf{u}+(\mathbf{v}+\mathbf{w})$, $\forall\ \mathbf{u},\mathbf{v},\mathbf{w}\in V$
  3. Zero Vector: $\exists\ \mathbf{0}\in V$ such that $\mathbf{v}+\mathbf{0}=\mathbf{v}$, $\forall\ \mathbf{v}\in V$
  4. Additive inverse: For every $\mathbf{v}\in V$, there exists $-\mathbf{v}\in V$ such that $\mathbf{v}+(-\mathbf{v})=\mathbf{0}$
  5. Multiplicative Identity: $1\mathbf{v} = \mathbf{v}$ for all $\mathbf{v}\in V$
  6. Multiplicative Associativity: $(\alpha\beta)\mathbf{v}$ = $\alpha(\beta\mathbf{v})$ for all $\mathbf{v}\in V$ and scalars $\alpha, \beta$
  7. Additive Closure: $\mathbf{v}+\mathbf{w}\in V$ for all $\mathbf{v},\mathbf{w}\in V$
  8. Multiplicative Closure: $\alpha\mathbf{v}\in V$ for all $\mathbf{v}\in V$ and scalars $\alpha$
  9. $\alpha(\mathbf{u}+\mathbf{v}) = \alpha\mathbf{u} + \alpha\mathbf{v}$ for all $\mathbf{u},\mathbf{v}\in V$ and scalars $\alpha$
  10. $(\alpha + \beta)\mathbf{v}=\alpha\mathbf{v}+\beta\mathbf{v}$ for all $\mathbf{v}\in V$ and scalars $\alpha, \beta$

Proposition 1.1.2  $\mathbb{C}^n$ is a vector space with scalars in $\mathbb{C}$.

Definition 1.1.3  A subspace of a vector space $V$ is a subset $S\subset V$ that has additive closure and multiplicative closure.

Definition 1.1.4  A linear combination of a subset of vectors $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_n\}\subset V$ for scalars $\{a_1,a_2,\dots,a_n\}$ is the sum
\[a_1\mathbf{v}_1+a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \sum_{i=1}^na_i\mathbf{v}_i\]

Definition 1.1.5  A linearly independent set of vectors $\{\mathbf{v}_i\}$ is a set of vectors such that only the trivial linear combination produces the zero vector $\mathbf{0}$.

Definition 1.1.6  A linearly dependent set of vectors $\{\mathbf{v}_i\}$ is a set of vectors such that some non-trivial linear combination produces the zero vector $\mathbf{0}$.

Definition 1.1.7  The span of a set of vectors is the set of all possible linear combinations.

Definition 1.1.8  A basis of a vector space $V$ is a linearly independent set of vectors whose span is $V$.

Definition 1.1.9  The dimension denoted $\text{Dim}(V)$ of a vector space $V$ is the number of vectors in any basis of $V$.

Definition 1.1.10  The inner product denoted $\mathbf{v}\cdot\mathbf{w}\in\mathbb{C}$ of two vectors $\mathbf{v},\mathbf{w}\in V$ is the sum of the products of their elements, with the elements of $\mathbf{v}$ conjugated.
\[\mathbf{v}\cdot\mathbf{w} = \langle \mathbf{v},\mathbf{w} \rangle = \mathbf{v}^*\mathbf{w} = \sum_i{\bar{v_i}w_i} \in \mathbb{C}\]

Definition 1.1.11  The outer product denoted $\mathbf{v}\mathbf{w}^*\in\mathbb{C}^{n\times n}$ of two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^{n}$ is the matrix of products of their elements, with the elements of $\mathbf{w}$ conjugated.
\[\mathbf{v}\mathbf{w}^* = (v_i\bar{w}_j) \in \mathbb{C}^{n\times n}\]

Definition 1.1.12  The elementary vectors denoted $\mathbf{e}_i\in\mathbb{C}^n$ are the vectors whose $i$-th element is $1$ and whose other elements are all zero.

1.2 Matrices

Definition 1.2.1  A linear transformation is a function $T:V\to W$ for vector spaces $V,W$ such that

  1. $T(\mathbf{v}+\mathbf{w}) = T(\mathbf{v})+ T(\mathbf{w})$ for all $\mathbf{v},\mathbf{w}\in V$
  2. $T(\alpha\mathbf{v}) = \alpha T(\mathbf{v})$ for all $\mathbf{v}\in V$ and scalars $\alpha$

Definition 1.2.2  A matrix is a rectangular array of entries, which can be viewed as a grid of scalars, a list of column vectors, or a list of row vectors.

Definition 1.2.3  Matrix-vector multiplication denoted $M\mathbf{v}\in V_m$ for a matrix $M=(\mathbf{m}_1,\mathbf{m}_2,\dots,\mathbf{m}_n)\in M_{m\times n}$ and a vector $\mathbf{v}\in V_n$ is
\[M\mathbf{v} = \sum_{i=1}^{n}{v_i \mathbf{m}_i}\quad\text{where } \mathbf{m}_i \text{ are the column vectors of }M\]

Definition 1.2.4  Matrix multiplication denoted $AB\in M_{m\times n}$ for two matrices $A \in M_{m\times k}$ and $B\in M_{k\times n}$ is the matrix where the elements are the dot products of the row vectors of $A$ and the column vectors of $B$. Let $\mathbf{a}_i$ be the row vectors of $A$ and $\mathbf{b}_j$ be the column vectors of $B$, then the elements of $AB$ matrix are
\[(AB)_{i,j} = \mathbf{a}_i\cdot\mathbf{b}_j = \sum_{l=1}^{k}{a_{i,l}b_{l,j}}\]

Proposition 1.2.5  Any matrix $M\in \mathbb{C}^{m\times n}$ defines a linear transformation $M:\mathbb{C}^n\to\mathbb{C}^m$ via matrix-vector multiplication.

Proposition 1.2.6  Matrix multiplication is the same as function composition of the corresponding linear transformations.

Definition 1.2.7  The conjugate denoted $\overline{M}$ of a matrix $M=(m_{i,j})\in\mathbb{C}^{m\times n}$ is the matrix with complex conjugate elements of $M$.
\[\overline{M} = (\bar{m}_{i,j})\]

Definition 1.2.8  The transpose denoted $M^T$ of a matrix $M=(m_{i,j})\in\mathbb{C}^{m\times n}$ is the matrix with swapped rows and columns of $M$.
\[M^T = (m_{j,i})\]

Definition 1.2.9  The adjoint denoted $M^*$ of a matrix $M=(m_{ij})\in\mathbb{C}^{n\times m}$ is the conjugate transpose of $M$.
\[M^* = \overline{M}^T = (\bar{m}_{ji})\]

Definition 1.2.10  The inner product denoted $\mathbf{v}\cdot\mathbf{w}\in\mathbb{C}$ of two vectors $\mathbf{v},\mathbf{w}\in V$ is the sum of the products of their elements, with the elements of $\mathbf{v}$ conjugated.
\[\mathbf{v}\cdot\mathbf{w} = \langle \mathbf{v},\mathbf{w} \rangle = \mathbf{v}^*\mathbf{w} = \sum_i{\bar{v_i}w_i} \in \mathbb{C}\]

Definition 1.2.11  The outer product denoted $\mathbf{v}\mathbf{w}^*\in\mathbb{C}^{n\times n}$ of two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^{n}$ is the matrix of products of their elements, with the elements of $\mathbf{w}$ conjugated.
\[\mathbf{v}\mathbf{w}^* = (v_i\bar{w}_j) \in \mathbb{C}^{n\times n}\]

Definition 1.2.12  The elementary vectors denoted $\mathbf{e}_i\in\mathbb{C}^n$ are the vectors whose $i$-th element is $1$ and whose other elements are all zero.

Definition 1.2.13  The identity matrix is the matrix $I\in\mathbb{C}^{n\times n}$ such that $\mathbf{v} = I\mathbf{v}$ for all $\mathbf{v}\in \mathbb{C}^n$.
\[I = (\mathbf{e}_1,\mathbf{e}_2,\dots,\mathbf{e}_n)\]

Definition 1.2.14  The Kronecker delta is $\delta_{ij} = \begin{cases}
1, & \text{if } i = j\\
0, & \text{if } i\neq j
\end{cases}$.

Definition 1.2.15  The kernel or null space denoted $\text{Ker}(M)$ of a matrix $M$ is the set of vectors $\{ \mathbf{v}\in V : M\mathbf{v} = \mathbf{0} \}$ that map to zero.

Definition 1.2.16  The nullity denoted $\text{Nullity}(M)$ of a matrix $M$ is the dimension of the kernel.

Definition 1.2.17  The range denoted $\text{Ran}(M)$ of a matrix $M$ is the set of vectors $\{ \mathbf{w}\in W : \exists \mathbf{v}\in V\text{ s.t. }M\mathbf{v}=\mathbf{w} \}$ in the span of the columns of $M$.

Definition 1.2.18  The rank denoted $\text{Rank}(M)$ of a matrix $M$ is the dimension of the range.

Proposition 1.2.19  The kernel and range of a matrix are subspaces of their respective vector spaces.

Theorem 1.2.20  The rank nullity theorem states that the sum of the rank and the nullity is equal to the number of columns.
\[\text{Rank}(M)+\text{Nullity}(M) = \text{number of columns of }M\]

Definition 1.2.21  A matrix $M$ is invertible iff there exists a matrix $M^{-1}$ such that $MM^{-1} = M^{-1}M = I$.

Definition 1.2.22  The determinant denoted $\text{det}(A)\in\mathbb{C}$ of a matrix $A\in\mathbb{C}^{n\times n}$ with elements $a_{ij}\in\mathbb{C}$ is
\[\text{det}(A) = \sum_{\sigma\in\text{Perm}(\{1,2,\dots, n\})} (-1)^{K(\sigma)}\, a_{\sigma(1),1}a_{\sigma(2),2}\cdots a_{\sigma(n),n}\]
where $K(\sigma)$ is the number of inversions of the permutation $\sigma$, so that $(-1)^{K(\sigma)}$ is the sign of $\sigma$.

Theorem 1.2.23  For any $A\in\mathbb{C}^{n\times n}$, the following are equivalent:

  1. $A$ is invertible.
  2. $\text{Rank}(A) = n$
  3. $\text{Ran}(A) = \mathbb{C}^n$
  4. $\text{Ker}(A) = \{\mathbf{0}\}$
  5. $\text{det}(A)\neq 0$

1.4 Orthogonal and Orthonormal Vectors

Definition 1.4.1  The 2-norm denoted $||\mathbf{v}||$ of a vector $\mathbf{v}\in\mathbb{C}^n$ is the real number $||\mathbf{v}|| = \sqrt{\langle \mathbf{v}, \mathbf{v}\rangle}$

Definition 1.4.2  Two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ are orthogonal iff $\langle \mathbf{v},\mathbf{w}\rangle = 0$.

Definition 1.4.3  Two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ are orthonormal iff they are orthogonal and $||\mathbf{v}||=||\mathbf{w}||=1$.

Definition 1.4.4  The distance between two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ is the 2-norm of their difference $||\mathbf{v}-\mathbf{w}||$.

Definition 1.4.5  The angle between two vectors $\mathbf{v},\mathbf{w}\in\mathbb{C}^n$ is $\arccos\frac{\langle \mathbf{v},\mathbf{w} \rangle}{||\mathbf{v}||\cdot||\mathbf{w}||}$.

Definition 1.4.6  The orthogonal subspace denoted $W^{\perp}\subset \mathbb{C}^n$ of a subset $W\subset\mathbb{C}^n$ is the subspace of vectors that are orthogonal to every vector in $W$.

Proposition 1.4.7  Nonzero orthogonal vectors are linearly independent.

Proposition 1.4.8  If $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{v}_n\}\subset\mathbb{C}^n$ is orthonormal then any vector $\mathbf{v}\in\mathbb{C}^n$ can be written as $\mathbf{v} = \langle \mathbf{v}_1, \mathbf{v}\rangle\mathbf{v}_1 + \langle \mathbf{v}_2, \mathbf{v}\rangle\mathbf{v}_2 + \dots + \langle \mathbf{v}_n, \mathbf{v}\rangle\mathbf{v}_n$.

Proposition 1.4.9  If $\{\mathbf{v}_1,\mathbf{v}_2,\dots\}$ is a basis of a subspace $W\subset \mathbb{C}^n$ and $\{\mathbf{w}_1,\mathbf{w}_2,\dots\}$ is a basis of $W^\perp\subset \mathbb{C}^n$, then $\{\mathbf{v}_1,\mathbf{v}_2,\dots,\mathbf{w}_1,\mathbf{w}_2,\dots\}$ is a basis of a $\mathbb{C}^n$ and $\dim(W)+\dim(W^{\perp}) = n$.

Proposition 1.4.10  For any matrix $A\in\mathbb{C}^{m\times n}$,

  1. $\text{Ker}(A)^\perp = \text{Ran}(A^*)$
  2. $\text{Ran}(A)^\perp = \text{Ker}(A^*)$

1.5 Hermitian and Unitary Matrices

Definition 1.5.1  A matrix $A\in\mathbb{C}^{n\times n}$ is Hermitian iff $A^*=A$.

Definition 1.5.2  A matrix $A\in\mathbb{C}^{n\times n}$ is unitary iff $A^*A=AA^*=I$.

Proposition 1.5.3  If $Q\in\mathbb{C}^{n\times n}$ is unitary, then

  1. $||Q\mathbf{u}||=||\mathbf{u}||, \quad\forall \mathbf{u}\in\mathbb{C}^n$.
  2. The rows and columns of $Q$ are an orthonormal basis of $\mathbb{C}^n$.

Theorem 1.5.4  Eigenvalue Decomposition of Hermitian Matrices Theorem states that any Hermitian matrix $A\in\mathbb{C}^{n\times n}$ can be diagonalized by a unitary matrix $Q\in\mathbb{C}^{n\times n}$, that is
\[A = Q\begin{pmatrix}\lambda_1 & & \\ & \ddots & \\ & &\lambda_n\end{pmatrix}Q^*\]
where $\lambda_1,\dots,\lambda_n$ are the real eigenvalues of $A$ and the columns of $Q$ are orthonormal eigenvectors of $A$.
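
As a quick numerical check (a minimal NumPy sketch; the Hermitian matrix below is an arbitrary example, not one from these notes), numpy.linalg.eigh returns the real eigenvalues and a unitary matrix of eigenvectors, so the factorization above can be verified directly:

import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B + B.conj().T                     # arbitrary Hermitian example

lam, Q = np.linalg.eigh(A)             # real eigenvalues, unitary Q of eigenvectors
assert np.allclose(Q.conj().T @ Q, np.eye(4))
assert np.allclose(A, Q @ np.diag(lam) @ Q.conj().T)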

1.6 Singular Value Decomposition

Definition 1.6.1  A singular value of a matrix $A\in\mathbb{C}^{n\times m}$ is a real number $\sigma > 0$ such that $\sigma^2$ is an eigenvalue of $A^*A$ or $AA^*$.

Theorem 1.6.2  Full Singular Value Decomposition Theorem states that for any $A\in\mathbb{C}^{n\times m}$, there exists unitary $U\in\mathbb{C}^{n\times n}$, unitary $V\in\mathbb{C}^{m\times m}$ and semi-diagonal $\Sigma\in\mathbb{C}^{n\times m}$ such that
\[A = U\Sigma V^*\]
where the first $r$ diagonal elements of $\Sigma$ are the singular values $\sigma_1,\dots,\sigma_r$ of $A$ and the remaining elements of $\Sigma$ are zero.

Theorem 1.6.3  Reduced Singular Value Decomposition Theorem states that for any $A\in\mathbb{C}^{n\times m}$ of rank $r$, there exist $U\in\mathbb{C}^{n\times r}$ with orthonormal columns, $V\in\mathbb{C}^{m\times r}$ with orthonormal columns, and diagonal $\Sigma\in\mathbb{C}^{r\times r}$ such that
\[A = U\Sigma V^*\]
where the diagonal elements of $\Sigma$ are the singular values $\sigma_1,\dots,\sigma_r$ of $A$.

Proposition 1.6.4  For any matrix $A\in\mathbb{C}^{n\times m}$ with singular value decomposition $A=U\Sigma V^*$, the eigenvalue decomposition of $A^*A$ is $V(\Sigma^*\Sigma)V^*$ and the eigenvalue decomposition of $AA^*$ is $U(\Sigma\Sigma^*)U^*$; conversely, these two eigenvalue decompositions determine the singular value decomposition $A=U\Sigma V^*$.

Proposition 1.6.5  The number of singular values of a matrix is the rank.

Proposition 1.6.6  For any matrix $A\in\mathbb{C}^{n\times m}$ with full singular value decomposition $A=U\Sigma V^*$ and any vector $x\in\mathbb{C}^m$
\[Ax = \sum_{j=1}^r{\sigma_j\langle v_j,x\rangle}u_j\]
where $\sigma_1,\dots,\sigma_r$ are the singular values of $A$, $v_j$ are the columns of $V$ and $u_j$ are the columns of $U$.
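
The rank-one expansion above is easy to verify numerically. The sketch below (a real example matrix of our own choosing, so no conjugation issues) uses numpy.linalg.svd, which returns $U$, the singular values, and $V^*$:

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))        # arbitrary real example
x = rng.standard_normal(3)

U, s, Vh = np.linalg.svd(A, full_matrices=False)   # reduced SVD: A = U diag(s) Vh

# A x as a sum of rank-one contributions sigma_j <v_j, x> u_j.
Ax = sum(s[j] * (Vh[j] @ x) * U[:, j] for j in range(len(s)))
assert np.allclose(Ax, A @ x)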

1.7 Norms

Definition 1.7.1  A norm is a function $||\cdot||:V\to[0,\infty)$ with the following properties:

  1. Positive definiteness: $||v||\geq0$ for all $v\in V$, with $||v||=0$ iff $v=\mathbf{0}$
  2. Scaling: $||cv||=|c|\,||v||$, $\forall v\in V$ and scalars $c$
  3. Triangle inequality: $||u+v||\leq ||u||+||v||$, $\forall u,v\in V$

Definition 1.7.2  The p-norm denoted $||\mathbf{v}||_p$ of a vector $\mathbf{v}\in\mathbb{C}^n$ for $1\leq p <\infty$ is the real number $||\mathbf{v}||_p = \left(\sum_{i=1}^n{|v_i|^p}\right)^{\frac{1}{p}}$.

Definition 1.7.3  The $\infty$-norm denoted $||\mathbf{v}||_\infty$ of a vector $\mathbf{v}\in\mathbb{C}^n$ is the real number $||\mathbf{v}||_\infty = \max_{1\leq j \leq n}|v_j|$.

Proposition 1.7.5  Hölder's Inequality states that for $1\leq p, q < \infty$ with $\frac{1}{p}+\frac{1}{q}=1$ and any $x,y\in\mathbb{C}^n$,
\[|x\cdot y|\leq||x||_p||y||_q\]

1.8 Matrix Norms

Definition 1.8.1  The matrix norm denoted $||A||$ of a matrix $A\in\mathbb{C}^{m\times n}$ induced by a vector norm $||\cdot||$ is the real number $||A|| = \max_{x\in \mathbb{C}^n\setminus\{0\}}\frac{||Ax||}{||x||}=\max_{||x||=1}||Ax||$

Definition 1.8.2  The matrix p-q-norm denoted $||A||_{p,q}$ of a matrix $A\in\mathbb{C}^{m\times n}$ for $1\leq p,q \leq \infty$ is the real number $||A||_{p,q} = \max_{x\in \mathbb{C}^n\setminus\{0\}}\frac{||Ax||_p}{||x||_q}=\max_{||x||_q=1}||Ax||_p$

Proposition 1.8.3  For two matrices $A\in\mathbb{C}^{m\times k}$ and $B\in\mathbb{C}^{k\times n}$, the following inequality holds for any $1\leq p,q,r\leq \infty$.
\[||AB||_{p,r}\leq ||A||_{p,q}||B||_{q,r}\]

Proposition 1.8.4  The matrix $1$-norm is the max of the column sums. For any matrix $A\in\mathbb{C}^{m\times n}$ with column vectors $\{\mathbf{c}_1,\dots,\mathbf{c}_n\}$, $||A||_1 = \max_{j\in \{1,\dots,n\}}||\mathbf{c}_j||_1$.

Proposition 1.8.5  Matrix multiplication by unitary matrices preserves $2$-norms. For any $A\in\mathbb{C}^{m\times n}$ and unitary $U\in\mathbb{C}^{m\times m}$, $V\in\mathbb{C}^{n\times n}$,
\[||UA||_2 = ||AV||_2 = ||A||_2\]

Definition 1.8.6  The Frobenius norm denoted $||A||_F$ of a matrix $A\in\mathbb{C}^{m\times n}$ is the real number $||A||_F = \sqrt{\text{Tr}(A^*A)} = \sqrt{\sum_{i=1}^m{\sum_{j=1}^n{|A_{i,j}|^2}}}$.

Proposition 1.8.7  Matrix multiplication by unitary matrices preserves Frobenius norms. For any $A\in\mathbb{C}^{m\times n}$ and unitary $U\in\mathbb{C}^{m\times m}$, $V\in\mathbb{C}^{n\times n}$,
\[||UA||_F = ||AV||_F = ||A||_F\]

Corollary 1.8.8  For $A\in\mathbb{C}^{m\times n}$ with singular values $\sigma_1,\dots,\sigma_r$, $||A||_F = \sqrt{\sigma_1^2+\dots+\sigma_r^2}$

1.9 Orthogonal Projectors

Definition 1.9.1  The projection denoted $\text{Proj}_W(\mathbf{v})\in W$ of a vector $\mathbf{v}\in\mathbb{C}^n$ onto a subspace $W\subset \mathbb{C}^n$ is defined for an orthonormal basis $\{\mathbf{u}_1,\dots,\mathbf{u}_r\}$ of $W$ by
\[\text{Proj}_W(\mathbf{v}) = \langle \mathbf{u}_1,\mathbf{v} \rangle \mathbf{u}_1 + \dots + \langle \mathbf{u}_r,\mathbf{v} \rangle \mathbf{u}_r \]

Proposition 1.9.2  The projections $\text{Proj}_{W}(v),\text{Proj}_{W^\perp}(v)$ of $\mathbf{v}\in \mathbb{C}^n$ onto $W,W^\perp\subset\mathbb{C}^n$ are orthogonal.

Definition 1.9.3  An orthogonal projector matrix is a matrix $P\in\mathbb{C}^{n\times n}$ such that $P^2 = P^* = P$.

Proposition 1.9.4  If $P$ is an orthogonal projector, then $I-P$ is also an orthogonal projector.

Proposition 1.9.5  The product $\mathbf{u}\mathbf{u}^*$ for $\mathbf{u}\in\mathbb{C}^n$ such that $||\mathbf{u}||_2=1$ is an orthogonal projector that projects onto the span of $\mathbf{u}$.

Proposition 1.9.6  If $W\subset \mathbb{C}^n$ is a subspace with an orthonormal basis $\{\mathbf{u}_1,\dots,\mathbf{u}_k\}$, then the matrix $P=\mathbf{u}_1\mathbf{u}_1^*+\dots+\mathbf{u}_k\mathbf{u}_k^*$ is the orthogonal projector $\text{Proj}_W$ and $\text{Ran}(P)=W$.
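
A small NumPy sketch of this proposition (the subspace and vectors are arbitrary examples of our own): build $P$ as a sum of rank-one projectors $\mathbf{u}_k\mathbf{u}_k^*$ and check that it is idempotent, Hermitian, and fixes vectors of $W$:

import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))
Q, _ = np.linalg.qr(M)                 # columns: an orthonormal basis of a 2-dim subspace W

P = sum(np.outer(Q[:, k], Q[:, k].conj()) for k in range(2))   # u_1 u_1^* + u_2 u_2^*
assert np.allclose(P @ P, P)           # idempotent
assert np.allclose(P.conj().T, P)      # Hermitian
assert np.allclose(P @ Q[:, 0], Q[:, 0])   # fixes vectors already in W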

Proposition 1.9.7  If $P\mathbf{v}=\text{Proj}_W(\mathbf{v})\ \forall \mathbf{v}\in\mathbb{C}^n$, then $(I-P)\mathbf{v} = \text{Proj}_{W^\perp}(\mathbf{v})$.

2 Matrix Decomposition

2.1 QR Decomposition

Definition 2.1.1  The full QR decomposition of a matrix $A\in\mathbb{C}^{m\times n}$ is the unitary matrix $Q\in\mathbb{C}^{m\times m}$ and the upper triangular matrix $R\in\mathbb{C}^{m\times n}$ such that
\[A = QR\]

Definition 2.1.2  The reduced QR decomposition of a matrix $A\in\mathbb{C}^{m\times n}$ is the matrix $Q\in\mathbb{C}^{m\times n}$ with orthonormal columns and the upper triangular matrix $R\in\mathbb{C}^{n\times n}$ such that
\[A = QR\]

Theorem 2.1.3  For any $A\in\mathbb{C}^{m\times n}$ with $\text{Rank}(A)=n$, there exists a unique reduced QR decomposition with $r_{i,i}>0$ for $i\in\{1,\dots,n\}$. Let $\{a_1,\dots,a_n\}$ be the linearly independent columns of $A$ and let $\{q_1,\dots,q_n\}$ be an orthonormal set such that $\text{span}\{q_1,\dots,q_k\} = \text{span}\{a_1,\dots,a_k\}$ for all $k\in\{1,\dots,n\}$. Then the matrix $R$ with elements,
\[r_{i,j} = \begin{cases}\langle q_i, a_j \rangle & i \leq j \\ 0 & i>j\end{cases}\]
and the matrix $Q$ with columns $\{q_1,\dots,q_n\}$ is the unique QR decomposition of $A$.

Proposition 2.1.4  For any orthonormal set of vectors $\{q_1,\dots,q_n\}$
\[\text{Proj}_{\text{span}\{q_1,\dots,q_n\}^\perp} = \text{Proj}_{\text{span}\{q_1\}^\perp}\circ\text{Proj}_{\text{span}\{q_2\}^\perp}\circ\dots\circ\text{Proj}_{\text{span}\{q_n\}^\perp}\]

Algorithm 2.1.5  The classical Gram-Schmidt algorithm can be used to calculate the $QR$ decomposition of a matrix $A$.
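
A minimal NumPy sketch of classical Gram-Schmidt (the function name and test matrix are our own; it assumes full column rank and has no safeguards against breakdown):

import numpy as np

def classical_gram_schmidt_qr(A):
    # Reduced QR of A (assumed full column rank) by classical Gram-Schmidt.
    m, n = A.shape
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for j in range(n):
        v = A[:, j].astype(complex)
        for i in range(j):
            R[i, j] = np.vdot(Q[:, i], A[:, j])    # r_{i,j} = <q_i, a_j>
            v -= R[i, j] * Q[:, i]
        R[j, j] = np.linalg.norm(v)
        Q[:, j] = v / R[j, j]
    return Q, R

rng = np.random.default_rng(3)
A = rng.standard_normal((6, 4))
Q, R = classical_gram_schmidt_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.conj().T @ Q, np.eye(4))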

Algorithm 2.1.6  The modified Gram-Schmidt algorithm can be used to calculate an orthonormal basis $\{q_1,\dots,q_n\}$ from a set of vectors $\{a_1,\dots,a_n\}$, as well as the unnormalized vectors $\{v_1,\dots,v_n\}$. The algorithm reorders the projection steps of classical Gram-Schmidt to reduce the accumulation of rounding error.
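
A corresponding sketch of modified Gram-Schmidt, under the same full-column-rank assumption; the only change from the classical version is that each new direction $q_i$ is removed from all remaining vectors immediately:

import numpy as np

def modified_gram_schmidt(A):
    # Orthonormal q_1..q_n and R from the columns of A (assumed full column rank).
    m, n = A.shape
    V = A.astype(complex)                  # working (unnormalized) vectors v_1..v_n
    Q = np.zeros((m, n), dtype=complex)
    R = np.zeros((n, n), dtype=complex)
    for i in range(n):
        R[i, i] = np.linalg.norm(V[:, i])
        Q[:, i] = V[:, i] / R[i, i]
        for j in range(i + 1, n):          # remove the q_i component from all later vectors now
            R[i, j] = np.vdot(Q[:, i], V[:, j])
            V[:, j] -= R[i, j] * Q[:, i]
    return Q, R

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 4))
Q, R = modified_gram_schmidt(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.conj().T @ Q, np.eye(4))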

2.2 Householder QR Decomposition

Definition 2.2.1  The Householder QR decomposition of a matrix $A\in\mathbb{C}^{m\times n}$ is a set of unitary matrices $\{Q_n,\dots,Q_1\}\subset\mathbb{C}^{m\times m}$ such that the matrix $R\in\mathbb{C}^{m\times n}$ defined below is upper triangular.
\[R = Q_n\dots Q_1A\]

Definition 2.2.2  The Householder reflector $H_v\in\mathbb{C}^{n\times n}$ for a unit vector $v\in\mathbb{C}^n$ is the following matrix.
\[H_v = I-2vv^*\]

Proposition 2.2.3  For $x,y\in\mathbb{R}^n$ with $||x||=||y||$, and $x\neq y$, the Householder reflector $H_v$ for $v=\frac{x-y}{||x-y||}$ maps $x$ to $y$.
\[H_vx=y\]

Proposition 2.2.4  Each $A\in\mathbb{R}^{m\times n}$ has a QR decomposition.

Algorithm 2.2.5  The Householder QR factorization algorithm can be used to calculate the $QR$ decomposition for a matrix $A=QR$. The following algorithm computes $R$ by leaving the result in place of $A$ and then computes the columns of $Q$ by applying the Householder transformation for each of the standard basis vectors.
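
A hedged NumPy sketch of Householder QR (real matrices only; names and the test matrix are our own choices). Unlike the description above, it accumulates $Q$ directly by applying each reflector to the current $Q$ rather than to the standard basis vectors:

import numpy as np

def householder_qr(A):
    # Full QR of a real matrix A via Householder reflectors.
    A = A.astype(float)
    m, n = A.shape
    R = A.copy()
    Q = np.eye(m)
    for k in range(min(m, n)):
        x = R[k:, k]
        e1 = np.zeros_like(x)
        e1[0] = 1.0
        # Reflect x onto -sign(x_0)*||x|| e_1 to avoid cancellation.
        v = x + np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x) * e1
        if np.linalg.norm(v) == 0:
            continue
        v = v / np.linalg.norm(v)
        # Apply H_v = I - 2 v v^T to the trailing block of R and accumulate it into Q.
        R[k:, k:] -= 2.0 * np.outer(v, v @ R[k:, k:])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)
    return Q, R

rng = np.random.default_rng(5)
A = rng.standard_normal((6, 4))
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(6))
assert np.allclose(np.tril(R, -1), 0)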

2.3 LU Decomposition

Definition 2.3.1  The LU decomposition of a square matrix $A\in\mathbb{C}^{n\times n}$ is the lower triangular matrix $L\in\mathbb{C}^{n\times n}$ and the upper triangular matrix $U\in\mathbb{C}^{n\times n}$ such that
\[A=LU\]

Proposition 2.3.2  Gaussian elimination can be used to construct the matrices $L\in\mathbb{C}^{n\times n}$ and $U\in\mathbb{C}^{n\times n}$ in terms of a set of lower triangular matrices $\{L_1^{-1},\dots,L_{n-1}^{-1}\}\subset\mathbb{C}^{n\times n}$ such that
\[A = LU = \left(L_1^{-1}L_2^{-1}\cdots L_{n-1}^{-1}\right)U \quad\text{where}\quad U = L_{n-1}\cdots L_2L_1A\]

Proposition 2.3.3  Let $L_k\in\mathbb{C}^{n\times n}$ be the row operation matrix defined by
\[L_{k} = \begin{pmatrix}
1&\ &\ &\ &\ &\ \\
\ &\ddots&\ &\ &\ &\ \\
\ &\ &1&\ &\ &\ \\
\ &\ &-\ell_{k+1,k}&\ddots &\ &\ \\
\ &\ &\vdots&\ &\ddots &\ \\
\ &\ &-\ell_{n,k}&\ &\ &1
\end{pmatrix}\]
where $\ell_{j,k} = \frac{a_{j,k}}{a_{k,k}}$ for $j = k+1,\dots,n$. The inverse $L_k^{-1}$ is the matrix
\[L_{k}^{-1} = \begin{pmatrix}
1&\ &\ &\ &\ &\ \\
\ &\ddots&\ &\ &\ &\ \\
\ &\ &1&\ &\ &\ \\
\ &\ &\ell_{k+1,k}&\ddots &\ &\ \\
\ &\ &\vdots&\ &\ddots &\ \\
\ &\ &\ell_{n,k}&\ &\ &1
\end{pmatrix}\]

Algorithm 2.3.4  The LU decomposition algorithm without pivoting can be used to compute the LU decomposition of a matrix $A$ by using Gaussian elimination.
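
A minimal sketch of LU by Gaussian elimination without pivoting (assumes every pivot encountered is nonzero; the example matrix is arbitrary, shifted to keep the pivots well away from zero):

import numpy as np

def lu_no_pivot(A):
    # LU decomposition by Gaussian elimination without pivoting.
    A = A.astype(float)
    n = A.shape[0]
    L = np.eye(n)
    U = A.copy()
    for k in range(n - 1):
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]       # multiplier ell_{j,k}
            U[j, k:] -= L[j, k] * U[k, k:]    # eliminate entry (j, k)
    return L, U

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)
L, U = lu_no_pivot(A)
assert np.allclose(L @ U, A)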

Algorithm 2.3.5  The LU decomposition algorithm with partial pivoting can be used to compute a decomposition $PA=LU$, where $P$ is a permutation matrix, by using Gaussian elimination with row interchanges; this is more numerically stable than elimination without pivoting.
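
A corresponding sketch with partial pivoting, returning $P$, $L$, $U$ with $PA = LU$ (function name and example matrix are our own):

import numpy as np

def lu_partial_pivot(A):
    # PA = LU by Gaussian elimination with partial (row) pivoting.
    A = A.astype(float)
    n = A.shape[0]
    U = A.copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        # Swap the row with the largest |pivot| into position k.
        p = k + np.argmax(np.abs(U[k:, k]))
        U[[k, p], k:] = U[[p, k], k:]
        L[[k, p], :k] = L[[p, k], :k]
        P[[k, p], :] = P[[p, k], :]
        for j in range(k + 1, n):
            L[j, k] = U[j, k] / U[k, k]
            U[j, k:] -= L[j, k] * U[k, k:]
    return P, L, U

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 5))
P, L, U = lu_partial_pivot(A)
assert np.allclose(P @ A, L @ U)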

2.4 Cholesky Decomposition

Definition 2.4.1  A positive definite matrix $A\in\mathbb{C}^{n\times n}$ is a Hermitian matrix such that $x^*Ax\geq 0$ for all $x\in\mathbb{C}^n$, with equality only when $x=0$.

Proposition 2.4.2  If a matrix $A\in\mathbb{C}^{n\times n}$ is positive definite, then the block diagonal sub-matrices are also positive definite.

Definition 2.4.3  The Cholesky decomposition of a positive definite matrix $A\in\mathbb{C}^{n\times n}$ is an upper triangular matrix $R\in\mathbb{C}^{n\times n}$ with positive diagonal elements such that
\[ A = R^*R\]

Theorem 2.4.4  Any positive definite matrix has a Cholesky decomposition.

Algorithm 2.4.5  The Cholesky decomposition algorithm can be used to compute the Cholesky decomposition of a matrix $A\in\mathbb{C}^{n\times n}$.
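
A minimal sketch of the Cholesky algorithm, computing the upper triangular factor $R$ with $A = R^*R$ row by row (no checks that $A$ is actually positive definite; the example matrix is our own, built to be Hermitian positive definite):

import numpy as np

def cholesky_upper(A):
    # Upper triangular R with positive diagonal such that A = R* R.
    A = A.astype(complex)
    n = A.shape[0]
    R = np.zeros((n, n), dtype=complex)
    for i in range(n):
        # Diagonal entry: what remains of a_ii after subtracting earlier rows.
        R[i, i] = np.sqrt((A[i, i] - np.vdot(R[:i, i], R[:i, i])).real)
        for j in range(i + 1, n):
            R[i, j] = (A[i, j] - np.vdot(R[:i, i], R[:i, j])) / R[i, i]
    return R

rng = np.random.default_rng(8)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B.conj().T @ B + np.eye(4)        # Hermitian positive definite by construction
R = cholesky_upper(A)
assert np.allclose(R.conj().T @ R, A)
assert np.allclose(np.tril(R, -1), 0)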

3 Eigenvalue Problems

3.1 Rayleigh Quotient and Inverse Iteration

Proposition 3.1.1  Let $|\lambda_1| > |\lambda_2| > \dots > |\lambda_n|$ be the eigenvalues of a real matrix $A\in\mathbb{R}^{n\times n}$ with corresponding eigenbasis $q_1,q_2,\dots,q_n\in\mathbb{R}^n$. If $v^{(0)}\in\mathbb{R}^n$ is some vector such that $\langle q_1, v^{(0)}\rangle \neq 0$, then
\[v^{(k)} = \frac{A^kv^{(0)}}{||A^kv^{(0)}||} \to \pm q_1\quad \text{ as } k\to \infty\]

Definition 3.1.2  The Rayleigh quotient of a symmetric matrix $A\in\mathbb{R}^{n\times n}$ is the function $r_A:\mathbb{R}^{n} - \{0\} \to \mathbb{R}$ defined by $r_A(x) = \frac{x^T Ax}{||x||^2}$.

Proposition 3.1.3  A nonzero vector $x\in\mathbb{R}^n$ is an eigenvector of a symmetric matrix $A\in\mathbb{R}^{n\times n}$ if and only if $x$ is a critical point of $r_A(x)$.

Algorithm 3.1.4  The power iteration algorithm can be used to approximate the eigenvalue of largest absolute value and a corresponding eigenvector of a matrix $A$.
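
A minimal power-iteration sketch (the symmetric example matrix and the fixed iteration count are our own choices; a real implementation would use a convergence test):

import numpy as np

def power_iteration(A, num_iters=200, seed=0):
    # Approximate the dominant eigenpair of a symmetric A.
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)
    lam = v @ A @ v                     # Rayleigh quotient estimate (||v|| = 1)
    return lam, v

A = np.array([[2.0, 1.0], [1.0, 3.0]])  # symmetric example
lam, v = power_iteration(A)
assert np.allclose(A @ v, lam * v, atol=1e-8)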

Algorithm 3.1.5  The inverse iteration algorithm can be used to approximate the eigenvalue of $A$ closest to a chosen shift $\mu$ and a corresponding eigenvector, by applying power iteration to $(A-\mu I)^{-1}$.
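
A corresponding inverse-iteration sketch: each step solves a linear system with $A-\mu I$ instead of applying $A$ (the shift, example matrix, and iteration count are our own choices):

import numpy as np

def inverse_iteration(A, mu, num_iters=50, seed=0):
    # Approximate the eigenpair of A whose eigenvalue is closest to the shift mu.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        w = np.linalg.solve(A - mu * np.eye(n), v)   # one linear solve per step
        v = w / np.linalg.norm(w)
    lam = v @ A @ v
    return lam, v

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, v = inverse_iteration(A, mu=1.0)                # converges to the eigenvalue nearest 1
assert np.allclose(A @ v, lam * v, atol=1e-8)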

Algorithm 3.1.6  The Rayleigh quotient iteration algorithm can be used to approximate an eigenvalue and a corresponding eigenvector of a symmetric matrix $A$, updating the shift with the Rayleigh quotient at every step.
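
A Rayleigh-quotient-iteration sketch for a symmetric matrix (example matrix, seed, and iteration cap are our own); the shift is refreshed with the current Rayleigh quotient each step, which gives very fast convergence near an eigenvector:

import numpy as np

def rayleigh_quotient_iteration(A, num_iters=20, seed=1):
    # Approximate one eigenpair of a symmetric A with a Rayleigh-quotient shift.
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    lam = v @ A @ v
    for _ in range(num_iters):
        try:
            w = np.linalg.solve(A - lam * np.eye(n), v)
        except np.linalg.LinAlgError:    # shift hit an eigenvalue exactly: done
            break
        v = w / np.linalg.norm(w)
        lam = v @ A @ v
    return lam, v

A = np.array([[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]])
lam, v = rayleigh_quotient_iteration(A)
assert np.allclose(A @ v, lam * v, atol=1e-8)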

3.2 Schur Decomposition

Definition 3.2.1  A matrix $A\in\mathbb{C}^{n\times n}$ is similar to a matrix $B\in\mathbb{C}^{n\times n}$ iff there exists a non-singular matrix $S\in\mathbb{C}^{n\times n}$ such that $A = S^{-1}BS$.

Definition 3.2.2  The similarity transformation for a non-singular matrix $S\in\mathbb{C}^{n\times n}$ is the mapping $\text{Sim}_S : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$ defined by
\[\text{Sim}_S(A) = S^{-1}AS\]

Definition 3.2.3  A matrix $A\in\mathbb{C}^{n\times n}$ is diagonalizable iff there exists a non-singular $S\in\mathbb{C}^{n\times n}$ such that $S^{-1}AS$ is a diagonal matrix.

Definition 3.2.4  A matrix $A\in\mathbb{C}^{n\times n}$ is unitarily diagonalizable iff there exists a unitary $U\in\mathbb{C}^{n\times n}$ such that $U^* AU$ is a diagonal matrix.

Definition 3.2.5  A matrix $A\in\mathbb{C}^{n\times n}$ is unitarily triangularizable iff there exists a unitary $U\in\mathbb{C}^{n\times n}$ such that $U^* AU$ is an upper triangular matrix.

Definition 3.2.6  The Schur decomposition of $A\in\mathbb{C}^{n\times n}$ is a unitary matrix $Q\in\mathbb{C}^{n\times n}$ and an upper triangular matrix $T\in\mathbb{C}^{n\times n}$ such that
\[A = QTQ^*\]

Proposition 3.2.7  A matrix $A\in\mathbb{C}^{n\times n}$ is diagonalizable iff $A$ has $n$ eigenvectors that form a basis of $\mathbb{C}^n$.

Proposition 3.2.8  A matrix $A\in\mathbb{C}^{n\times n}$ is unitarily diagonalizable iff $A^*A = AA^*$.

Proposition 3.2.9  Any matrix $A\in\mathbb{C}^{n\times n}$ has a Schur decomposition $A = QTQ^*$.

Proposition 3.2.10  For $A\in\mathbb{C}^{n\times n}$ let $\{A_k\}$ be the sequence of matrices defined by $A_0= A$, $A_{k-1} = Q_kR_k$, and $A_k = R_k Q_k$, then

  1. $A_k$ has the same eigenvalues as $A$.
  2. $A^{k} = (Q_1\dots Q_k)(R_k\dots R_1)$.

Theorem 3.2.11  If $A\in\mathbb{C}^{n\times n}$ is a matrix with eigenvalues $\lambda_1,\dots,\lambda_n$ such that $|\lambda_1| > \dots > |\lambda_n| > 0$, then the sequence of matrices $\{T_k\}$ defined by $T_0= A$, $T_{k-1} = Q_kR_k$, and $T_k = R_k Q_k$ converges to $T$ and $Q_1Q_2\dots$ converges to $Q$ in the Schur decomposition $A = QTQ^*$.

Algorithm 3.2.12  The pure QR Schur decomposition algorithm can be used to iteratively compute the Schur decomposition of a matrix $A=QTQ^*$.
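
A minimal sketch of the pure QR iteration, using numpy.linalg.qr for each factorization and accumulating $Q_1Q_2\cdots$ (fixed iteration count, symmetric example matrix of our own, and no deflation):

import numpy as np

def pure_qr(A, num_iters=500):
    # A_0 = A, factor A_{k-1} = Q_k R_k, set A_k = R_k Q_k, accumulate Q = Q_1 Q_2 ...
    T = A.astype(float).copy()
    Q_total = np.eye(A.shape[0])
    for _ in range(num_iters):
        Q, R = np.linalg.qr(T)
        T = R @ Q
        Q_total = Q_total @ Q
    return Q_total, T

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 1.0]])  # symmetric example
Q, T = pure_qr(A)
assert np.allclose(Q @ T @ Q.T, A)
assert np.allclose(np.tril(T, -1), 0, atol=1e-6)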

Definition 3.2.13  A matrix $H\in\mathbb{C}^{n\times n}$ is upper Hessenberg iff $H_{i,j}=0$ for $i>j+1$.

Proposition 3.2.14  All square matrices are unitarily similar to an upper Hessenberg matrix.

Algorithm 3.2.15  The practical QR Schur decomposition algorithm can be used to iteratively compute the Schur decomposition of a matrix $A=QTQ^*$ more efficiently.
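
A hedged sketch of the two ingredients of the practical algorithm: Householder reduction to upper Hessenberg form, then single-shift QR steps with deflation. Function names, the corner-entry shift, tolerances, and the example matrix are our own choices, and this sketch only targets real matrices with real eigenvalues:

import numpy as np

def hessenberg_reduce(A):
    # Reduce a real matrix to upper Hessenberg form H = Q^T A Q via Householder reflectors.
    H = A.astype(float).copy()
    n = H.shape[0]
    Q = np.eye(n)
    for k in range(n - 2):
        x = H[k + 1:, k]
        v = x.copy()
        v[0] += np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        if np.linalg.norm(v) == 0:
            continue
        v /= np.linalg.norm(v)
        H[k + 1:, k:] -= 2.0 * np.outer(v, v @ H[k + 1:, k:])   # left multiply by reflector
        H[:, k + 1:] -= 2.0 * np.outer(H[:, k + 1:] @ v, v)     # right multiply (similarity)
        Q[:, k + 1:] -= 2.0 * np.outer(Q[:, k + 1:] @ v, v)
    return Q, H

def shifted_qr_eigenvalues(A, tol=1e-12, max_iters=1000):
    # Hessenberg reduction, then shifted QR steps with deflation from the bottom.
    _, H = hessenberg_reduce(A)
    eigs = []
    m = H.shape[0]
    for _ in range(max_iters):
        if m == 1:
            eigs.append(H[0, 0])
            break
        if abs(H[m - 1, m - 2]) < tol:        # deflate: bottom eigenvalue has converged
            eigs.append(H[m - 1, m - 1])
            m -= 1
            continue
        mu = H[m - 1, m - 1]                  # corner (Rayleigh) shift
        Q, R = np.linalg.qr(H[:m, :m] - mu * np.eye(m))
        H[:m, :m] = R @ Q + mu * np.eye(m)
    return np.sort(np.array(eigs))

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 1.0]])
assert np.allclose(shifted_qr_eigenvalues(A), np.sort(np.linalg.eigvalsh(A)), atol=1e-8)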

3.3 Arnoldi and Lanczos Iteration

Definition 3.3.1  The Krylov subspace of order $r$ generated by $A\in\mathbb{C}^{n\times n}$ and $b\in\mathbb{C}^n$ is the subspace $\mathcal{K}_r(A,b)$ defined by
\[\mathcal{K}_r(A,b) = \text{span}\{b,Ab,A^2b,\dots,A^{r-1}b\}\]

Definition 3.3.2  The Krylov matrix of order $r$ generated by $A\in\mathbb{C}^{n\times n}$ and $b\in\mathbb{C}^n$ is the matrix $K_r(A,b)$ defined by
\[K_r(A,b) = \text{matrix with columns }(b,Ab,A^2b,\dots,A^{r-1}b)\]

Definition 3.3.3  A monic polynomial of degree $d$ is a function $p:\mathbb{C}\to\mathbb{C}$ of the form
\[p(t) = t^d + c_{d-1}t^{d-1} + \dots + c_{1}t + c_0\]

Definition 3.3.4  The minimal polynomial of $A$ with respect to $b$ is the non-zero monic polynomial $p$ of lowest degree such that $p(A)b=0$.

Proposition 3.3.5  Let $m$ be the degree of the minimal polynomial of $A$ with respect to $b$. Then $\dim\mathcal{K}_r(A,b) = \min(r,m)$, and $\mathcal{K}_m(A,b)$ is invariant under $A$.

Algorithm 3.3.6  The Arnoldi iteration algorithm can be used to build an orthonormal basis $\{q_1,\dots,q_r\}$ of the Krylov subspace $\mathcal{K}_r(A,b)$ for a matrix $A\in\mathbb{C}^{n\times n}$, together with a small upper Hessenberg matrix $H$ whose eigenvalues (the Ritz values) approximate eigenvalues of $A$.
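
A minimal Arnoldi sketch (function name, sizes, and random example are our own); it returns the orthonormal Krylov basis $Q$ and the $(r+1)\times r$ Hessenberg matrix $H$ satisfying $AQ_r = Q_{r+1}H$:

import numpy as np

def arnoldi(A, b, r):
    # Orthonormal basis of K_r(A, b) and the associated upper Hessenberg matrix.
    n = A.shape[0]
    Q = np.zeros((n, r + 1), dtype=complex)
    H = np.zeros((r + 1, r), dtype=complex)
    Q[:, 0] = b / np.linalg.norm(b)
    for k in range(r):
        w = A @ Q[:, k]
        for i in range(k + 1):                     # orthogonalize against previous q_i
            H[i, k] = np.vdot(Q[:, i], w)
            w = w - H[i, k] * Q[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] == 0:                       # Krylov subspace became invariant: stop early
            return Q[:, :k + 1], H[:k + 1, :k]
        Q[:, k + 1] = w / H[k + 1, k]
    return Q, H

rng = np.random.default_rng(9)
A = rng.standard_normal((20, 20))
b = rng.standard_normal(20)
Q, H = arnoldi(A, b, r=8)
assert np.allclose(A @ Q[:, :-1], Q @ H)           # Arnoldi relation A Q_r = Q_{r+1} H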

Algorithm 3.3.7  The Lanczos iteration algorithm is the specialization of Arnoldi iteration to Hermitian $A\in\mathbb{C}^{n\times n}$: it builds an orthonormal Krylov basis with a three-term recurrence, producing a tridiagonal matrix whose eigenvalues approximate eigenvalues of $A$.
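
A corresponding Lanczos sketch for a symmetric example (our own choices throughout); note that in floating point the basis slowly loses orthogonality, which this plain version does not correct:

import numpy as np

def lanczos(A, b, r):
    # Three-term recurrence: orthonormal Krylov basis Q and tridiagonal T = Q* A Q.
    n = A.shape[0]
    Q = np.zeros((n, r), dtype=complex)
    alpha = np.zeros(r)                        # diagonal of T
    beta = np.zeros(r - 1)                     # sub/super-diagonal of T
    q = b / np.linalg.norm(b)
    q_prev = np.zeros(n, dtype=complex)
    Q[:, 0] = q
    for k in range(r):
        w = A @ q
        alpha[k] = np.vdot(q, w).real
        w = w - alpha[k] * q - (beta[k - 1] * q_prev if k > 0 else 0)
        if k + 1 < r:
            beta[k] = np.linalg.norm(w)
            q_prev, q = q, w / beta[k]
            Q[:, k + 1] = q
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

rng = np.random.default_rng(10)
B = rng.standard_normal((30, 30))
A = B + B.T                                    # symmetric example
b = rng.standard_normal(30)
Q, T = lanczos(A, b, r=8)
assert np.allclose(Q.conj().T @ Q, np.eye(8), atol=1e-6)
assert np.allclose(Q.conj().T @ A @ Q, T, atol=1e-6)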
