layout: true

---
class: inverse, center, middle
background-image: url(../figs/titlepage16-9.png)
background-size: cover

<br>
<br>

# Bayesian Statistics and Computing

## Lecture 16: Numerical Algorithms for Eigenanalysis

<img src="../figs/slides.png" width="150px"/>

#### *Yanfei Kang | BSC | Beihang University*

---
# Finding eigenvalues and eigenvectors

- So far, we have only considered the characteristic polynomial approach to finding the eigenvalues of a matrix.
- Once we have the eigenvalues, we solve the homogeneous equation to find the corresponding eigenvectors.
- The process is: find the eigenvalues first, then find the corresponding eigenvectors.

---
# Practicalities

- Unfortunately, this is an impractical approach for `\(n > 4\)`: the characteristic polynomial then has degree five or more, and no general closed-form expression for its roots exists.
- We will bypass the characteristic polynomial and take a different approach:
  - the Power Method;
  - QR decomposition.

---
class: inverse, center, middle

# The Power Method

---
# The Power Method

- The Power Method finds the dominant eigenvalue `\(\lambda_1\)` and the corresponding dominant eigenvector `\(v_1\)` of a matrix `\(A\)`.
- The dominant eigenvalue is the one with the largest modulus (absolute value, for real eigenvalues).
- The Power Method is an iterative approach: it generates a sequence of scalars that converges to `\(\lambda_1\)` and a sequence of vectors that converges to `\(v_1\)`.
- The Power Method works well (converges quickly) when the dominant eigenvalue is clearly dominant, i.e. when `\(|\lambda_1|\)` is well separated from `\(|\lambda_2|\)`.

---
# The Power Method

- It works by starting with an initial vector `\(x_0\)`, transforming it to `\(x_1 = Ax_0\)`, transforming `\(x_1\)` to `\(x_2 = Ax_1\)`, and so on:
`$$x_1 = Ax_0;~x_2 = Ax_1 = A^2x_0;~\cdots;~x_k = A^kx_0.$$`
- As `\(k \rightarrow \infty\)`, the direction of `\(x_k\)` converges to that of `\(v_1\)`.

---
# Estimation of eigenvalues

Once we have an eigenvector estimate, we can quickly estimate the corresponding eigenvalue:

`$$Ax = \lambda x \Rightarrow x^T Ax = \lambda x^Tx \Rightarrow \lambda = \frac{x^TAx}{x^Tx} = \frac{x^TAx}{||x||^2} = q^TAq,$$`

where `\(q = x/||x||\)` is the normalised eigenvector estimate.

---
# Practical Power Method (normalised)

Since the components of `\(x_k\)` just get larger and larger (or smaller and smaller) as the Power Method iterates, and we really only want the direction of `\(v_1\)`, not its magnitude, we normalise each `\(x_k\)`. The Power Method can then be summarised as:

- Set the initial vector `\(q_0 = x_0/||x_0||\)`.
- Repeat for `\(k = 1, 2, \cdots\)`:
  - compute `\(x_k = Aq_{k-1}\)`;
  - normalise `\(q_k = x_k/||x_k||\)`;
  - estimate `\(\lambda_k = q_k^TAq_k\)`.

---
# Comments

- The Power Method is not guaranteed to converge if the matrix `\(A\)` is not diagonalisable.
- The convergence rate depends on how dominant `\(\lambda_1\)` is: the error shrinks roughly like `\(|\lambda_2/\lambda_1|^k\)`.
- Google uses it to calculate PageRank, and Twitter uses it to recommend who to follow.
- And what about non-dominant eigenvalues/eigenvectors?
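---
# The Power Method in R

A minimal sketch of the algorithm above; the function name `power_method`, the test matrix, and the stopping rule are our own choices, not fixed by the method:

``` r
# Normalised Power Method: dominant eigenvalue/eigenvector of A
power_method <- function(A, x0, tol = 1e-10, maxit = 1000) {
  q <- x0 / sqrt(sum(x0^2))                 # q0 = x0 / ||x0||
  lambda <- Inf
  for (k in 1:maxit) {
    x <- A %*% q                            # x_k = A q_{k-1}
    q <- x / sqrt(sum(x^2))                 # q_k = x_k / ||x_k||
    lambda_new <- drop(t(q) %*% A %*% q)    # lambda_k = q_k' A q_k
    if (abs(lambda_new - lambda) < tol) break
    lambda <- lambda_new
  }
  list(value = lambda_new, vector = drop(q))
}

A <- matrix(c(2, 1, 1, 3), 2, 2)
power_method(A, x0 = c(1, 1))$value  # about 3.618; compare eigen(A)$values[1]
```

For the lab below, recall that the PageRank vector is exactly the dominant eigenvector of the Google matrix.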
---
# Lab session

Now use the Power Method to redo your Google PageRank problem.

---
class: inverse, center, middle

# QR decomposition

---
# Eigenvalue-revealing decomposition

- It would be nice if we could bring our matrix `\(A\)` into an **eigenvalue-revealing decomposition** such as the Schur decomposition `\(A = QSQ^T\)`.
- Then we could read off the eigenvalues (all of them) from the diagonal of `\(S\)`.
- We will do this iteratively using the QR decomposition `\(A = QR\)`.
- The QR decomposition is not itself an eigenvalue-revealing decomposition, but it will help us towards our aim.
- The QR decomposition can be computed with the Gram-Schmidt orthogonalisation (GSO) algorithm.

---
# GSO algorithm

Any set of basis vectors `\((a_1, a_2, \cdots, a_n)\)` can be transformed into an orthonormal basis `\((q_1, q_2, \cdots, q_n)\)` by:

<center>
<img src="./figs/gso.png" height="400px" />
</center>

---
# GSO `\(\rightarrow\)` QR

<center>
<img src="./figs/QR.png" height="200px" />
</center>

---
# QR

For any `\(m \times n\)` matrix `\(A\)`, we can express `\(A\)` as `\(A = QR\)`, where

- `\(Q\)` is `\(m \times m\)` and orthogonal;
- `\(R\)` is `\(m \times n\)` and upper triangular.

For (non-square) tall matrices `\(A\)` with `\(m > n\)`, the last `\(m - n\)` rows of `\(R\)` are all zero, so we can express `\(A\)` in the economy form

`$$A = QR = (Q_1, Q_2) \left(\begin{array}{c} R_1 \\ \mathbf{0} \end{array}\right) = Q_1R_1.$$`

---
# Solution to `\(Ax = b\)` by QR

If we have `\(A = QR\)` or (even better) the economy form `\(A = Q_1R_1\)`, then the linear system `\(Ax = b\)` is easily solved. Multiplying both sides by `\(Q_1^T\)` and using `\(Q_1^TQ_1 = I\)`:

`$$Ax = b$$`
`$$(Q_1R_1)x = b$$`
`$$R_1x = Q_1^Tb,$$`

and `\(x\)` is found through back substitution, since `\(R_1\)` is upper triangular.

---
# Example

``` r
set.seed(2023-09-08)
n <- 5000; p <- 100
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
```

---
# Collinearity

Appending a column that is numerically identical to the first makes the normal equations unsolvable:

``` r
W <- cbind(X, X[, 1] + rnorm(n, sd = 1e-14))
solve(crossprod(W), crossprod(W, y))
#> Error in solve.default(crossprod(W), crossprod(W, y)): system is computationally singular: reciprocal condition number = 1.9488e-16
```

---
# QR decomposition

``` r
qr.coef(qr(W), y)
#> [1] 0.0207049604 -0.0107798798 -0.0053346446 -0.0139042768 -0.0165625816 -0.0247368954 -0.0055936520 -0.0050670925
#> [9] -0.0141895452 0.0014650371 0.0105871852 0.0236855005 -0.0009123881 0.0395699043 -0.0045459668 0.0156426008
#> [17] 0.0368986437 -0.0062241571 0.0003719454 -0.0001045447 0.0105437196 0.0032069072 -0.0099112962 0.0092091722
#> [25] -0.0056283760 -0.0101836666 0.0004409933 0.0079016686 0.0101626882 -0.0208409039 -0.0054924536 0.0064271925
#> [33] -0.0065590754 -0.0037863800 0.0072126283 -0.0013686652 -0.0261588414 -0.0045260288 -0.0204600332 -0.0108791075
#> [41] 0.0100509988 0.0021846634 0.0096076628 -0.0256744385 -0.0002072185 -0.0203092434 0.0199490920 0.0245456994
#> [49] 0.0141368132 -0.0075821183 -0.0331999816 0.0070728928 -0.0251681102 -0.0145501245 0.0124959304 0.0101724169
#> [57] -0.0041669575 0.0118368027 0.0058702645 0.0017322817 0.0008573064 -0.0068535688 -0.0101274630 -0.0093643459
#> [65] 0.0045194364 -0.0224489607 0.0175468581 0.0135508696 0.0048772551 0.0168445216 -0.0254587733 -0.0012700923
#> [73] -0.0106302388 -0.0017762102 0.0077490915 0.0110904920 -0.0144777066 -0.0079745076 0.0112005231 -0.0012697611
#> [81] -0.0024046193 -0.0094265032 0.0234028545 -0.0087827166 0.0097232715 -0.0188332734 -0.0151177605 0.0129435368
#> [89] -0.0086064316 0.0048073761 -0.0209634872 -0.0133873152 0.0203494525 0.0031885641 -0.0088269877 0.0064179865
#> [97] -0.0016902431 -0.0149829401 -0.0018118949 -0.0161594580 NA
```

Note the `NA` for the redundant 101st column: `qr()` detects the rank deficiency and drops that coefficient rather than failing outright.

---
class: inverse, center, middle

# QR algorithm

---
# QR algorithm

This algorithm computes an upper triangular matrix `\(S\)` and an orthogonal matrix `\(Q\)` such that `\(A = QSQ^T\)` is the Schur decomposition of `\(A\)`.

1. Set `\(A_0 := A\)`.
2. For `\(k = 1, 2, \cdots\)`:
  - factor `\(A_{k-1} = Q_{k-1}R_{k-1}\)` (a QR decomposition);
  - set `\(A_k := R_{k-1}Q_{k-1}\)`.
3. Set `\(S := A_{\infty}\)`.
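---
# QR algorithm in R

A minimal sketch of the iteration above, using base R's `qr()`, `qr.Q()` and `qr.R()`. The function name `qr_algorithm` and the fixed iteration count `maxit` (in place of a proper convergence test) are our own simplifications:

``` r
# QR algorithm: repeatedly factor A = QR, then recombine as RQ
qr_algorithm <- function(A, maxit = 100) {
  Ak <- A
  for (k in 1:maxit) {
    dec <- qr(Ak)                    # A_{k-1} = Q_{k-1} R_{k-1}
    Ak  <- qr.R(dec) %*% qr.Q(dec)   # A_k = R_{k-1} Q_{k-1}
  }
  Ak  # approximately S: eigenvalues on the diagonal
}

B <- matrix(c(2, 1, 1, 3), 2, 2)
round(qr_algorithm(B), 6)  # compare diag(...) with eigen(B)$values
```

When the eigenvalues have distinct moduli, the iterates converge to an upper triangular matrix with the eigenvalues on the diagonal; try it on the lab matrix on the next slide.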
---
# Lab session

- Go to R and code up the QR algorithm.
- Use the QR algorithm on `\(A = \left( \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right)\)`.

---
# Further readings

- Iterative Methods for Computing Eigenvalues and Eigenvectors, by Maysum Panju, University of Waterloo. Available [here](http://yanfei.site/docs/bsc/svdnum.pdf).
- [Practical QR algorithm](https://www.math.kth.se/na/SF2524/matber15/qrmethod.pdf).
- Computational Statistics, by James E. Gentle (2009).