layout: true

---
class: inverse, center, middle
background-image: url(../figs/titlepage16-9.png)
background-size: cover

<br>
<br>

# Bayesian Statistics and Computing

## Lecture 16: Numerical Algorithms for Eigenanalysis

<img src="../figs/slides.png" width="150px"/>

#### *Yanfei Kang | BSC | Beihang University*

---
# Finding eigenvalues and eigenvectors

- So far, we have only considered the characteristic polynomial approach to finding the eigenvalues of a matrix.
- Once we have the eigenvalues, we solve the homogeneous equation to find the corresponding eigenvectors.
- The process is: find the eigenvalues first, then find the corresponding eigenvectors.

---
# Practicalities

- Unfortunately, this is an impractical approach for `\(n > 4\)`: the characteristic polynomial then has degree five or more, and no general closed-form expression for its roots exists.
- We will bypass the characteristic polynomial and take a different approach:
  - the Power Method;
  - QR decomposition.

---
class: inverse, center, middle

# The Power Method

---
# The Power Method

- The Power Method finds the dominant eigenvalue `\(\lambda_1\)` and the corresponding dominant eigenvector `\(v_1\)` of a matrix `\(A\)`.
- The dominant eigenvalue is the one with the largest modulus (absolute value, for real eigenvalues).
- The Power Method is an iterative approach: it generates a sequence of scalars that converges to `\(\lambda_1\)` and a sequence of vectors that converges to `\(v_1\)`.
- The Power Method works well (converges quickly) when the dominant eigenvalue is clearly dominant, i.e. when `\(|\lambda_1|\)` is well separated from `\(|\lambda_2|\)`.

---
# The Power Method

- It works by starting with an initial vector `\(x_0\)`, transforming it to `\(x_1 = Ax_0\)`, transforming `\(x_1\)` to `\(x_2 = Ax_1\)`, and so on:
`$$x_1 = Ax_0;~x_2 = Ax_1 = A^2x_0;~\cdots;~x_k = A^kx_0.$$`
- As `\(k \rightarrow \infty\)`, the direction of `\(x_k\)` converges to that of `\(v_1\)`.

---
# Estimation of eigenvalues

Once we have an eigenvector estimate, we can quickly estimate the corresponding eigenvalue:

`$$Ax = \lambda x \Rightarrow x^T Ax = \lambda x^Tx \Rightarrow \lambda = \frac{x^TAx}{x^Tx} = \frac{x^TAx}{||x||^2} = q^TAq,$$`

where `\(q = x/||x||\)` is the normalised eigenvector estimate.

---
# Practical Power Method (normalised)

Since the components of `\(x_k\)` just get larger and larger (or smaller and smaller) as the Power Method iterates, and we really only want the direction of `\(v_1\)`, not its magnitude, we normalise each `\(x_k\)`. The Power Method can then be summarised as:

- Set the initial vector `\(q_0 = x_0/||x_0||\)`.
- Repeat for `\(k = 1, 2, \cdots\)`:
  - compute `\(x_k = Aq_{k-1}\)`;
  - normalise `\(q_k = x_k/||x_k||\)`;
  - estimate `\(\lambda_k = q_k^TAq_k\)`.

---
# Comments

- The Power Method is not guaranteed to converge if the matrix `\(A\)` is not diagonalisable.
- The convergence rate depends on how dominant `\(\lambda_1\)` is: the error shrinks roughly like `\(|\lambda_2/\lambda_1|^k\)`.
- Google uses it to calculate PageRank, and Twitter uses it to recommend who to follow.
- And what about non-dominant eigenvalues/eigenvectors?
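---
# The Power Method in R

A minimal sketch of the algorithm above; the function name `power_method`, the test matrix, and the stopping rule are our own choices, not fixed by the method:

``` r
# Normalised Power Method: dominant eigenvalue/eigenvector of A
power_method <- function(A, x0, tol = 1e-10, maxit = 1000) {
  q <- x0 / sqrt(sum(x0^2))                 # q0 = x0 / ||x0||
  lambda <- Inf
  for (k in 1:maxit) {
    x <- A %*% q                            # x_k = A q_{k-1}
    q <- x / sqrt(sum(x^2))                 # q_k = x_k / ||x_k||
    lambda_new <- drop(t(q) %*% A %*% q)    # lambda_k = q_k' A q_k
    if (abs(lambda_new - lambda) < tol) break
    lambda <- lambda_new
  }
  list(value = lambda_new, vector = drop(q))
}

A <- matrix(c(2, 1, 1, 3), 2, 2)
power_method(A, x0 = c(1, 1))$value  # about 3.618; compare eigen(A)$values[1]
```

For the lab below, recall that the PageRank vector is exactly the dominant eigenvector of the Google matrix.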
---
# Lab session

Now use the Power Method to redo your Google PageRank problem.

---
class: inverse, center, middle

# QR decomposition

---
# Eigenvalue-revealing decomposition

- It would be nice if we could bring our matrix `\(A\)` into an **eigenvalue-revealing decomposition** such as the Schur decomposition `\(A = QSQ^T\)`.
- Then we could read off the eigenvalues (all of them) from the diagonal of `\(S\)`.
- We will do this iteratively using the QR decomposition `\(A = QR\)`.
- The QR decomposition is not itself an eigenvalue-revealing decomposition, but it will help us towards our aim.
- The QR decomposition can be computed with the Gram-Schmidt orthogonalisation (GSO) algorithm.

---
# GSO algorithm

Any set of basis vectors `\((a_1, a_2, \cdots, a_n)\)` can be transformed into an orthonormal basis `\((q_1, q_2, \cdots, q_n)\)` by:

<center>
<img src="./figs/gso.png" height="400px" />
</center>

---
# GSO `\(\rightarrow\)` QR

<center>
<img src="./figs/QR.png" height="200px" />
</center>

---
# QR

For any `\(m \times n\)` matrix `\(A\)`, we can express `\(A\)` as `\(A = QR\)`, where

- `\(Q\)` is `\(m \times m\)` and orthogonal;
- `\(R\)` is `\(m \times n\)` and upper triangular.

For (non-square) tall matrices `\(A\)` with `\(m > n\)`, the last `\(m - n\)` rows of `\(R\)` are all zero, so we can express `\(A\)` in the economy form

`$$A = QR = (Q_1, Q_2) \left(\begin{array}{c} R_1 \\ \mathbf{0} \end{array}\right) = Q_1R_1.$$`

---
# Solution to `\(Ax = b\)` by QR

If we have `\(A = QR\)` or (even better) the economy form `\(A = Q_1R_1\)`, then the linear system `\(Ax = b\)` is easily solved. Multiplying both sides by `\(Q_1^T\)` and using `\(Q_1^TQ_1 = I\)`:

`$$Ax = b$$`
`$$(Q_1R_1)x = b$$`
`$$R_1x = Q_1^Tb,$$`

and `\(x\)` is found through back substitution, since `\(R_1\)` is upper triangular.

---
# Example

``` r
set.seed(2023-09-08)
n <- 5000; p <- 100
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
```

---
# Collinearity

Appending a column that is numerically identical to the first makes the normal equations unsolvable:

``` r
W <- cbind(X, X[, 1] + rnorm(n, sd = 1e-14))
solve(crossprod(W), crossprod(W, y))
#> Error in solve.default(crossprod(W), crossprod(W, y)): system is computationally singular: reciprocal condition number = 1.9488e-16
```

---
# QR decomposition

``` r
qr.coef(qr(W), y)
#> [1] 0.0207049604 -0.0107798798 -0.0053346446 -0.0139042768 -0.0165625816 -0.0247368954 -0.0055936520 -0.0050670925
#> [9] -0.0141895452 0.0014650371 0.0105871852 0.0236855005 -0.0009123881 0.0395699043 -0.0045459668 0.0156426008
#> [17] 0.0368986437 -0.0062241571 0.0003719454 -0.0001045447 0.0105437196 0.0032069072 -0.0099112962 0.0092091722
#> [25] -0.0056283760 -0.0101836666 0.0004409933 0.0079016686 0.0101626882 -0.0208409039 -0.0054924536 0.0064271925
#> [33] -0.0065590754 -0.0037863800 0.0072126283 -0.0013686652 -0.0261588414 -0.0045260288 -0.0204600332 -0.0108791075
#> [41] 0.0100509988 0.0021846634 0.0096076628 -0.0256744385 -0.0002072185 -0.0203092434 0.0199490920 0.0245456994
#> [49] 0.0141368132 -0.0075821183 -0.0331999816 0.0070728928 -0.0251681102 -0.0145501245 0.0124959304 0.0101724169
#> [57] -0.0041669575 0.0118368027 0.0058702645 0.0017322817 0.0008573064 -0.0068535688 -0.0101274630 -0.0093643459
#> [65] 0.0045194364 -0.0224489607 0.0175468581 0.0135508696 0.0048772551 0.0168445216 -0.0254587733 -0.0012700923
#> [73] -0.0106302388 -0.0017762102 0.0077490915 0.0110904920 -0.0144777066 -0.0079745076 0.0112005231 -0.0012697611
#> [81] -0.0024046193 -0.0094265032 0.0234028545 -0.0087827166 0.0097232715 -0.0188332734 -0.0151177605 0.0129435368
#> [89] -0.0086064316 0.0048073761 -0.0209634872 -0.0133873152 0.0203494525 0.0031885641 -0.0088269877 0.0064179865
#> [97] -0.0016902431 -0.0149829401 -0.0018118949 -0.0161594580 NA
```

Note the `NA` for the redundant 101st column: `qr()` detects the rank deficiency and drops that coefficient rather than failing outright.

---
class: inverse, center, middle

# QR algorithm

---
# QR algorithm

This algorithm computes an upper triangular matrix `\(S\)` and an orthogonal matrix `\(Q\)` such that `\(A = QSQ^T\)` is the Schur decomposition of `\(A\)`.

1. Set `\(A_0 := A\)`.
2. For `\(k = 1, 2, \cdots\)`:
  - factor `\(A_{k-1} = Q_{k-1}R_{k-1}\)` (a QR decomposition);
  - set `\(A_k := R_{k-1}Q_{k-1}\)`.
3. Set `\(S := A_{\infty}\)`.
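---
# QR algorithm in R

A minimal sketch of the iteration above, using base R's `qr()`, `qr.Q()` and `qr.R()`. The function name `qr_algorithm` and the fixed iteration count `maxit` (in place of a proper convergence test) are our own simplifications:

``` r
# QR algorithm: repeatedly factor A = QR, then recombine as RQ
qr_algorithm <- function(A, maxit = 100) {
  Ak <- A
  for (k in 1:maxit) {
    dec <- qr(Ak)                    # A_{k-1} = Q_{k-1} R_{k-1}
    Ak  <- qr.R(dec) %*% qr.Q(dec)   # A_k = R_{k-1} Q_{k-1}
  }
  Ak  # approximately S: eigenvalues on the diagonal
}

B <- matrix(c(2, 1, 1, 3), 2, 2)
round(qr_algorithm(B), 6)  # compare diag(...) with eigen(B)$values
```

When the eigenvalues have distinct moduli, the iterates converge to an upper triangular matrix with the eigenvalues on the diagonal; try it on the lab matrix on the next slide.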
---
# Lab session

- Go to R and code up the QR algorithm.
- Use the QR algorithm on `\(A = \left( \begin{array}{cc} 1 & 2 \\ 3 & 4 \end{array}\right)\)`.

---
# Further readings

- Iterative Methods for Computing Eigenvalues and Eigenvectors, by Maysum Panju, University of Waterloo. Available [here](http://yanfei.site/docs/bsc/svdnum.pdf).
- [Practical QR algorithm](https://www.math.kth.se/na/SF2524/matber15/qrmethod.pdf).
- Computational Statistics, by James E. Gentle (2009).