layout: true

---
class: inverse, center, middle
background-image: url(../figs/titlepage16-9.png)
background-size: cover

<br>
<br>

# Bayesian Statistics and Computing

## Lecture 0: Course Introduction

<img src="../figs/slides.png" width="150px"/>

#### *Yanfei Kang | BSC | Beihang University*

---
# General information

- **Credits:** 2 credits
- **Lecturer:** Yanfei Kang. For now, you can get to know me better via my personal webpage: http://yanfei.site.
- **Tutor:** Shengjie Wang
- **Language:** Taught in Chinese. Materials are in English.
- **Computer language:** R
- **Reception hours:** Questions concerning this course can be asked via email: yanfeikang@buaa.edu.cn.
- **Lecture notes:** available on *https://yanfei.site/teaching/bsc*.

---
# References

1. [Advanced statistical computing](https://bookdown.org/rdpeng/advstatcomp/). Roger Peng. In progress. 2022.
2. [Statistical computing (Chinese version)](https://yanfei.site/docs/statscompbook/index.html). Yanfei Kang and Feng Li. In progress. 2023.
3. [Introduction to scientific programming and simulation using R](https://yanfei.site/docs/sc/spuRs.pdf). Owen Jones, Robert Maillardet, Andrew Robinson. 2nd Edition. CRC Press. 2014. ISBN: 9781466569997.
4. [R programming for data science](https://bookdown.org/rdpeng/rprogdatascience/). Roger Peng. 2022.
---
# Unit objectives

![The process of statistical modeling.](./sc.png)

---
# Example: linear models

Consider the simple linear model:

`$$y = \beta_0 + \beta_1 x + \epsilon.$$`

| Component           | Implementation            |
| :------------------ | :------------------------ |
| Model               | Linear regression         |
| Principle/Technique | Likelihood principle      |
| Algorithm           | Maximization (analytical) |
| Statistic           | `\(\beta_0\)`, `\(\beta_1\)` |

---
# Two statements

#### *Computer numbers are not the same as real numbers, and the arithmetic operations on computer numbers are not exactly the same as those of ordinary arithmetic.*

#### *The form of a mathematical expression and the way the expression should be evaluated in actual practice may be quite different.*

---
# Key issues on the computer

1. Overflow
  - `NA`s or `Inf`s for very big numbers.
2. Underflow
  - Similarly, very small positive numbers are rounded down to zero.
3. Near linear dependence in matrix computation.

*Finite precision nature of computer arithmetic*: computers can represent only a finite set of numbers.

```r
.Machine$double.xmin
#> [1] 2.225074e-308
```

*Note that on most platforms smaller positive values than `.Machine$double.xmin` can occur. On a typical R platform the smallest positive double is about 5e-324.*

---
# Textbooks vs. computers

- Disconnection between textbook formulas and what *should* be implemented on the computer.
  - Textbooks: convenient mathematical formulas.
  - Computers: directly translating these formulas into computer code is usually not advisable.

---
# Using Logarithms

Consider the univariate normal density.

- We can compute its value using `dnorm()` in R.

```r
dnorm(1)
#> [1] 0.2419707
```

- How about `dnorm(40)`?
- Usually you can get away with computing the log of this value (and with `dnorm()` you can set the option `log = TRUE`).
- Computing densities with logarithms is much more numerically stable.
- `\(\frac{f(x)}{g(x)}\)` `\(\Rightarrow\)` `\(\exp(\log f(x) - \log g(x))\)`.
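
A minimal sketch of why this matters: far out in the tails, `dnorm()` underflows to zero, so a direct ratio of two densities is `NaN`, while the log-scale version is exact. (The points 40 and 41 are arbitrary illustrative choices.)

```r
## Direct ratio of two standard normal densities in the far tails:
## both densities underflow to 0, so the ratio is 0/0 = NaN.
dnorm(40) / dnorm(41)
#> [1] NaN

## On the log scale the computation is stable:
## log f(40) - log g(41) = (41^2 - 40^2) / 2 = 40.5 exactly.
log_ratio <- dnorm(40, log = TRUE) - dnorm(41, log = TRUE)
exp(log_ratio)  # equals exp(40.5), about 3.9e17
```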
---
# Solution to linear regression

Linear regression model in matrix form:

`$$y = X\beta + \varepsilon.$$`

- The solution is written as

`$$\hat{\beta} = (X^\prime X)^{-1}X^\prime y.$$`

- In R, this could be translated literally as

```r
betahat <- solve(t(X) %*% X) %*% t(X) %*% y
```

- The formula presented above should *only* be computed in textbooks, not on the computer.
- The direct inverse of `\(X^\prime X\)` is computationally very expensive.

---
# Simpler approaches

- Take the normal equations
`$$X^\prime X\beta = X^\prime y$$`
and solve them directly.
- In R, we could write

```r
solve(crossprod(X), crossprod(X, y))
```

- Rather than computing the inverse of `\(X^\prime X\)`, we directly compute `\(\hat{\beta}\)` via [Gaussian elimination](https://en.wikipedia.org/wiki/LU_decomposition) `\(\Rightarrow\)` more numerically stable and *much* faster.
- R uses the [QR decomposition](https://en.wikipedia.org/wiki/QR_decomposition). Its thin version also deals well with collinearity (discussed later in this course).

---
# Example

- For example, here's a fake dataset of `\(n=5000\)` observations with `\(p=100\)` predictors.

```r
set.seed(2023 - 9 - 8)
n <- 5000
p <- 100
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
```

```r
library(microbenchmark)
microbenchmark(solve(t(X) %*% X) %*% t(X) %*% y,
               solve(crossprod(X), crossprod(X, y)),
               qr.solve(qr(X), y))
#> Unit: milliseconds
#>                                  expr      min       lq      mean   median        uq        max neval
#>      solve(t(X) %*% X) %*% t(X) %*% y 65.98083 72.76186 103.47232 80.92221 103.13792 1653.59700   100
#>  solve(crossprod(X), crossprod(X, y)) 29.76544 32.49361  37.46781 35.66668  39.97714   67.16119   100
#>                    qr.solve(qr(X), y) 43.71220 51.01190  57.76785 55.68844  63.29145   83.03870   100
```

---
# Multivariate normal density

Now, can you do the following?

1. Write the `\(p\)`-dimensional multivariate normal density.
2. Think about how to evaluate it in computers.
3. Do it in R.

*Hints:*

1. [Cholesky decomposition](https://en.wikipedia.org/wiki/Cholesky_decomposition): `\(\Sigma = R^TR\)`.
2.
By this, you can also easily calculate the determinant of `\(\Sigma\)`.

---
# Unit objectives

1. Master solution of nonlinear equations and optimization tools.
2. Generate random numbers and run simulations.
3. Understand Bayesian methods and computing.
4. Learn about numerical linear algebra.

---
# Top ten algorithms

In January 2000, the magazine *Computing in Science and Engineering* published a list of 10 algorithms which (in their words) had the "greatest influence on the development and practice of science and engineering in the 20th century".

![The top ten algorithms of the 20th century.](./ten.png)

---
# About assignments

1. Please use **SPOC** for assignment submission.
2. Only a single PDF or HTML file is acceptable. I recommend you write your assignments using [Rmarkdown](https://rmarkdown.rstudio.com/), a productive notebook interface to weave together narrative text and code to produce elegantly formatted output.
3. Use meaningful file names: `Name_StudentNO_n.pdf`.