- How to install and use **R**
- **R** basics
- **R** coding style
- Some nice **R** tips

School of Economics and Management

Beihang University

http://yanfei.site


`install.packages(c("forecast", "sos", "formatR"))`

- A standard R installation ships with base packages for data management, analysis, and graphics.
- More than 10,000 packages are available on CRAN! See http://cran.r-project.org.
- Run `install.packages('forecast')` to install a package called ‘forecast’.
- Run `library(forecast)` to load the package before using it.
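A common pattern, sketched here as an assumption rather than something taken from the slides, is to install a package only when it is missing. `requireNamespace()` is part of base R; the helper name `InstallIfMissing` is hypothetical.

```r
## Hypothetical helper: install a package only if it is not yet available.
InstallIfMissing <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  ## Report (invisibly) whether the package can now be loaded.
  invisible(requireNamespace(pkg, quietly = TRUE))
}

## 'stats' ships with R, so nothing is downloaded here.
InstallIfMissing("stats")
```

The same call with `"forecast"` would download the package from CRAN the first time and do nothing afterwards.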

- Totally free!
- Download R on its official website.
- A new major version of R comes out once a year, and there are 2-3 minor releases each year.

- Direct **R**
- RStudio
  - One of the most popular ways to run R.
  - Free, open-source integrated development environment (IDE) for R.
  - Many additional fantastic features.
  - Updated a couple of times a year.
- Command line in Linux and Unix.

- What editor do you usually use?
- Use a good text editor such as Vim, Sublime Text, TextWrangler, Notepad, etc.
- Choose one with syntax highlighting; without it, errors are hard to detect.
- Or use an Integrated Development Environment (IDE) like RStudio

- Syntax highlighting
- Able to evaluate R code
  - by line
  - by selection
  - entire file
- Command auto-completion

```r
## simple maths
1 + 2 + 3
1 + 2 * 3

## assign a value to a variable
x <- 1
y <- 2
z <- c(x, y)
z

## function examples
exp(1)
cos(3.141593)
log2(1)
```

- Numerical vectors
- Logical vectors
- Character vectors
- Length of a vector
- Vector calculations
- Extract some elements of a vector

```r
## vectors
c(0, 1, 1, 2, 3, 5, 8)
1:10
seq(1, 9, 2)
rep(1, 10)
length(rep(1, 10))

## character vectors
c("Hello world", "Hello R interpreter")

## vector calculation
c(1, 2, 3, 4) + c(10, 20, 30, 40)
c(1, 2, 3, 4) + 1

## you can refer to elements by location in a vector
b <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
length(b)
b
b[7]
b[1:6]
b[c(1, 6, 11)]
b > 5
b[b > 5]
```
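Logical vectors are listed above but only appear implicitly in `b > 5`. The following sketch shows a few things that can be done with them; the variable names are illustrative only.

```r
b <- 1:12

## A comparison returns a logical vector of the same length as b.
is.large <- b > 5
is.large

## TRUE counts as 1 in arithmetic, so sum() counts the matches.
n.large <- sum(is.large)

## which() gives the positions of the TRUE values.
large.positions <- which(is.large)

## Logical vectors can be combined with & (and), | (or), ! (not).
large.even <- b[b > 5 & b %% 2 == 0]
large.even
```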

- Create a matrix: `matrix()`
- Dimension of a matrix: `dim()`
- Transpose of a matrix: `t()`
- Extract elements from a matrix.
- Combine two or more matrices: `rbind()`, `cbind()`

```r
## create a matrix
m <- matrix(c(1:6), 2, 3)
n <- matrix(c(8:13), 2, 3)
dim(m)
t(m)
m[1, 2]
m[1, ]
cbind(m, n)
rbind(m, n)
```
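One detail worth seeing explicitly is the order in which `matrix()` fills its cells. A small sketch, using the standard `byrow` argument of `matrix()`:

```r
## matrix() fills column by column by default.
m <- matrix(1:6, nrow = 2, ncol = 3)
m
##      [,1] [,2] [,3]
## [1,]    1    3    5
## [2,]    2    4    6

## byrow = TRUE fills row by row instead.
m.byrow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m.byrow

## Extract a whole column, or a sub-matrix.
m[, 2]
m[1:2, 2:3]
```

Naming the arguments (`nrow`, `ncol`) makes the call easier to read than relying on their position.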

- A list is a special data structure for data that a matrix cannot handle.
- The elements may have different lengths.
- The elements may have different data types.

- Create a list: `list()`
- Extract elements of a list: `[[]]` or `$`

```r
l <- list(a = c(1, 2), b = 'apple')
```
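To illustrate the two extraction operators on the list just created, here is a short sketch; `sub.list` is an illustrative name.

```r
l <- list(a = c(1, 2), b = 'apple')

## [[ ]] and $ both return the element itself.
l[["a"]]
l$b
l[[1]]

## Single brackets return a (shorter) list, not the element.
sub.list <- l["a"]
class(sub.list)

## Elements can be added by name.
l$c <- TRUE
names(l)
```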

- `data.frame()`: tightly coupled collections of variables which share many of the properties of matrices and of lists, used as the fundamental data structure by most of R’s modeling software.
- In most cases, operations on a data frame are similar to matrix operations.

```r
L3 <- LETTERS[1:3]
fac <- sample(L3, 10, replace = TRUE)
d <- data.frame(x = 1, y = 1:10, fac = fac)
```
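The sketch below shows how the matrix-like and list-like sides of a data frame look in practice. The call to `set.seed()` is an addition so that the random sample is reproducible; `d.large` is an illustrative name.

```r
set.seed(1)  # make the random sample reproducible
L3 <- LETTERS[1:3]
fac <- sample(L3, 10, replace = TRUE)
d <- data.frame(x = 1, y = 1:10, fac = fac)

## Inspect the structure and the first rows.
str(d)
head(d, 3)

## Matrix-style indexing: rows 1 to 3, columns 'y' and 'fac'.
d[1:3, c("y", "fac")]

## List-style access to a single column.
d$y

## Keep only the rows that satisfy a condition.
d.large <- d[d$y > 5, ]
dim(d)
```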

- Create a function

```r
f <- function(x, y) {
  z <- c(x + 1, y + 1)
  return(z)
}
f(1, 2)
```

- Load the function: `source()`

- Execute your function
- When should you write a function?
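The load-then-execute steps can be sketched as follows. A temporary file stands in for a script you would normally write in your editor, and the function name `AddOne` is hypothetical.

```r
## Write a small function definition to a temporary .R file.
## (In practice this would be a file you edit yourself.)
script.path <- tempfile(fileext = ".R")
writeLines(c(
  "AddOne <- function(x, y) {",
  "  z <- c(x + 1, y + 1)",
  "  return(z)",
  "}"
), script.path)

## source() runs the file, which defines AddOne in the workspace.
source(script.path)

## Execute the function.
result <- AddOne(1, 2)
result
```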

- Syntax

```r
if (condition) {
  ## do something
} else {
  ## do something else
}
```

- Example

```r
x <- 0
if (x > 1) {
  print('x is larger than 1')
} else {
  print('x is not larger than 1')
}
```
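`if` expects a single TRUE or FALSE. When the test should be applied to every element of a vector, base R offers `ifelse()`; the sketch below is an addition to the slides and `size.label` is an illustrative name.

```r
x <- c(0, 2, 5)

## ifelse() tests every element and returns a vector of the same length.
size.label <- ifelse(x > 1, 'larger than 1', 'not larger than 1')
size.label
```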

- Example

```r
x <- 1:10
for (i in x) {
  print(i^2)
}
```
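Because R arithmetic works on whole vectors, many loops can be written without `for`. The following sketch compares three equivalent ways of squaring the numbers; the variable names are illustrative.

```r
x <- 1:10

## Loop version: fill a pre-allocated result vector.
squares.loop <- numeric(length(x))
for (i in seq_along(x)) {
  squares.loop[i] <- x[i]^2
}

## Vectorised version: arithmetic applies to every element at once.
squares.vec <- x^2

## sapply() applies a function to each element and collects the results.
squares.sapply <- sapply(x, function(v) v^2)
```

The vectorised form is usually the shortest and easiest to read.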

- Write a function `MySummary()` whose input argument `x` can be any vector and whose output is a list containing the basic summaries (mean, variance, length, maximum and minimum values) of the vector supplied to the function.
- Test your function with some vectors that you make up yourself.

- File names should end in `.R` and, of course, be *meaningful*.
  - GOOD: `predict_ad_revenue.R`
  - BAD: `foo.R`

- The preferred form for variable names is all lower case letters and words separated with dots (variable.name), but variableName is also accepted. Generally, variable names should be nouns.
- GOOD: avg.clicks
- OK: avgClicks
- BAD: avg_Clicks

- Function names have initial capital letters and no dots. Function names are mostly verbs.
- GOOD: CalculateAvgClicks
- BAD: calculate_avg_clicks , calculateAvgClicks

- Choose a consistent naming style

- Don’t use underscores (_) or hyphens (-).
- Avoid using names of existing functions and variables like `mean`, `median`, etc.
- Avoid using meaningless names like a, b, c, …, aa, bb, cc, …

- Put spaces around operators (=, +, -, <-, etc.).
- Put a space after a comma, and never before.

```r
x <- c(1:10)
x.average<-mean(x,na.rm=TRUE)
```

\(\Rightarrow\)

```r
x.average <- mean(x, na.rm = TRUE)
```

- split long lines at meaningful places

Don’t be afraid of splitting one long line into individual pieces!

```r
n <- matrix(sample(1:100, 9), nrow = 3, ncol = 3, byrow = TRUE)
```

\(\Rightarrow\)

```r
n <- matrix(sample(1:100, 9),
            nrow = 3,
            ncol = 3,
            byrow = TRUE)
```

- An opening curly brace should never go on its own line and should always be followed by a new line.
- A closing curly brace should always go on its own line, unless it’s followed by else.
- Always begin the body of a block on a new line.
- Always indent the code inside curly braces.

```r
if (y < 0) {print("y is negative")}
```

\(\Rightarrow\)

```r
if (y < 0) {
  print("y is negative")
}
```

- Use two spaces per indentation level.
- Indentation can help in detecting errors in your code because it exposes a lack of symmetry.
- Reindenting using RStudio

```r
if (y < 0) {
print("y is negative")
}
```

\(\Rightarrow\)

```r
if (y < 0) {
  print("y is negative")
}
```

- Reformat and reindent in RStudio.
- Use the **formatR** package in **R**. You can even make a folder of `.R` files tidy using `tidy_dir()`.
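As a rough sketch of how this might look, assuming the formatR package is installed: `tidy_source()` reformats code supplied as text, and `tidy_dir()` rewrites the `.R` files in a folder in place. The folder and file names below are hypothetical, and the code skips the formatting step if formatR is not available.

```r
## Create a temporary folder containing one messy script.
messy.dir <- file.path(tempdir(), "messy_scripts")
dir.create(messy.dir, showWarnings = FALSE)
messy.file <- file.path(messy.dir, "messy.R")
writeLines("x.average<-mean(x,na.rm=TRUE)", messy.file)

if (requireNamespace("formatR", quietly = TRUE)) {
  ## Reformat a piece of code given as text and print the result.
  formatR::tidy_source(text = "x.average<-mean(x,na.rm=TRUE)")

  ## Rewrite every .R file in the folder in place.
  formatR::tidy_dir(messy.dir)
  print(readLines(messy.file))
} else {
  message("Install formatR first: install.packages('formatR')")
}
```

Since `tidy_dir()` overwrites files, it is safest to try it on a copy of your scripts first.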

- Add a Header for your file
- Add lots of comments
- Use blank lines to separate blocks of code and comments to say what the block does. Remember that in a few months, you may not follow your own code any better than a stranger.

```r
x <- c(1:10)
x.mean = mean(x)
x.var = var(x)
```

\(\Rightarrow\)

```r
## =============================================
## Title
## Author: Yanfei Kang
## Date: Mar 23, 2017
## Description: your purpose
## =============================================

x <- c(1:10)

## getting the mean of x
x.mean = mean(x)

## getting the variance of x
x.var = var(x)
```

- Functions should contain a comments section immediately below the function definition line.
- These comments should include
  - a one-sentence description of the function
  - a list of the function’s arguments, denoted by Args:, with a description of each (including the data type)
  - a description of the return value, denoted by Returns:.
- The comments should be descriptive enough that a caller can use the function without reading any of the function’s code.

```r
CalculateSampleCovariance <- function(x, y, verbose = TRUE) {
  ## Computes the sample covariance between two vectors.
  ##
  ## Args:
  ##   x: One of two vectors whose sample covariance is to be calculated.
  ##   y: The other vector. x and y must have the same length, greater than one,
  ##      with no missing values.
  ##   verbose: If TRUE, prints sample covariance; if not, not. Default is TRUE.
  ##
  ## Returns:
  ##   The sample covariance between x and y.
  n <- length(x)
  ## Error handling
  if (n <= 1 || n != length(y)) {
    stop("Arguments x and y have different lengths: ",
         length(x), " and ", length(y), ".")
  }
  if (TRUE %in% is.na(x) || TRUE %in% is.na(y)) {
    stop(" Arguments x and y must not have missing values.")
  }
  covariance <- var(x, y)
  if (verbose)
    cat("Covariance = ", round(covariance, 4), ".\n", sep = "")
  return(covariance)
}
```

- Functions in installed packages

```r
library(forecast)
help.search("auto.arima")
??auto.arima
```

- Functions in other CRAN packages

```r
library(sos)
findFn("arima")
RSiteSearch("arima")
```

- Type `?sort` for the usage of the function `sort()`.
- Typing the name of a function gives its definition.
- Type `forecast:::estmodel` for hidden functions.
- Download the tar.gz file from CRAN if you want to see any underlying **C** or **Fortran** code.
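A few related base R tools can be tried directly at the console. This sketch uses `sort()` as the example; `sort.args` is an illustrative name.

```r
## Show the argument list of a function.
args(sort)

## names(formals()) lists the argument names programmatically.
sort.args <- names(formals(sort))
sort.args

## Typing the name without parentheses prints the definition.
sort

## The :: operator reaches an exported function without attaching its package.
stats::median
```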

- Every paper, book or scientific report is a ‘project’.
- Every project has its own folder and R workspace.
- Every project is entirely scripted. That is, all analysis, graphs and tables must be able to be generated by running one script.
- This script sources all other R files in the correct order and yields all the required results. This script could be in `main.R` or `main.Rmd`.
- `functions.R` contains all non-packaged functions used in the project.
- Each function should not be too long.
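The layout described above might be sketched as follows. A temporary folder stands in for a real project; `analysis.R`, `CalculateRange` and `data.range` are hypothetical names used only for illustration.

```r
## Set up a throwaway project folder.
project.dir <- file.path(tempdir(), "my_project")
dir.create(project.dir, showWarnings = FALSE)

## functions.R holds the non-packaged functions.
writeLines(c(
  "CalculateRange <- function(x) {",
  "  ## Returns the difference between the largest and smallest value.",
  "  return(max(x) - min(x))",
  "}"
), file.path(project.dir, "functions.R"))

## A hypothetical analysis script that uses those functions.
writeLines(
  "data.range <- CalculateRange(c(3, 8, 5))",
  file.path(project.dir, "analysis.R")
)

## main.R sources the other files in the correct order:
## functions first, then the analysis that uses them.
writeLines(c(
  "source('functions.R')",
  "source('analysis.R')"
), file.path(project.dir, "main.R"))

## Running the single main script reproduces every result.
## chdir = TRUE makes the relative paths inside main.R resolve.
source(file.path(project.dir, "main.R"), chdir = TRUE)
data.range
```

Keeping the order of the `source()` calls in one place means a collaborator only needs to run `main.R`.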


- For programming questions: StackOverflow.com
- For statistical questions: CrossValidated.com

- RStudio blog: blog.rstudio.org
- R-bloggers: www.r-bloggers.com
- It takes time to develop your own style. Once it is developed, it is really hard to change. So please be careful at the beginning.

Write a function to find the roots of a given quadratic equation \(ax^2 + bx + c = 0\) with \(a\), \(b\) and \(c\) as input arguments.

Test your function on some simple equations.

Keep in mind the styles we have learnt.
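For reference, and assuming \(a \neq 0\), the roots follow from the quadratic formula, where \(\Delta\) denotes the discriminant:

\[
x = \frac{-b \pm \sqrt{\Delta}}{2a}, \qquad \Delta = b^2 - 4ac
\]

If \(\Delta > 0\) there are two distinct real roots, if \(\Delta = 0\) there is one repeated root \(x = -b / (2a)\), and if \(\Delta < 0\) the roots are a pair of complex conjugates. In R, `sqrt()` of a negative number returns `NaN` with a warning, so a function that should handle the last case may need `sqrt(as.complex(...))`.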

- Write two messy `.R` files and put them in a folder.
- Use `tidy_dir()` to make them tidy.