Slide 1

Slide 1 text

http://zerotorhero.wordpress.com/

Slide 2

Slide 2 text

Why R?
• It's free...
  • “as in free beer”
  • “as in free speech”
    • use it for any purpose.
    • give copies to your friends & neighbours.
    • improve it and release improvements publicly.

Slide 3

Slide 3 text

Tables Data Graphs Statistics Understanding Sigmaplot Excel SAS

Slide 4

Slide 4 text

Tables Data Graphs Statistics Understanding

Slide 5

Slide 5 text

Why is R so hard to learn?
• R is command-driven
• R will not tell you what to do, or guide you through the steps of an analysis or method.
• R will do all the calculations for you, and it will do exactly what you tell it (not necessarily what you want).
• R has the flexibility and power to do exactly what you want, exactly how you want it done.

Slide 6

Slide 6 text

Learning Objectives
• Open R(studio) for the first time
• Navigate the R(studio) interface
• Enter commands
• input & output
• common functions
• Control structures
• Use technical terms for R concepts
• Get Help

Slide 7

Slide 7 text

Challenges
• Throughout the workshop, you will be presented with a series of challenges.
• Collaborate with your neighbour when the going gets tough!

Slide 8

Slide 8 text

Challenge 1
Open RStudio

Slide 9

Slide 9 text

The Console

Slide 10

Slide 10 text

The R Console
Text in the R console typically looks like:
    > Input (commands)
    [1] output
I will represent these as:
    Input (commands)
    Output (results)

Slide 11

Slide 11 text

R is a calculator
    1 + 1
    [1] 2
    2 * 2
    [1] 4
    2 ^ 3
    [1] 8
    10 - 1
    [1] 9
    8 / 2
    [1] 4
    sqrt(9)
    [1] 3
• Expressions are evaluated, and the result is returned (sometimes invisibly).
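
When several operators appear in one expression, R follows the usual order of operations. The short sketch below is not from the original slides; it only illustrates how parentheses change the result, and uses `invisible()` as one way to see an expression that is evaluated without being printed.

```r
# Order of operations: ^ comes before * and /, which come before + and -
2 + 3 * 4          # multiplication happens first: 2 + 12
(2 + 3) * 4        # parentheses are evaluated first: 5 * 4
-2 ^ 2             # the exponent is applied before the minus sign: -(2^2)
(-2) ^ 2           # parentheses force the sign to be applied first

# The expression is still evaluated, but its result is not printed
invisible(2 * 2)
```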

Slide 12

Slide 12 text

Challenge
• Use R to answer the following skill testing question:
    2 + 16 × 24 − 56 / (2+1) − 457

Slide 13

Slide 13 text

R command-line tip
• Use the ▲▼ arrow keys to recall previous commands
• This lets you scroll through your command history

Slide 14

Slide 14 text

Objects
• You can store values (objects) in symbolic variables (names) using an assignment operator
    <-   assign the value on the right to the name on the left
• Variable names can include:
  • letters a-z A-Z
  • numbers 0-9
  • periods .
  • underscores _
• Variable names should begin with a letter
    A <- 10
    B <- 10*10
    A_log <- log(A)
    B.seq <- 1:B
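
As a small illustration of assignment (the names `side`, `area`, and `Side` are made up for this sketch and do not appear in the slides), the example below shows that a stored result is not updated when the variable it was computed from changes, and that names are case-sensitive.

```r
side <- 3            # store a value under a name
area <- side^2       # use the stored value in a calculation
side <- 4            # re-assigning replaces the old value...
area                 # ...but area was computed earlier and is unchanged

Side <- 10           # names are case-sensitive: Side and side are different objects
side
Side
```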

Slide 15

Slide 15 text

Challenge Put your answer to the skill testing question into an object with a name of your choice.

Slide 16

Slide 16 text

Retrieve values
• When a variable name is evaluated, it returns the stored value.
    A
    [1] 10
    B
    [1] 100
    A_log
    [1] 2.302585
    x
    [1] 3
    B.seq
    [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    [22] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
    [43] 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
    [64] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
    [85] 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

Slide 17

Slide 17 text

Vectors
• The most basic kind of object in R is a vector
• Think of a vector as a list of related values (data)
• A single value is also a vector (an "atomic vector" with a length of 1)
    [1] 2
  In this output, [1] is the index (the item number) and 2 is the value (result).
    1:10
    [1] 1 2 3 4 5 6 7 8 9 10

Slide 18

Slide 18 text

Vectors
• You can make a vector using the c() command:
    my_fav_nums<-c(1, 4, 10, 444, 42)
• Vectors can be used in a plot.
    plot(1:5, my_fav_nums)
• You can access an element of a vector by its index
    my_fav_nums[3]
    [1] 10
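
Indexing is not limited to a single position. The sketch below reuses the slide's `my_fav_nums` vector and shows a few other common index forms; the copy named `nums` is only there so the original vector stays unchanged.

```r
my_fav_nums <- c(1, 4, 10, 444, 42)

length(my_fav_nums)       # how many elements the vector holds
my_fav_nums[c(1, 3)]      # elements 1 and 3
my_fav_nums[2:4]          # elements 2 through 4
my_fav_nums[-1]           # everything except the first element

nums <- my_fav_nums       # work on a copy
nums[3] <- 11             # replace the third element of the copy
nums
```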

Slide 19

Slide 19 text

Vectors
• Vectors can be used in calculations
• Operations are executed on each item
    my_fav_nums+20
    my_fav_nums/2 +1
    sqrt(my_fav_nums)
    mean(my_fav_nums)
    sum(my_fav_nums)
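
To make the distinction concrete, here is a small sketch (the result names `doubled`, `pairwise`, `total`, and `average` are illustrative only): arithmetic operators work item by item and return a vector of the same length, while functions such as `sum()` and `mean()` collapse the vector to a single value.

```r
my_fav_nums <- c(1, 4, 10, 444, 42)

doubled  <- my_fav_nums * 2                        # each item is multiplied by 2
pairwise <- my_fav_nums + c(10, 20, 30, 40, 50)    # two vectors are added item by item

total   <- sum(my_fav_nums)     # one value: the sum of all items
average <- mean(my_fav_nums)    # one value: the arithmetic mean

doubled
pairwise
total
average
```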

Slide 20

Slide 20 text

R command-line tip
• Use the Tab key to auto-complete
• This helps you avoid spelling errors and speeds up command entry.

Slide 21

Slide 21 text

Use variables in calculations
    A + 5
    [1] 15
    B/A
    [1] 10
    Weight <- c(60 , 72 , 57 , 90 , 95 , 72 )
    Height <- c(1.7, 1.8, 1.6, 1.9, 1.7, 1.9)
    BMI <- Weight/Height^2
    BMI
    [1] 20.76125 22.22222 22.26562 24.93075 32.87197 19.94460
    plot(Height,Weight)

Slide 22

Slide 22 text

Challenge
What is the sum of the squares of all the integers between 1 and 100?
Hint: remember counting from x to y can be done with x:y.

Slide 23

Slide 23 text

Some other data types
• character (string)
  • in single ' or double " quotes.
    > 'hello world'
    > "1.23"
• logical
  • TRUE or FALSE
    class(1.23)
    class('hello')
    class("1.23")
    class(FALSE)
    typeof(1.23)
    typeof(1:10)
• converting from one type to another = "coercion"
    as.character(c(1,2,NA,3))
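
A few more coercion cases may help. This is a sketch, not part of the original slides; the variable names `mixed` and `not_a_number` are made up. It shows explicit conversion with the `as.*()` functions and the implicit conversion that happens when different types are combined in one vector.

```r
as.numeric("1.23")       # character to numeric
as.character(1.23)       # numeric to character
as.numeric(TRUE)         # logical to numeric: TRUE becomes 1

# A vector holds a single type, so mixing types converts everything to character
mixed <- c(1, "a", TRUE)
class(mixed)

# Text that does not look like a number cannot be converted and becomes NA
# (R also raises a warning, silenced here to keep the output short)
not_a_number <- suppressWarnings(as.numeric("hello"))
not_a_number
```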

Slide 24

Slide 24 text

Built-in variables
• Some words and letters already have values in R and should never be used as variable names
    pi
    [1] 3.141593
    letters
    [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m"
    [14] "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
    LETTERS
    [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M"
    [14] "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"

Slide 25

Slide 25 text

Reserved words
• Some words and letters already have special meaning in the R language (keywords) and should never be used as variable names
    NA            "Not Available" (unknown or missing data)
    NaN           "Not a Number" (undefined numeric values)
    NULL          a special object (missing objects)
    Inf           Infinity
    TRUE          Logical value
    FALSE         Logical value
    T             short for TRUE
    F             short for FALSE
    c,q,t,C,D,I   R functions
    diff, df, pt  R functions
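
The special values above behave differently in calculations. The sketch below uses a made-up vector called `measurements` to show how a missing value propagates and how the other special values arise.

```r
measurements <- c(4.2, NA, 5.1)

is.na(measurements)                  # which values are missing?
mean(measurements)                   # NA: one unknown value makes the mean unknown
mean(measurements, na.rm = TRUE)     # drop missing values before averaging

1 / 0            # Inf
0 / 0            # NaN
length(NULL)     # NULL is an empty object with length 0
```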

Slide 26

Slide 26 text

Housekeeping
    ls()              List all variables you have created
    rm(x)             Remove the variable ‘x’ from memory
    rm(list=ls())     Remove all variables from memory (clear memory)

Slide 27

Slide 27 text

Comparisons
• Comparison of 2 values results in logical values: TRUE or FALSE
    ==   "equal": Note the two equals signs. Not to be confused with a single equals sign (used to assign values).
    !=   "not equal"
    >    "greater than"
    <    "less than"
    >=   "greater than or equal to"
    <=   "less than or equal to"
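
Comparison operators also work item by item on vectors. The sketch below uses a made-up vector `sizes`; the last two lines illustrate a common pitfall that is not mentioned in the slides: `==` on decimal numbers can be affected by floating-point rounding, and `all.equal()` is one way to compare with a tolerance.

```r
3 == 3
3 != 4

sizes <- c(1, 5, 10)
sizes > 4             # one logical value per item
sum(sizes > 4)        # TRUE counts as 1, so this counts the items greater than 4

0.1 + 0.2 == 0.3                      # FALSE because of floating-point rounding
isTRUE(all.equal(0.1 + 0.2, 0.3))     # TRUE: equal within a small tolerance
```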

Slide 28

Slide 28 text

Challenge
Is 3.1415929 greater than, less than, or equal to π?

Slide 29

Slide 29 text

Errors
• When R is given a command it does not understand, or cannot execute, it outputs an error to the console.
    Fail <- 1 + "2"
    Error in 1 + "2" : non-numeric argument to binary operator
    Fail
    Error: object 'Fail' not found

Slide 30

Slide 30 text

Warnings
• If a command does not work exactly as R (or the developers) think is "ideal", it may produce a warning instead.
• Use the warnings() command to review them.
    oops <- log(-1)
    Warning message:
    In log(-1) : NaNs produced
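
Errors stop a command, while warnings let it finish. Beyond reading them in the console, base R's `tryCatch()` can intercept both. This goes slightly past what the slides cover, so treat it as an optional sketch; the names `result` and `logged` are illustrative.

```r
# An error normally stops execution; here the handler runs instead
result <- tryCatch(
  1 + "2",
  error = function(e) {
    message("Caught an error: ", conditionMessage(e))
    NA   # value to use when the calculation fails
  }
)

# A warning can be intercepted in the same way
logged <- tryCatch(
  log(-1),
  warning = function(w) {
    message("Caught a warning: ", conditionMessage(w))
    NaN  # value to use when the warning is raised
  }
)

result
logged
```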

Slide 35

Slide 35 text

Functions
• A function takes in arguments and returns a value.
• To use a function (call), the command must be structured properly, following the "grammar rules" of the R language (syntax)
    log( 8 , base = 2 )
    function name   log
    parentheses     ( ) placed right after the function name, with no space
    argument 1      8
    comma           separates the arguments
    argument 2      base = 2
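
The same call can be written in several equivalent ways. This sketch uses only `log()`, whose two arguments are named `x` and `base`, to show matching by position and by name.

```r
log(8, base = 2)        # first argument by position, second by name
log(8, 2)               # both arguments by position
log(x = 8, base = 2)    # both arguments by name
log(base = 2, x = 8)    # named arguments can be given in any order

log(8)                  # base left out: the default is the natural logarithm
```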

Slide 36

Slide 36 text

Arguments
• Arguments are the values passed to a function when it is called
• Arguments are values and instructions the function needs to do its thing
Ex:
    x<-1:10
    y<-sin(x)
    plot(x,y,type='l')
  Here x, y, and type='l' are the arguments.
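
Many arguments have default values that apply when you leave them out. As a sketch, `sd()` is used below because it has a short argument list; `args()` and `formals()` are base R tools for inspecting which arguments a function accepts, and `values` is a made-up vector.

```r
args(sd)              # shows the argument list of sd()
names(formals(sd))    # just the argument names

values <- c(2, 4, NA, 6)
sd(values)                  # na.rm defaults to FALSE, so the missing value gives NA
sd(values, na.rm = TRUE)    # overriding the default removes the NA first
```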

Slide 37

Slide 37 text

Common & useful functions
    sqrt  log  exp  min  max  sum  mean  sd  var
    summary  plot  par  paste  format
    head  length  str  names  typeof  class  attributes
    library  ls  rm  setwd  getwd
    file.choose (all platforms)  choose.files (Windows only)
    c  seq  rep
    tapply  aggregate  merge  cbind  rbind  unique
    help  ?  help.search  ??  help.start

Slide 38

Slide 38 text

How do I use a new function?
What arguments will it take?
What does it do?
Use ?function !
For example:
    ?seq

Slide 39

Slide 39 text

Annotated help page:
• function name
• package
• long name (title)
• arguments
• Details on how the function works

Slide 40

Slide 40 text

Details
• Value returned
• Publications that describe the function (relevant theory & concepts)
• Copy & paste Examples into the console to see the function in action. Try to modify the example code to do what you want.
• You can also use example(seq) in the console to run all the example code in this section.
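
Once the help page has told you which arguments exist, try them out. The calls below are a small sketch using arguments documented for `seq()` (`from`, `to`, `by`, `length.out`); the result names are made up.

```r
by_three <- seq(from = 1, to = 10, by = 3)         # step of 3 from 1 up to 10
five_pts <- seq(from = 0, to = 1, length.out = 5)  # 5 evenly spaced values

by_three
five_pts
```

Running `example(seq)` in an interactive session executes the examples from the help page in the same way.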

Slide 41

Slide 41 text

Challenge
1) Create an unsorted vector of your favourite numbers.
2) Find out how to sort it using ?sort.
3) Sort your vector in forward and in reverse order.
4) Put your sorted vectors into new objects.

Slide 42

Slide 42 text

HELP
Books

Slide 43

Slide 43 text

HELP
Web Sites
• R web site: r-project.org
  • Start here: especially the Documentation section
• r-bloggers.com
• stackoverflow.com
• Google search ...

Slide 44

Slide 44 text

Object types
    object       a way of packaging information in R
    vector       a combination of values, of the same type.
    list         a combination of different types of values (or even objects)
    data frame   a collection of vectors of the same length (# rows)
                 columns = “variables” ; rows = “cases”
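
The three object types can be built by hand in a few lines. Everything in this sketch (the names `heights`, `plant`, `plants` and their values) is invented for illustration.

```r
# vector: values of the same type
heights <- c(1.7, 1.8, 1.6)

# list: values of different types, including other objects
plant <- list(name = "sample plant", heights = heights, chilled = FALSE)

# data frame: vectors of the same length, one per column
plants <- data.frame(id = c("a", "b", "c"), height = heights)

class(heights)
class(plant)
class(plants)

plant$name      # pull one item out of the list by name
nrow(plants)    # rows = cases
ncol(plants)    # columns = variables
```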

Slide 45

Slide 45 text

Working with a data frame
    data(CO2)          load a built-in data file
    head(CO2)          peek at first few rows
    str(CO2)           structure of the object
    names(CO2)         names of items in the object
    attributes(CO2)    attributes of the object
    summary(CO2)       summary statistics
    plot(CO2)          plot of all variable combinations

Slide 46

Slide 46 text

Indexing
• You can refer to parts of a data frame object by their index or name (if they have one)
    CO2[1:6,3]
    object name [ rows (dim. 1) , columns (dim. 2) ]
    CO2$Treatment
    object name, $ operator, column name

Slide 47

Slide 47 text

Indexing
    names(CO2)                         available names
    CO2$Treatment                      "Treatment" column
    CO2[,3]                            all rows, column 3
    CO2[3,]                            row 3, all columns
    CO2[1:6,]                          rows 1-6, all columns
    CO2[c(1,2,3,4,5,6),3]              rows 1-6, column 3
    CO2$Treatment[1:6]                 elements 1-6 of Treatment
    CO2[CO2$conc>100,]                 rows where conc > 100
    CO2[CO2$Treatment=="chilled",]     rows where Treatment == "chilled"
    CO2[sample(nrow(CO2), 10),]        10 random rows
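
Row conditions can be combined, and columns can be selected by name in the same step. The sketch below filters the built-in CO2 data on two conditions with `&`; the object name `quebec_high` is made up.

```r
data(CO2)

# Rows where conc > 100 AND the plant is from Quebec; keep three named columns
quebec_high <- CO2[CO2$conc > 100 & CO2$Type == "Quebec",
                   c("Plant", "conc", "uptake")]

head(quebec_high)    # peek at the first few matching rows
nrow(quebec_high)    # how many rows matched
dim(CO2)             # compare with the size of the full data frame
```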

Slide 48

Slide 48 text

Challenge
1) What is the mean uptake of all plants in the non-chilled treatment?
2) What is the variance in uptake for plant ‘Mc3’?

Slide 49

Slide 49 text

Control Structures
• R is a full-fledged programming language. It can do loops and conditionals.
    i <- runif(1,0,100)
    if(i <= 50) {
      print('i is pretty small')
    }

    for(i in 1:100) {
      print(i)
    }
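
Loops and conditionals are often combined. The following sketch (with made-up names `scores`, `labels`, and `n`) labels each value in a vector using `if`/`else` inside a `for` loop, then shows a `while` loop that repeats until a condition is no longer true.

```r
scores <- c(12, 55, 78, 49)
labels <- character(length(scores))   # empty character vector to fill in

for (k in seq_along(scores)) {
  if (scores[k] <= 50) {
    labels[k] <- "small"
  } else {
    labels[k] <- "large"
  }
}
labels

# while: keep doubling n until it is no longer below 100
n <- 1
while (n < 100) {
  n <- n * 2
}
n
```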

Slide 50

Slide 50 text

Challenge
1) Print out all the numbers between 1 and 100, but if the number is a multiple of 3, print ‘fizz’ instead.
2) Extend your code so that for multiples of 5, it prints out ‘buzz’.
HINT: To get the remainder of integer division, use %%. Ex:
    9%%5
    [1] 4

Slide 51

Slide 51 text

Installing packages
• In addition to all of the base functions in R, you can install additional packages to do specialized statistics and plotting.
• Currently, the CRAN package repository features 4276 available packages.
• http://cran.r-project.org/web/packages/

Slide 52

Slide 52 text

Installing packages
    install.packages('ggplot2')
    library(ggplot2)
The library() function loads the package, making its functions accessible.
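
A package only needs to be installed once, but must be loaded in every session. One common pattern, sketched here as a suggestion rather than something from the slides, is to install only when the package is missing. The built-in `stats` package stands in for the package name so the example runs without downloading anything; swap in the package you need.

```r
pkg <- "stats"   # placeholder: replace with the package you want, e.g. "ggplot2"

# Install only if the package is not already available
if (!requireNamespace(pkg, quietly = TRUE)) {
  install.packages(pkg)
}

# character.only = TRUE tells library() that pkg holds the name as a string
library(pkg, character.only = TRUE)
```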

Slide 53

Slide 53 text

R is a show-off
    demo(graphics)    • Some plots and graphs that can be made using R
    demo(image)       • images and other graphics made using R
    demo(lm.glm)      • a demonstration of linear modelling & GLMs
    demo()            • a list of available demos

Slide 54

Slide 54 text

Let’s play “Command-R”
• 2 players
• Start with
    x <- 0
• Take turns using the variable ‘x’ as an argument in a function or expression
• Assign the result to the same variable ‘x’
• How long can you keep the chain going without getting errors?

Slide 55

Slide 55 text

“Command-R”
    x <- x + 1
    x <- x * (x+10)
    x <- exp(x)
    x <- 1:x
    x <- seq(from=x, to=100, by=2)
    x <- rnorm(x)
    x <- x[1:3]
    x <- x[2]
    x <- data.frame( foo = rnorm(length(x)), x)
• Challenges
  • Change the object type of x into a:
    • vector of multiple items
    • data frame
  • Use x in a graph / plot