
Getting started with R

Corey Chivers

February 01, 2013

Transcript

2. Why R?

• It's free:
• “as in free beer”
• “as in free speech”
• Use it for any purpose.
• Give copies to your friends & neighbours.
• Improve it and release improvements publicly.

5. Why is R so hard to learn?

• R is command-driven.
• R will not tell you what to do, or guide you through the steps of an analysis or method.
• R will do all the calculations for you, and it will do exactly what you tell it (not necessarily what you want).
• R has the flexibility and power to do exactly what you want, exactly how you want it done.

6. Learning Objectives

• Open R(studio) for the first time
• Navigate the R(studio) interface
• Enter commands
• Input & output
• Common functions
• Control structures
• Use technical terms for R concepts
• Get help

7. Challenges

• Throughout the workshop, you will be presented with a series of challenges.
• Collaborate with your neighbour when the going gets tough!

10. The R Console

Text in the R console typically looks like:

    > Input (commands)
    [1] output

I will represent these as input (commands) followed by output (results).

11. R is a calculator

    1 + 1
    [1] 2
    2 * 2
    [1] 4
    2 ^ 3
    [1] 8
    10 - 1
    [1] 9
    8 / 2
    [1] 4
    sqrt(9)
    [1] 3

• Expressions are evaluated, and the result is returned (sometimes invisibly).

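A small sketch of the "sometimes invisibly" point: an assignment is itself an expression that returns a value, but R does not print it unless asked. The name `total` is chosen here purely for illustration.

```r
# Evaluating an expression at the console prints its value.
2 * 2

# An assignment also returns its value, but invisibly: nothing is printed.
total <- 2 ^ 3

# Wrapping the assignment in parentheses makes the value visible.
(total <- 2 ^ 3)

# withVisible() reports the value and whether it would have been printed.
res <- withVisible(total <- 2 ^ 3)
res$visible
```
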
12. Challenge

• Use R to answer the following skill-testing question:

    2 + 16 × 24 − 56 / (2 + 1) − 457

13. R command-line tip

• Use the ▲▼ arrow keys to reproduce previous commands.
• This lets you scroll through your command history.

14. Objects

• You can store values (objects) in symbolic variables (names) using an assignment operator.
• Variable names can include:
  • letters a-z A-Z
  • numbers 0-9
  • periods .
  • underscores _
• Variable names should begin with a letter.

    A <- 10
    B <- 10*10
    A_log <- log(A)
    B.seq <- 1:B

<- assigns the value on the right to the name on the left.

an object with a name of your choice.
16. Retrieve values

• When a variable name is evaluated, it returns the stored value.

    A
    [1] 10
    B
    [1] 100
    A_log
    [1] 2.302585
    x
    [1] 3
    B.seq
    [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    [22] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
    [43] 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
    [64] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
    [85] 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

17. Vectors

• The most basic kind of object in R is a vector.
• Think of a vector as a list of related values (data).
• A single value is an "atomic vector" (vector with a length of 1).

    [1] 2

    1:10
    [1] 1 2 3 4 5 6 7 8 9 10

In the output, the bracketed number is the index (the item number); what follows it is the value (result).

18. Vectors

• You can make a vector using the c() command.
• Vectors can be used in a plot.
• You can access an element of a vector by its index.

    my_fav_nums <- c(1, 4, 10, 444, 42)
    plot(1:5, my_fav_nums)
    my_fav_nums[3]
    [1] 10

19. Vectors

• Vectors can be used in calculations.
• Operations are executed on each item.

    my_fav_nums + 20
    my_fav_nums / 2 + 1
    sqrt(my_fav_nums)
    mean(my_fav_nums)
    sum(my_fav_nums)

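As a rough illustration of the two kinds of calculation on this slide, the sketch below separates element-wise operations (one result per item) from summary functions (one result for the whole vector). The result names `plus_twenty`, `total`, `average` and `doubled` are illustrative only.

```r
my_fav_nums <- c(1, 4, 10, 444, 42)

# Element-wise: 20 is added to each item in turn.
plus_twenty <- my_fav_nums + 20    # 21 24 30 464 62

# Summary functions collapse the whole vector to a single value.
total <- sum(my_fav_nums)          # 501
average <- mean(my_fav_nums)       # 100.2

# Two vectors of the same length are combined item by item.
doubled <- my_fav_nums + my_fav_nums
```
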
20. R command-line tip

• Use the Tab key to auto-complete.
• This helps you avoid spelling errors and speeds up command entry.

21. Use variables in calculations

    A + 5
    [1] 15
    B / A
    [1] 10
    Weight <- c(60, 72, 57, 90, 95, 72)
    Height <- c(1.7, 1.8, 1.6, 1.9, 1.7, 1.9)
    BMI <- Weight / Height^2
    BMI
    [1] 20.76125 22.22222 22.26562 24.93075 32.87197 19.94460
    plot(Height, Weight)

22. Challenge

What is the sum of the squares of all of the integers between 1 and 100?

Hint: remember counting from x to y can be done with x:y.

23. Some other data types

• character (string): in single ' or double " quotes

    'hello world'
    "1.23"

• logical: TRUE or FALSE
• Converting from one type to another = "coercion"

    class(1.23)
    class('hello')
    class("1.23")
    class(FALSE)
    typeof(1.23)
    typeof(1:10)
    as.character(c(1,2,NA,3))

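A minimal sketch of coercion in both directions, assuming only base R. The variable names `mixed` and `bad` are illustrative; the point is that a vector holds a single type, so R converts values either when asked (the `as.*()` functions) or silently when types are mixed.

```r
# Explicit coercion with the as.*() family.
as.numeric("1.23")             # character -> numeric
as.character(c(1, 2, NA, 3))   # NA stays a missing value, not the string "NA"
as.integer(TRUE)               # logical -> integer

# Implicit coercion: mixed input is converted to the most general type present.
mixed <- c(1, "a", TRUE)
class(mixed)

# Coercion that cannot succeed gives NA (R also issues a warning,
# which is suppressed here to keep the output clean).
bad <- suppressWarnings(as.numeric("hello"))
is.na(bad)
```
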
24. Built-in variables

• Some words and letters already have values in R and should never be used as variable names.

    pi
    [1] 3.141593
    letters
    [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j" "k" "l" "m"
    [14] "n" "o" "p" "q" "r" "s" "t" "u" "v" "w" "x" "y" "z"
    LETTERS
    [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M"
    [14] "N" "O" "P" "Q" "R" "S" "T" "U" "V" "W" "X" "Y" "Z"

25. Reserved words

• Some words and letters already have special meaning in the R language (keywords) and should never be used as variable names.

    NA                "Not Available" (unknown or missing data)
    NaN               "Not a Number" (undefined numeric values)
    NULL              a special object (missing objects)
    Inf               Infinity
    TRUE              Logical value
    FALSE             Logical value
    T                 short for TRUE
    F                 short for FALSE
    c, q, t, C, D, I  R functions
    diff, df, pt      R functions

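One reason for the advice above, sketched under the assumption of a fresh R session: TRUE and FALSE cannot be reassigned, but T and F are ordinary variables that merely start out holding those values, so overwriting them changes the behaviour of later code. The name `answer` is illustrative.

```r
# T starts out as TRUE.
isTRUE(T)

# Nothing stops T from being reassigned, which silently changes later code.
T <- 0
answer <- if (T) "yes" else "no"   # now takes the "no" branch

# Removing the masking variable brings the built-in value back.
rm(T)
isTRUE(T)

# Missing and undefined values have their own tests.
is.na(NA)
is.nan(0 / 0)
is.null(NULL)
is.infinite(1 / 0)
```
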
26. Housekeeping

    ls()             List all variables you have created
    rm(x)            Remove the variable 'x' from memory
    rm(list=ls())    Remove all variables from memory (clear memory)

27. Comparisons

• Comparison of 2 values results in logical values: TRUE or FALSE.

    ==    "equal". Note the two equals signs, not to be confused with a single equals sign (used to assign values).
    !=    "not equal"
    >     "greater than"
    <     "less than"
    >=    "greater than or equal to"
    <=    "less than or equal to"
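A short sketch of these operators in use. It also shows two things the table does not spell out and which are added here as general R behaviour rather than taken from the slides: comparisons work element-wise on vectors, and `==` on decimal numbers can surprise because of floating-point rounding, where `all.equal()` is the usual alternative.

```r
# Each comparison returns a logical value.
same <- 3 == 3          # TRUE
different <- 3 != 4     # TRUE

# On a vector, the comparison is made for each item.
big_enough <- c(1, 5, 10) > 4    # FALSE TRUE TRUE

# Decimal arithmetic is not exact, so == can give an unexpected FALSE.
exact <- 0.1 + 0.2 == 0.3
close <- isTRUE(all.equal(0.1 + 0.2, 0.3))
```
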

to π?
29. Errors

• When R is given a command it does not understand, or cannot execute, it outputs an error to the console.

    fail <- 1 + "2"
    Error in 1 + "2" : non-numeric argument to binary operator
    fail
    Error: object 'fail' not found

30. Warnings

• If a command does not work exactly as R (or the developers) think is "ideal", it may produce a warning instead.
• Use the warnings() command to review them.

    oops <- log(-1)
    Warning message:
    In log(-1) : NaNs produced

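The practical difference between the two can be sketched as follows: after a warning the command still completes and the object exists, whereas an error aborts the command. `tryCatch()` is not covered in these slides; it is used here only so the sketch can run to the end and capture the error message in the illustrative variable `result`.

```r
# A warning does not stop execution: the assignment still happens.
oops <- suppressWarnings(log(-1))
is.nan(oops)

# An error aborts the command. tryCatch() intercepts it and
# returns the error message as a character string instead.
result <- tryCatch(
  1 + "2",
  error = function(e) conditionMessage(e)
)
result
```
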
35. Functions

• A function takes in arguments and returns a value.
• To use a function (call), the command must be structured properly, following the "grammar rules" of the R language (syntax).

    log(8, base = 2)

    log         function name
    ( )         parentheses (no space after the function name)
    8           argument 1
    ,           comma
    base = 2    argument 2

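To make the syntax concrete, here is a sketch of the same call written three ways, plus a call that relies on a default. It assumes only the documented signature `log(x, base = exp(1))`; the result names are illustrative.

```r
# Arguments can be matched by name...
by_name <- log(8, base = 2)

# ...by position (first x, then base)...
by_position <- log(8, 2)

# ...or fully named, in which case the order no longer matters.
reordered <- log(base = 2, x = 8)

# Leaving out base uses its default, giving the natural logarithm.
natural <- log(8)
```

All three of the first calls compute the base-2 logarithm of 8, which is 3.
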
36. Arguments

• Arguments are the values passed to a function when it is called.
• Arguments are values and instructions the function needs to do its thing.

Ex:

    x <- 1:10
    y <- sin(x)
    plot(x, y, type = 'l')

Everything inside the parentheses of plot() is an argument.

37. Common & useful functions

    sqrt log exp min max sum mean sd var
    summary plot par paste format
    head length str names typeof class attributes
    library ls rm
    setwd getwd file.choose choose.files (Windows only)
    c seq rep
    tapply aggregate merge cbind rbind unique
    help ? help.search ?? help.start

38. How do I use a new function?

What arguments will it take? What does it do?

Use ?function!

For example:

    ?seq

39. Help page, labelled:

• function name
• package
• long name (title)
• arguments
• details on how the function works

40. Help page, labelled:

• Details
• Value returned
• Publications that describe the function (relevant theory & concepts)
• Examples: copy & paste Examples into the console to see the function in action. Try to modify the example code to do what you want.

You can also use example(seq) in the console to run all the example code in this section.

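As a sketch of what experimenting after reading ?seq might look like, the calls below use the `from`, `to`, `by` and `length.out` arguments documented on that help page. The variable names are illustrative only.

```r
# Step from 1 to 10 in increments of 2.
odds <- seq(from = 1, to = 10, by = 2)

# Ask for a fixed number of evenly spaced values instead of a step size.
quarters <- seq(0, 1, length.out = 5)

# Count downwards with a negative step.
countdown <- seq(10, 1, by = -3)

# args() gives a quick reminder of a function's arguments
# without opening the full help page.
args(seq.default)
```
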
41. Challenge

1) Create an unsorted vector of your favourite numbers.
2) Find out how to sort it using ?sort.
3) Sort your vector in forward and in reverse order.
4) Put your sorted vectors into new objects.

43. HELP: Web Sites

• R web site: r-project.org
• Start here: especially the Documentation section
• r-bloggers.com
• stackoverflow.com
• Google search ...

44. Object types

An object is a way of packaging information in R.

    vector        a combination of values, of the same type
    list          a combination of different types of values (or even objects)
    data frame    a collection of vectors of the same length (# rows); columns = “variables”; rows = “cases”

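The three object types can be sketched side by side. The data below are made up for illustration and are not from the workshop.

```r
# vector: several values of one type
heights <- c(1.7, 1.8, 1.6)

# list: items may differ in type and in length
person <- list(name = "Ada", age = 36, scores = c(90, 85))

# data frame: equal-length vectors side by side,
# one column per variable and one row per case
plants <- data.frame(
  id = c("p1", "p2", "p3"),
  height = heights,
  chilled = c(TRUE, FALSE, TRUE)
)

# str() shows the type of each column at a glance.
str(plants)
```
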
45. Working with a data frame

    data(CO2)          load a built-in data file
    head(CO2)          peek at first few rows
    str(CO2)           structure of the object
    names(CO2)         names of items in the object
    attributes(CO2)    attributes of the object
    summary(CO2)       summary statistics
    plot(CO2)          plot of all variable combinations

46. Indexing

• You can refer to parts of a data frame object by their index or name (if they have one).

    CO2[1:6,3]       object name [ rows (dim. 1) , columns (dim. 2) ]
    CO2$Treatment    object name, $ operator, column name

47. Indexing

    names(CO2)                       available names
    CO2$Treatment                    "Treatment" column
    CO2[,3]                          all rows, column 3
    CO2[3,]                          row 3, all columns
    CO2[1:6,]                        rows 1-6, all columns
    CO2[c(1,2,3,4,5,6),3]            rows 1-6, column 3
    CO2$Treatment[1:6]               elements 1-6 of Treatment
    CO2[CO2$conc>100,]               rows where conc > 100
    CO2[CO2$Treatment=="chilled",]   rows where Treatment == "chilled"
    CO2[sample(nrow(CO2), 10),]      10 random rows

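The row conditions in the table can be combined, and columns can be picked by name instead of number. The sketch below goes slightly beyond the slide by using `&` and `%in%`; the object names `high_quebec` and `extremes` are illustrative.

```r
data(CO2)

# Combine two conditions with & and choose columns by name.
high_quebec <- CO2[CO2$conc > 500 & CO2$Type == "Quebec",
                   c("Plant", "conc", "uptake")]
nrow(high_quebec)

# %in% keeps rows whose value matches any item in a set.
extremes <- CO2[CO2$conc %in% c(95, 1000), ]
nrow(extremes)
```
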
48. Challenge

1) What is the mean uptake of all plants in the non-chilled treatment?
2) What is the variance in uptake for plant 'Mc3'?

49. Control Structures

• R is a full-fledged programming language. It can do loops and conditionals.

    i <- runif(1,0,100)
    if(i <= 50) {
      print('i is pretty small')
    }

    for(i in 1:100) {
      print(i)
    }

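A slightly fuller sketch of the same ideas: an `else` branch for the conditional, a `for` loop that accumulates a result rather than printing, and a `while` loop. The seed value and the variable names are arbitrary choices made for this illustration.

```r
set.seed(1)              # arbitrary seed so the random draw is repeatable
i <- runif(1, 0, 100)

# if / else chooses between two branches.
if (i <= 50) {
  size <- "pretty small"
} else {
  size <- "pretty big"
}

# A for loop can build up a result step by step.
running_total <- 0
for (k in 1:10) {
  running_total <- running_total + k
}
running_total            # 55, the same as sum(1:10)

# A while loop repeats for as long as its condition holds.
countdown <- 3
while (countdown > 0) {
  countdown <- countdown - 1
}
```
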
50. Challenge

1) Print out all the numbers between 1 and 100, but if the number is a multiple of 3, print 'fizz' instead.
2) Extend your code so that for multiples of 5, it prints out 'buzz'.

HINT: To get the remainder of integer division, use %%. Ex:

    9 %% 5
    [1] 4

51. Installing packages

• In addition to all of the base functions in R, you can install additional packages to do specialized statistics and plotting.
• Currently, the CRAN package repository features 4276 available packages.
• http://cran.r-project.org/web/packages/

52. Installing packages

    install.packages('ggplot2')
    library(ggplot2)

The library() function loads the package, making its functions accessible.

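Installing is needed only once per machine, while library() is needed in every session. One possible way to combine the two is sketched below. `load_or_install` is a hypothetical helper written for this example, not a function that ships with R; it is demonstrated on the built-in stats package so that it runs without a network connection.

```r
# Hypothetical helper: install a package only if it is missing, then load it.
load_or_install <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# stats is always installed, so this call only loads it.
load_or_install("stats")

# For a CRAN package the call would look like this:
# load_or_install("ggplot2")
```
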
53. R is a show-off

    demo(graphics)    Some plots and graphs that can be made using R
    demo(image)       images and other graphics made using R
    demo(lm.glm)      a demonstration of linear modelling & GLMs
    demo()            a list of available demos