Getting started with R - QCBS

Corey Chivers
March 22, 2013

Transcript

  1. Introduction to R
    Corey Chivers
    Department of Biology, McGill University
    zerotorhero.wordpress.com

  2. R U Keen?
    Get started with:
    http://www.codeschool.com/courses/try-r

  3. All of the slides, scripts, and data files
    we will be using are available at:
    zerotorhero.wordpress.com

  4. #02Rhero

  5. Outline
    1) Getting started
      R command line
      Assignment, data & vectors
      Using functions
      Getting help
    2) Data and projects in R-Studio
      Creating an R project
      Organizing/cleaning data
      Importing/exporting data
    3) Pizza!
    4) Intro to Plotting
      Reproducible plots
      Basic plotting
      Building complex plots by small steps
    5) A la Carte (time permitting):
      Programming (loops, conditionals & functions)
      Intro to Simulation
      Q & A

  6. Why R ?
    • It's Free ..
    • “as in free beer”
    • “as in free speech”
    • use it for any purpose.
    • give copies to your friends &
    neighbours.
    • improve it and release improvements
    publicly.

  8. [Figure: diagram linking Data, Tables, Graphs, Statistics and
    Understanding, with Sigmaplot, Excel and SAS shown as separate tools]

  9. [Figure: diagram linking Data, Tables, Graphs, Statistics and
    Understanding, with no separate tools shown]

  10. Why does R
    seem so hard to learn?

    R is command-driven

    R will not tell you what to do, nor guide you
    through the steps of an analysis or method.

    R will do all the calculations for you, and
    it will do exactly what you tell it (not necessarily
    what you want).

    R has the flexibility and power to do exactly what
    you want, exactly how you want it done.

  11. Challenges
    Throughout the workshop, you will
    be presented with a series of
    challenges.
    Collaborate with your neighbour
    when the going gets tough!

  12. Challenge 1
    Open R-Studio

  13. The Console

  14. The R Console
    Text in the R console typically looks like this:
    > input      (commands)
    [1] output   (results)
    I'll represent it like this:

  15. R is a calculator
    1 + 1
    [1] 2
    2 * 2
    [1] 4
    2 ^ 3
    [1] 8
    10 - 1
    [1] 9
    8 / 2
    [1] 4
    sqrt(9)
    [1] 3
    •Commands are evaluated, and the result is
    returned (sometimes invisibly).

  16. Challenge
    Use R to answer the following skill
    testing question:
    2 + 16 x 24 – 56 / (2+1) – 457
    Bonus – calculate:
    The area of a circle with radius 5cm?
    The hypotenuse of triangle ABC with:
    • Angle ABC=90⁰ , Angle ACB=45⁰
    • Side AB=5cm
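
    One possible way to work through this challenge at the console, as a
    sketch. The object names (answer, circle_area, hypotenuse) are made up
    for illustration. The last step assumes the right angle is at B, so AC
    is the hypotenuse and AB is the side opposite the 45 degree angle at C.

```r
# Skill-testing question: R follows the usual order of operations
answer <- 2 + 16 * 24 - 56 / (2 + 1) - 457
answer

# Area of a circle with radius 5 cm (pi is built in)
circle_area <- pi * 5^2
circle_area

# Hypotenuse AC: AB is opposite the 45 degree angle at C,
# so AC = AB / sin(45 degrees); sin() expects radians
hypotenuse <- 5 / sin(45 * pi / 180)
hypotenuse
```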

  17. R command-line
    tip
    • Use the ▲▼ arrow
    keys to re-produce
    previous commands
    • This lets you scroll
    through your
    command history

  18. Hey look, a
    suggestion!

  19. R is a show-off
    demo(graphics)   # some plots and graphs that can be made using R
    demo(image)      # images and other graphics made using R
    demo(lm.glm)     # a demonstration of linear modelling & GLMs
    demo()           # a list of available demos
    For even more demos, use:
    demo(package = .packages(all.available = TRUE))

  20. Objects
    You can store values as named objects using the
    assignment operator (<-), with the object name
    on the left and the value on the right.
    Example object names: A, B, A_log, B.seq
    Object names can be (almost)
    anything you choose. They can include:
    Letters a-z, A-Z (case sensitive)
    Numbers 0-9
    Periods .
    Underscores _
    Should begin with a letter
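
    A minimal sketch of assignment with <-. The right-hand sides are
    assumptions, chosen so that the objects hold the values printed on the
    next slide (10, 100, 2.302585, and 1 to 100); the expressions used in
    the workshop are not in the transcript.

```r
A <- 10            # store the value 10 in an object named A
B <- 10 * 10       # the right-hand side is evaluated first, then stored
A_log <- log(A)    # natural logarithm of A
B.seq <- 1:B       # the integers from 1 to B

A_log              # typing a name prints the stored value
length(B.seq)      # how many values B.seq holds
```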

  21. Retrieving the values
    When a variable name is evaluated, it returns the
    stored value.
    A
    [1] 10
    B
    [1] 100
    A_log
    [1] 2.302585
    x
    [1] 3
    B.seq
    [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
    [22] 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
    [43] 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
    [64] 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84
    [85] 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100

  22. Challenge
    Put your answer to the skill testing
    question into an object with a name of
    your choice.

  23. Vectors

    The most basic kind of object in R is a vector

    Think of a vector as a list of related values (data)

    A single value is also a vector (an 'atomic
    vector' of length 1)
    [1] 2
    1:10
    [1] 1 2 3 4 5 6 7 8 9 10
    In the output, the number in brackets is the index
    (item number) of the first value on that line; the
    numbers after it are the values (result).

  24. Vectors

    You can make a vector using the c() command:
    my_fav_nums

    You can plot vectors:
    plot(1:5, my_fav_nums)

    You can access a part of a vector by its index:
    my_fav_nums[3]
    [1] 10
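
    A sketch of the three steps on this slide. The values in my_fav_nums
    are made up, apart from the third one, which is 10 to match the output
    shown on the slide.

```r
# Make a vector with c() (combine); these values are hypothetical
my_fav_nums <- c(7, 42, 10, 3, 12)

# Plot the values against their positions 1 to 5
plot(1:5, my_fav_nums)

# Access a single element by its index (R counts from 1)
my_fav_nums[3]
```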

  25. Vectors

    You can use vectors in calculations:
    my_fav_nums+20
    my_fav_nums/2 +1
    sqrt(my_fav_nums)
    mean(my_fav_nums)
    sum(my_fav_nums)
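
    To make the difference between these calls concrete, here is a small
    sketch with a made-up vector. Arithmetic and sqrt() work element by
    element and return a vector of the same length, while mean() and sum()
    collapse the vector to a single value.

```r
my_fav_nums <- c(7, 42, 10, 3, 12)  # hypothetical values

plus_20 <- my_fav_nums + 20      # adds 20 to every element
halved  <- my_fav_nums / 2 + 1   # halves every element, then adds 1
roots   <- sqrt(my_fav_nums)     # square root of every element

length(plus_20)      # still 5 values
mean(my_fav_nums)    # one value: the average
sum(my_fav_nums)     # one value: the total
```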

  26. Challenge
    What is the sum of the square of all of the
    integers between 1 and 100?
    Hint: remember counting from x to y can be
    done with x:y.
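
    One way to approach this, as a sketch (the object name squares is
    arbitrary). Note the parentheses: ^ is evaluated before :, so leaving
    them out changes the meaning.

```r
squares <- (1:100)^2   # square each integer from 1 to 100
sum(squares)           # add them up

length(1:100^2)        # without parentheses this is 1:(100^2), i.e. 1 to 10000
```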

  27. R command-line
    tip
    •Use the Tab key to auto-
    complete
    •This helps you avoid
    spelling errors and speeds
    up command entering.

  31. Functions

    A function takes in arguments and returns an
    object

    To use a function (call), the command must be
    structured properly, following the "grammar
    rules" of the R language (syntax)
    log( 8 , base = 2 )
    Function name: log
    Parentheses: directly after the function name, no space
    Argument 1: 8
    Argument 2: base = 2
    Arguments are separated by a comma
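
    A short sketch of the same call written a few different ways, to show
    how arguments can be matched by name or by position. The object names
    are only for illustration.

```r
by_name     <- log(8, base = 2)      # second argument given by name
by_position <- log(8, 2)             # same call, base matched by position
all_named   <- log(x = 8, base = 2)  # both arguments named
natural     <- log(8)                # base left out: natural logarithm by default

by_name
natural
```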

  32. Arguments

    Arguments are the values passed to a function
    when it is called

    Arguments are values and instructions the
    function needs to do its thing
    Example
    xyplot(x,y,type='l')
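
    A runnable sketch of the same idea using base R's plot() in place of
    xyplot. The data in x and y are made up; the point is that x and y are
    the values, while type is an instruction about how to draw them.

```r
x <- 1:10     # hypothetical data
y <- x^2

plot(x, y)               # default: draw points
plot(x, y, type = "l")   # type = "l" asks for a line instead
```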

  33. Some common functions

  34. How do I use a new function?
    What arguments will it take?
    What does it do?
    Use ?function !
    For example:
    ?seq
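
    A few equivalent ways to look things up, as a sketch. args() is shown
    with log() simply because its argument list is short; any function
    name would do.

```r
?seq             # open the help page for seq
help("seq")      # the same thing, written as a function call

log_args <- args(log)   # the argument list, without opening the help page
log_args
```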

  36. [Annotated help page]
    Function name and package
    Arguments you can pass to the function
    Detailed information about the function
    Any argument with an = has a default value

  37. ...details
    The value which will be returned
    ...details
    Example use cases. Copy and paste to try it out.
    You can also use example(seq) to run all of the
    code in this section.

  38. Challenge
    1) Create an unsorted vector of
    your favourite numbers.
    2) Find out how to sort it using
    ?sort.
    3) Sort your vector in forward
    and in reverse order.
    4) Put your sorted vectors into
    new objects.
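
    One possible solution, as a sketch. The numbers and object names are
    made up; the decreasing argument is the one described on the ?sort
    help page.

```r
fav <- c(7, 42, 10, 3, 12)                     # hypothetical unsorted vector

fav_sorted   <- sort(fav)                      # ascending is the default
fav_reversed <- sort(fav, decreasing = TRUE)   # largest first

fav_sorted
fav_reversed
```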

  39. Data Frames
    data(CO2)         # load a built-in data file
    head(CO2)         # peek at first few rows
    str(CO2)          # structure of the object
    names(CO2)        # names of items in the object
    attributes(CO2)   # attributes of the object
    summary(CO2)      # summary statistics
    plot(CO2)         # plot of all variable combinations

  40. Indexing
    •You can refer to parts of a data frame
    object by its index or name (if it has one)
    CO2[1:6,3]
    Object name [ rows (dim. 1) , columns (dim. 2) ]
    CO2$Treatment
    Object name $ column name, where $ is the column operator

  41. Indexing
    names(CO2)                       # available named columns
    CO2$Treatment                    # "Treatment" column
    CO2[,3]                          # all rows, column 3
    CO2[3,]                          # row 3, all columns
    CO2[1:6,]                        # rows 1-6, all columns
    CO2[c(1,2,3,4,5,6),3]            # rows 1-6, column 3
    CO2$Treatment[1:6]               # elements 1-6 of Treatment
    CO2[CO2$conc>100,]               # rows where conc > 100
    CO2[CO2$Treatment=="chilled",]   # rows where Treatment == "chilled"
    CO2[sample(nrow(CO2), 10),]      # 10 random rows
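
    A quick sketch for checking that a logical index keeps the rows you
    expect. It uses the built-in CO2 data set; the object names high_conc
    and chilled are arbitrary.

```r
data(CO2)

# The test inside the brackets is TRUE or FALSE for each row;
# only the TRUE rows are kept
high_conc <- CO2[CO2$conc > 100, ]
chilled   <- CO2[CO2$Treatment == "chilled", ]

nrow(CO2)         # all rows
nrow(high_conc)   # fewer rows: those with conc of 100 or less are dropped
nrow(chilled)     # only the chilled plants

all(high_conc$conc > 100)   # TRUE if the filter did its job
```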

  42. Challenge
    1) What is the mean uptake of
    all plants in the non-chilled
    treatment?
    2) What is the variance in
    uptake for plant ‘Mc3’?
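
    One way to approach both questions, as a sketch that combines logical
    indexing with mean() and var(). The object names are made up. Checking
    levels() first is a cheap way to see how the treatment names are
    spelled in the data.

```r
data(CO2)

# 1) Mean uptake of plants in the non-chilled treatment
levels(CO2$Treatment)   # check the exact spelling of the treatment names
nonchilled_mean <- mean(CO2$uptake[CO2$Treatment == "nonchilled"])
nonchilled_mean

# 2) Variance in uptake for plant Mc3
mc3_var <- var(CO2$uptake[CO2$Plant == "Mc3"])
mc3_var
```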

  43. Installing packages
    • In addition to all of the base
    functions in R, you can install
    additional packages to do
    specialized statistics and plotting.
    • Currently, the CRAN package
    repository features 4276 available
    packages.
    • http://cran.r-project.org/web/packages/

  44. Installing packages
    install.packages('ggplot2')
    The library() function loads the package,
    making its functions accessible.
    library(ggplot2)
