
plotting data in the tidyverse with ggplot2

harp
October 15, 2019



Transcript

  1. What is the tidyverse?

    The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures. https://www.tidyverse.org/
  2. Why learn the tidyverse?

    Unlike base R, the design philosophy provides a consistent syntax:
    • The first argument is always the data, which means we can use the pipe: %>%
    • All functions are designed with a consistent grammar in mind with verbs, nouns, adverbs and pronouns.
    The design of harp tries to follow and make use of the tidyverse where possible.
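
As a rough illustration of the "data first" convention, the sketch below uses the small temperature table that appears later in the deck. The summary itself (a mean per station) is only an assumed example and is not taken from the slides.

```r
library(dplyr)

# The temperature table from the deck, typed in by hand
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

# Without the pipe: the calls are nested and read inside-out
nested <- summarise(group_by(data, station), mean_temperature = mean(temperature))

# With the pipe: each verb receives the data as its first argument,
# so the steps read top to bottom
station_means <- data %>%
  group_by(station) %>%
  summarise(mean_temperature = mean(temperature))
```

Both forms should give the same result; the piped version simply makes the order of operations easier to follow.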
  3. Which tidyverse packages will we use?

    • readr: read and write data
    • dplyr: a grammar of data manipulation
    • ggplot2: a grammar of graphics
    • tidyr: tidy "messy" data
    • lubridate: work with date-time data
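
To give a feel for how these packages fit together, here is a minimal sketch that passes a small table through each of them. The "wide" layout and the temporary CSV file are assumptions made for illustration; only the values come from the deck's temperature example.

```r
library(readr)
library(tidyr)
library(dplyr)
library(lubridate)

# Hypothetical "messy" layout: one column per station
wide <- data.frame(
  date = c("2019-10-15", "2019-10-16", "2019-10-17"),
  `Cape Town` = c(22, 19, 24),
  Copenhagen = c(12, 13, 9),
  check.names = FALSE
)

# readr: write the table to a temporary file and read it back
path <- tempfile(fileext = ".csv")
write_csv(wide, path)
raw <- read_csv(
  path,
  col_types = cols(date = col_character(), .default = col_double())
)

data <- raw %>%
  # tidyr: one row per date and station
  pivot_longer(-date, names_to = "station", values_to = "temperature") %>%
  # lubridate: parse the character dates
  mutate(date = ymd(date)) %>%
  # dplyr: order the rows
  arrange(station, date)
```

The result has the three columns date, temperature and station, which is the shape the plotting examples in this deck expect.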
  4. ggplot

    "ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details."
  5. ggplot

    "ggplot2 is a system for declaratively creating graphics, based on The Grammar of Graphics. You provide the data, tell ggplot2 how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details."

    You would begin by giving it a data frame (or tibble!), telling it which columns to map to which aesthetics and then add on layers, scales, facet specifications, coordinate systems and themes.
  6. ggplot - aesthetic mapping

    What is an aesthetic? An aesthetic is some feature of the plot controlled by the data. This could be x-position, y-position, colour, fill colour, point shape, line style, transparency (alpha), group…

    How do we map aesthetics?

    ggplot(data, aes(x = date, y = temperature, colour = station))
  7. ggplot - aesthetic mapping

    date        temperature  station
    2019-10-15  22           Cape Town
    2019-10-16  19           Cape Town
    2019-10-17  24           Cape Town
    2019-10-15  12           Copenhagen
    2019-10-16  13           Copenhagen
    2019-10-17  9            Copenhagen

    ggplot(data, aes(x = date, y = temperature, colour = station))
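
The mapping above can be tried out with a sketch like the following, which types the table in as a base R data frame (a tibble would work equally well). Storing the plot in a variable called `p` is just a convention assumed here.

```r
library(ggplot2)

# The table from the slide as a data frame
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

# Map columns to aesthetics; no layers have been added yet
p <- ggplot(data, aes(x = date, y = temperature, colour = station))

# Printing p at this stage draws the axes only, because the plot
# does not yet know which geometry to use for the data
p
```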
  8. ggplot - adding graphics layers

    What are graphics layers? Once we've mapped the data to aesthetics, we then need to tell ggplot how to add that information to the plot. This is done by adding geometries, or 'geoms'. For the previous plot:

    ggplot(data, aes(x = date, y = temperature, colour = station)) + geom_line()
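
Layers stack, so more than one geom can be added to the same mapping. The sketch below adds points on top of the lines; the choice of `geom_point()` as the second layer is an assumption for illustration.

```r
library(ggplot2)

# The temperature table from the earlier slide
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

# Both layers inherit the aesthetics given in ggplot()
p <- ggplot(data, aes(x = date, y = temperature, colour = station)) +
  geom_line() +   # one line per station, because colour groups the data
  geom_point()    # one point per observation, drawn over the lines
```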
  9. ggplot - adding scales

    Scales can be applied to:
    ✓ colour e.g. scale_colour_manual
    ✓ fill e.g. scale_fill_gradient2
    ✓ axes e.g. scale_y_continuous
    ✓ shape e.g. scale_shape_discrete
    ✓ size e.g. scale_size_continuous
    ✓ linetype e.g. scale_linetype_manual
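
As a small sketch of two of these scales in use, the example below sets the station colours by hand and fixes the y axis. The specific colours, limits and breaks are arbitrary choices made for illustration.

```r
library(ggplot2)

# The temperature table from the earlier slide
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

p <- ggplot(data, aes(x = date, y = temperature, colour = station)) +
  geom_line() +
  # Colour scale: a named vector assigns a colour to each station
  scale_colour_manual(
    values = c("Cape Town" = "darkorange", "Copenhagen" = "steelblue")
  ) +
  # Axis scale: fixed limits and breaks every 10 units
  scale_y_continuous(limits = c(0, 30), breaks = seq(0, 30, by = 10))
```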
  10. ggplot - facets

    What are facets? Using facets, you can map variables (or groups) to different panels in a plot. There are two main methods to achieve this:
    • facet_wrap - wraps the panels based on the value of the faceting variable
    • facet_grid - for 2 variables, maps the panels to a matrix with one variable for columns and the other for rows
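
The difference between the two is easiest to see side by side. The sketch below uses the `mpg` data set that ships with ggplot2, which is not part of this deck and is chosen here only because it has several grouping variables.

```r
library(ggplot2)

base <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point()

# facet_wrap: one panel per vehicle class, wrapped into 3 columns
p_wrap <- base + facet_wrap(vars(class), ncol = 3)

# facet_grid: drive type defines the rows, number of cylinders the columns
p_grid <- base + facet_grid(rows = vars(drv), cols = vars(cyl))
```

With `facet_grid`, combinations that have no data still get an (empty) panel, whereas `facet_wrap` only draws panels for values that occur.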
  11. ggplot - facets

    Returning to our original example, you can plot the temperature time series in a separate panel for each station.

    date        temperature  station
    2019-10-15  22           Cape Town
    2019-10-16  19           Cape Town
    2019-10-17  24           Cape Town
    2019-10-15  12           Copenhagen
    2019-10-16  13           Copenhagen
    2019-10-17  9            Copenhagen

    ggplot(data, aes(x = date, y = temperature)) + geom_line() + facet_wrap(vars(station), ncol = 1)
  12. ggplot - coordinate systems

    Coordinate systems have two main roles:
    1. Combine the two position aesthetics to produce a 2d position on the plot.
    2. In coordination with the faceter, coordinate systems draw axes and panel backgrounds.

    We are mostly working in Cartesian coordinates, so we will concentrate on those. Other systems exist too, such as polar coordinates and map coordinates.
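
A short sketch of swapping the coordinate system on the temperature plot follows. The zoom range is an arbitrary assumption, and the polar version is included only to show that the same layers can be drawn in another system.

```r
library(ggplot2)

# The temperature table from the earlier slide
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

p <- ggplot(data, aes(x = date, y = temperature, colour = station)) +
  geom_line()

# Cartesian coordinates are the default; coord_cartesian() can zoom in
# without discarding the data that falls outside the visible range
p_zoom <- p + coord_cartesian(ylim = c(10, 25))

# The same layers drawn in polar coordinates
p_polar <- p + coord_polar()
```

Zooming with `coord_cartesian()` differs from setting limits on a scale, which removes out-of-range values before the geoms are drawn.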
  13. ggplot - themes

    The theme system in ggplot controls the non-data elements of your plot. There is an enormous number of things that you can control, so it is best to head to the documentation. You can control text sizes, fonts and colours, legend position, background colours, axis lines, axis tick marks and more. There are also a number of complete themes built into ggplot, or available in other packages, such as ggthemes.
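
As a minimal sketch, a complete built-in theme can be combined with individual tweaks. The particular settings below (legend position, text size, grid lines) are arbitrary examples.

```r
library(ggplot2)

# The temperature table from the earlier slide
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

p_themed <- ggplot(data, aes(x = date, y = temperature, colour = station)) +
  geom_line() +
  theme_bw() +                              # a complete built-in theme
  theme(
    legend.position = "bottom",             # move the legend below the plot
    axis.text = element_text(size = 12),    # larger axis tick labels
    panel.grid.minor = element_blank()      # remove minor grid lines
  )
```

The order matters: `theme()` tweaks should come after the complete theme, otherwise the complete theme overwrites them.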
  14. ggplot - labeling

    For labeling, you can add labels for
    • x and y axes
    • Title
    • Subtitle
    • Caption
    • Legends

    This is especially useful when you apply functions to your data before plotting and don't like the axis labels.
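
The sketch below shows the situation described above: the y aesthetic is a computed expression, so the default axis label would be the expression itself. It assumes the example temperatures are in degrees Celsius, which the deck does not state, and all label texts are made up for illustration.

```r
library(ggplot2)

# The temperature table from the earlier slide
data <- data.frame(
  date = as.Date(c("2019-10-15", "2019-10-16", "2019-10-17",
                   "2019-10-15", "2019-10-16", "2019-10-17")),
  temperature = c(22, 19, 24, 12, 13, 9),
  station = rep(c("Cape Town", "Copenhagen"), each = 3)
)

# Converting to Fahrenheit inside aes() (assuming Celsius input) would
# label the y axis "temperature * 9/5 + 32" unless we override it
p <- ggplot(data, aes(x = date, y = temperature * 9 / 5 + 32, colour = station)) +
  geom_line() +
  labs(
    x = "Date",
    y = "Temperature (°F)",
    title = "Daily temperature",
    subtitle = "Cape Town and Copenhagen",
    caption = "Example data from the slides",
    colour = "Station"   # legend title for the colour aesthetic
  )
```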