ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(color = class)) +
  geom_smooth(se = FALSE, method = "loess") +
  labs(
    title = "Fuel efficiency generally ...",
    subtitle = "Two seaters (sports cars) ...",
    caption = "Data from fueleconomy.gov"
  )
Accessed with the labs() function
Slide 7
Slide 7 text
2
Axes
Slide 8
Slide 8 text
No content
Slide 9
Slide 9 text
Stages of visualisation system popularity
1. Someone used it and complained about a bug
2. Someone used it in an academic paper
3. Someone used it in a newspaper
4. Someone used it to commit academic fraud
5. So many people use it that Google has autocompletes for bad graphics ideas
Slide 10
Slide 10 text
No content
Slide 11
Slide 11 text
No content
Slide 12
Slide 12 text
No content
Slide 13
Slide 13 text
Isenberg, Petra, et al. "A study on dual-scale data charts." IEEE Transactions on
Visualization and Computer Graphics 17.12 (2011): 2469-2478.
https://www.lri.fr/~isenberg/publications/papers/Isenberg_2011_ASO.pdf
Two differences between a factor and a string:
1. Fixed set of possible values
2. Arbitrary order
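A minimal base R sketch of both differences. The month data are illustrative and do not come from the slides.

```r
# Illustrative month abbreviations (an assumption, not from the talk).
month_levels <- c("Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
x_chr <- c("Dec", "Apr", "Jan", "Mar")
x_fct <- factor(x_chr, levels = month_levels)

sort(x_chr)  # strings sort alphabetically
sort(x_fct)  # factors sort by level order: Jan, Mar, Apr, Dec

# A value outside the fixed set of levels becomes NA.
typo <- factor("Jam", levels = month_levels)
is.na(typo)
```

The level order is whatever you declare it to be, which is why reordering levels changes how a plot is laid out.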
Slide 25
Slide 25 text
relig <- gss_cat %>%
  group_by(relig) %>%
  summarise(
    tvhours = mean(tvhours, na.rm = TRUE),
    n = n()
  )
Some data from the general social survey
Slide 26
Slide 26 text
[Figure: dot plot of mean tvhours (x-axis, about 2 to 4) against relig (y-axis), with the 15 religions in their default factor-level order, from "No answer" to "Protestant"]
Slide 27
Slide 27 text
[Figure: the same dot plot of tvhours (x-axis, about 2 to 4) against relig, with the religions ordered by tvhours, from "Other eastern" (lowest) to "Don't know" (highest)]
fct_reorder(relig, tvhours)
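The slide shows only the fct_reorder() call. A sketch of how it might sit inside the plotting code, assuming a simple geom_point() dot plot and the gss_cat data shipped with forcats:

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Mean TV hours per religion, as on the earlier slide.
relig <- gss_cat %>%
  group_by(relig) %>%
  summarise(
    tvhours = mean(tvhours, na.rm = TRUE),
    n = n()
  )

# Default level order: hard to see any pattern.
p_default <- ggplot(relig, aes(tvhours, relig)) +
  geom_point()

# Levels reordered by tvhours: the y-axis now runs from lowest to highest.
p_ordered <- ggplot(relig, aes(tvhours, fct_reorder(relig, tvhours))) +
  geom_point()
```

fct_reorder() changes only the order of the levels, not the underlying values, so the same data are drawn in both plots.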
Slide 28
Slide 28 text
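by_age <- gss_cat %>%
  filter(!is.na(age)) %>%
  # count() on a grouped data frame keeps that grouping in current dplyr,
  # so count first, then group by age alone to get within-age proportions
  count(age, marital) %>%
  group_by(age) %>%
  mutate(prop = n / sum(n))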
You have the same problem with more dimensions
Slide 29
Slide 29 text
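[Figure: line plot of prop (y-axis, 0.00 to 1.00) against age (x-axis, 20 to 80), one line per marital status. Legend "marital" in default level order: No answer, Never married, Separated, Divorced, Widowed, Married]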
Slide 30
Slide 30 text
[Figure: the same line plot of prop (y-axis, 0.00 to 1.00) against age (x-axis, 20 to 80). Legend "marital" reordered: Widowed, Married, Divorced, Never married, No answer, Separated]
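The slides do not show the plotting code for these two figures. A sketch that would produce a similar pair, assuming the second legend was reordered with forcats::fct_reorder2() (an assumption; the talk does not name the function):

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Proportion of each marital status within each age.
by_age <- gss_cat %>%
  filter(!is.na(age)) %>%
  count(age, marital) %>%
  group_by(age) %>%
  mutate(prop = n / sum(n))

# Legend in default level order: it does not line up with the lines.
p_default <- ggplot(by_age, aes(age, prop, colour = marital)) +
  geom_line()

# fct_reorder2() orders levels by the y value at the largest x,
# so the legend follows the order of the lines at the right edge.
p_reordered <- ggplot(
  by_age,
  aes(age, prop, colour = fct_reorder2(marital, age, prop))
) +
  geom_line() +
  labs(colour = "marital")
```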
Slide 31
Slide 31 text
5
Missing values
Slide 32
Slide 32 text
An explicit missing value (NA)
is the presence of an absence;
an implicit missing value is the
absence of a presence.
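The demo itself is not captured in the slides. As a small sketch of the distinction, using made-up data and tidyr::complete() to turn implicit missing values into explicit ones:

```r
library(tidyr)

# Made-up quarterly returns (illustrative only).
# 2015 Q4 is explicitly missing (NA); 2016 Q1 is implicitly missing (no row).
stocks <- data.frame(
  year   = c(2015, 2015, 2015, 2015, 2016, 2016, 2016),
  qtr    = c(1, 2, 3, 4, 2, 3, 4),
  return = c(1.88, 0.59, 0.35, NA, 0.92, 0.17, 2.66)
)

# complete() adds a row for every year/qtr combination,
# filling the newly created cells with NA.
stocks_complete <- complete(stocks, year, qtr)
stocks_complete
```

After complete(), both gaps appear as NA rows, so they show up in summaries and plots instead of silently vanishing.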
Slide 33
Slide 33 text
Demo
Slide 34
Slide 34 text
6
Histograms
Slide 35
Slide 35 text
hist(1:4)
Slide 36
Slide 36 text
df <- tibble(x = 1:4)
df %>%
  ggplot(aes(x)) +
  geom_histogram(binwidth = 1)
Equivalent ggplot2 code is a little longer
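The two calls bin the same data differently, which is presumably why they are shown side by side. A sketch that inspects the bins without drawing anything, using hist(plot = FALSE) and ggplot2::layer_data():

```r
library(ggplot2)

# Base R: default breaks for 1:4 fall on the data values themselves,
# so 1 and 2 land in the same (closed) first bin.
h <- hist(1:4, plot = FALSE)
h$breaks
h$counts

# ggplot2: with binwidth = 1 the default bins are centred on the integers,
# so each value gets its own bin.
p <- ggplot(data.frame(x = 1:4), aes(x)) +
  geom_histogram(binwidth = 1)
bins <- layer_data(p)
bins[, c("xmin", "xmax", "count")]
```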
[Figure: bar chart of count (y-axis, 0 to 60) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
ggplot(mpg, aes(class)) +
  geom_bar(colour = "white")
Slide 44
Slide 44 text
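[Figure: bar chart of count (y-axis, 0 to 60) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
# `id` is not a column of ggplot2::mpg; presumably a per-row identifier added on an earlier slide
ggplot(mpg, aes(class, group = id)) +
  geom_bar(col = "white")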
Slide 45
Slide 45 text
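[Figure: bar chart of count (y-axis, 0 to 60) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv), filled by drv (legend: 4, f, r)]
# `id` is not a column of ggplot2::mpg; presumably a per-row identifier added on an earlier slide
ggplot(mpg, aes(class, group = id, fill = drv)) +
  geom_bar(col = "white")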
Slide 46
Slide 46 text
[Figure: bar chart of count (y-axis, 0 to 60) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv), filled by drv (legend: 4, f, r)]
ggplot(mpg, aes(class, fill = drv)) +
  geom_bar(col = "white")
Slide 47
Slide 47 text
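class_mpg <- mpg %>%
  group_by(class) %>%
  summarise(
    mean = mean(hwy),
    # despite the name, this is 1.96 standard errors:
    # the half-width of an approximate 95% confidence interval
    se = 1.96 * sd(hwy) / sqrt(n())
  )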
Another type of bar chart displays summaries
Slide 48
Slide 48 text
[Figure: bar chart of mean (y-axis, 0 to 20) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
ggplot(class_mpg, aes(class, mean)) +
  geom_bar(stat = "identity")
Slide 49
Slide 49 text
[Figure: bar chart of mean (y-axis, 0 to 20) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
ggplot(class_mpg, aes(class, mean)) +
  geom_col() # Thanks to Bob Rudis
Slide 50
Slide 50 text
[Figure: dot plot of mean (y-axis, 20 to 28) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
Slide 51
Slide 51 text
[Figure: dot plot of mean (y-axis, 15 to 30) by class (x-axis: 2seater, compact, midsize, minivan, pickup, subcompact, suv)]
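The code behind the two dot plots is not captured in the slides. One way to draw class means as points, optionally with the interval computed in class_mpg, is sketched below; the use of geom_pointrange() for the wider-ranged plot is an assumption, not something the talk confirms.

```r
library(dplyr)
library(ggplot2)

# Per-class mean and interval half-width, as on the earlier slide.
class_mpg <- mpg %>%
  group_by(class) %>%
  summarise(
    mean = mean(hwy),
    se = 1.96 * sd(hwy) / sqrt(n())
  )

# Points only: unlike bars, the y-axis does not need to start at zero.
p_points <- ggplot(class_mpg, aes(class, mean)) +
  geom_point()

# Points with an interval of mean +/- se (hypothetical reconstruction).
p_range <- ggplot(class_mpg, aes(class, mean)) +
  geom_pointrange(aes(ymin = mean - se, ymax = mean + se))
```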
Slide 52
Slide 52 text
8
ggplot2 extension
Slide 53
Slide 53 text
ggplot2 2.0.0 introduced a formal extension mechanism
https://www.ggplot2-exts.org, by Daniel Emaasit
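To give a flavour of the extension mechanism, here is a small sketch of a custom stat built with ggproto(), closely following the convex hull example in the ggplot2 "Extending ggplot2" vignette. The names StatChull and stat_chull are that vignette's illustration, not part of ggplot2 itself.

```r
library(ggplot2)

# A stat that reduces each group to the points on its convex hull.
StatChull <- ggproto("StatChull", Stat,
  compute_group = function(data, scales) {
    data[chull(data$x, data$y), , drop = FALSE]
  },
  required_aes = c("x", "y")
)

# Layer constructor, mirroring the signature of the built-in stat_*() functions.
stat_chull <- function(mapping = NULL, data = NULL, geom = "polygon",
                       position = "identity", na.rm = FALSE,
                       show.legend = NA, inherit.aes = TRUE, ...) {
  layer(
    stat = StatChull, data = data, mapping = mapping, geom = geom,
    position = position, show.legend = show.legend,
    inherit.aes = inherit.aes, params = list(na.rm = na.rm, ...)
  )
}

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  stat_chull(fill = NA, colour = "black")
```

Packages such as those listed on the following slides build on the same ggproto objects to add their own geoms and stats.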
Slide 54
Slide 54 text
ggraph, by Thomas Lin Pedersen
https://github.com/thomasp85/ggraph
Slide 55
Slide 55 text
ggseas by Peter Ellis
https://github.com/ellisp/ggseas
Uses X-13ARIMA-SEATS
via the seasonal package
Slide 56
Slide 56 text
gganimate by David Robinson
https://github.com/dgrtwo/gganimate
Slide 57
Slide 57 text
Conclusion
Slide 58
Slide 58 text
1
Labelling plots
Solved by Bob Rudis
A problem ignored for too long
Slide 59
Slide 59 text
2
Axes
Slide 60
Slide 60 text
3
Labelling data
Slide 61
Slide 61 text
No content
Slide 62
Slide 62 text
5
Missing values
Slide 63
Slide 63 text
6
Histograms
Slide 64
Slide 64 text
7
Bar charts
Slide 65
Slide 65 text
8
ggplot2 extension
Slide 66
Slide 66 text
Many of the features I discussed here have been added in recent versions of ggplot2.
See the release notes for more detail.