Tokyo.R#82 Data visualization

kilometer
October 26, 2019

Slides from my talk at the 82nd Tokyo.R.

Transcript

  1. 2019.01.19 Tokyo.R #75 BeginneR Session – Data pipeline

    2019.03.02 Tokyo.R #76 BeginneR Session – Data pipeline
    2019.04.13 Tokyo.R #77 BeginneR Session – Data analysis
    2019.05.25 Tokyo.R #78 BeginneR Session – Data analysis
    2019.06.29 Tokyo.R #79 BeginneR Session – Basics of probability
    2019.07.27 Tokyo.R #80 R Interface to Python
    2019.09.29 Tokyo.R #81 IntRoduction & DemonstRation
  2. BeginneR Advanced Hoxo_m

    "If I have seen further it is by standing on the shoulders of Giants." -- Sir Isaac Newton, 1676
  3. [Diagram: Text vs. Image]

    Text: "First, A. Next, B. Then C. Finally D." (ordered along time)
    Image: intention, encoded into a "frozen" structure (labels: A, B, C, D; time; value; α, β)
  4. "Graphs in Statistical Analysis", Anscombe, F.J., American Statistician 27 (1), 1973

    "Most textbooks on statistical methods, and most statistical computer programs, pay too little attention to graphs."
  5. "Graphs in Statistical Analysis", Anscombe, F.J., American Statistician 27 (1), 1973

    "Good statistical analysis ... should be sensitive both to peculiar features in given numbers and also whatever background information is available about the variables."
    Terms glossed on the slide: peculiar, features, variables.
  6. library(tidyverse)
     anscombe %>%
       rowid_to_column("obs") %>%
       gather(key, val, -obs) %>%
       separate(key, into = c("xy", "No"), sep = 1L) %>%
       spread(xy, val) %>%
       select(No, obs, x, y) %>%
       arrange(No) -> dat
  7. dat %>%
       group_nest(No) %>%
       mutate(mean_x = map_dbl(data, ~mean(.$x)),
              mean_y = map_dbl(data, ~mean(.$y)),
              sd_x = map_dbl(data, ~sd(.$x)),
              sd_y = map_dbl(data, ~sd(.$y))) %>%
       mutate(model_lm = map(data, ~lm(y ~ x, data = .)),
              rsq = map_dbl(model_lm, ~summary(.) %>% .$r.squared),
              cor = map_dbl(data, ~cor(.$x, .$y)))
  8. (1) Create ggplot object with mapping:
     g <- ggplot(data = dat, aes(x = x, y = y))

     (2) Add geom_ layers:
     g <- g +
       geom_smooth(method = "lm", se = FALSE) +
       geom_point()

     (3) Set options:
     g <- g +
       facet_wrap(facets = ~ No, ncol = 4)
  9. g <- ggplot(data = dat, aes(x = x, y = y)) +
       geom_smooth(method = "lm", se = FALSE) +
       geom_point() +
       facet_wrap(facets = ~ No, ncol = 4) +
       theme_bw()
     ggsave("fig.png", g)
  10. install.packages("datasauRus")

    https://github.com/lockedata/datasauRus
    "Download the Datasaurus: Never trust summary statistics alone; always visualize your data"
    http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html
  11. "Graphs in Statistical Analysis", Anscombe, F.J., American Statistician 27 (1), 1973

    "Good statistical analysis ... should be sensitive both to peculiar features in given numbers and also whatever background information is available about the variables."
    Terms glossed on the slide: peculiar, features, variables.
  12. [Diagram: Text vs. Image]

    Text: "First, A. Next, B. Then C. Finally D." (ordered along time)
    Image: intention, encoded into a "frozen" structure (labels: A, B, C, D; time; value; α, β)
  13. (1) Create ggplot object with mapping:
      g <- ggplot(data = dat, aes(x = x, y = y))

      (2) Add geom_ layers:
      g <- g +
        geom_smooth(method = "lm", se = FALSE) +
        geom_point()

      (3) Set options:
      g <- g +
        facet_wrap(facets = ~ No, ncol = 4)
  14. g <- ggplot(data = dat, aes(x = x, y = y)) +
        geom_smooth(method = "lm", se = FALSE) +
        geom_point() +
        facet_wrap(facets = ~ No, ncol = 4) +
        theme_bw()
      ggsave("fig.png", g)
  15. "Graphs in Statistical Analysis", Anscombe, F.J., American Statistician 27 (1), 1973

    "Unfortunately, most persons who have recourse to a computer for statistical analysis of data are not much interested either in computer programming or in statistical method"
  16. "Graphs in Statistical Analysis", Anscombe, F.J., American Statistician 27 (1), 1973

    "Unfortunately, most persons who have recourse to a computer for statistical analysis of data are not much interested either in computer programming or in statistical method ..." It's time that was changed.
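
The per-set summary statistics computed with tidyverse tools in slide 7 can also be cross-checked in base R alone. The following is a minimal sketch, not code from the deck: it assumes only the `anscombe` data frame that ships with R's built-in datasets package, and the names `summarise_set` and `stats_by_set` are hypothetical, chosen for this illustration.

```r
# Base-R sketch: summary statistics for each of Anscombe's four data sets.
# `anscombe` has columns x1..x4 and y1..y4, one x/y pair per set.
# `summarise_set` is a hypothetical helper name, not part of the deck.
summarise_set <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)  # simple linear regression of y on x
  data.frame(
    No     = i,
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    cor    = cor(x, y),
    rsq    = summary(fit)$r.squared
  )
}

# One row per data set, stacked into a single data frame
stats_by_set <- do.call(rbind, lapply(1:4, summarise_set))
print(round(stats_by_set, 2))
```

The four rows should come out nearly identical, which is why the faceted scatter plots in slides 8 and 9 are needed to tell the sets apart.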