Slide 1

Slide 1 text

BeginneR Session - Data Visualization - #82 Tokyo.R 2019.10.25 @kilometer00

Slide 2

Slide 2 text

Who!?

Slide 3

Slide 3 text

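Who!?
Name: Mimura (三村) @kilometer
Occupation: Postdoc (Doctor of Engineering)
Specialty: behavioral neuroscience (primates), brain imaging, medical systems engineering
R experience: about 10 years
Current trend: jet lag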

Slide 4

Slide 4 text

2019.01.19 Tokyo.R #75 BeginneR Session – Data pipeline
2019.03.02 Tokyo.R #76 BeginneR Session – Data pipeline
2019.04.13 Tokyo.R #77 BeginneR Session – Data analysis
2019.05.25 Tokyo.R #78 BeginneR Session – Data analysis
2019.06.29 Tokyo.R #79 BeginneR Session – Basics of probability
2019.07.27 Tokyo.R #80 R Interface to Python
2019.09.29 Tokyo.R #81 IntRoduction & DemonstRation

Slide 5

Slide 5 text

Before After BeginneR Session BeginneR BeginneR

Slide 6

Slide 6 text

BeginneR Advanced Hoxo_m If I have seen further it is by standing on the shoulders of Giants. -- Sir Isaac Newton, 1676

Slide 7

Slide 7 text

BeginneR Session - Data Visualization -

Slide 8

Slide 8 text

Cave art Hieroglyphs

Slide 9

Slide 9 text

Words Figures

Slide 10

Slide 10 text

Words Beyond words Figures =

Slide 11

Slide 11 text

Text Image Information Intention Data decode encode feedback

Slide 12

Slide 12 text

Text Image First, A. Next, B. Then C. Finally D. time Intention encode "Frozen" structure A B C D time value α β

Slide 13

Slide 13 text

https://en.wikipedia.org/wiki/Frank_Anscombe

Slide 14

Slide 14 text

Francis Anscombe https://en.wikipedia.org/wiki/Frank_Anscombe

Slide 15

Slide 15 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973

Slide 16

Slide 16 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973 Most text books on statistical methods, and most statistical computer programs, pay too little attention to graphs.

Slide 17

Slide 17 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973
Good statistical analysis ... should be sensitive both to peculiar features in given numbers and also whatever background information is available about the variables.
[Japanese glosses on the slide: peculiar = 特異的な, features = 特徴量, variables = 変数]

Slide 18

Slide 18 text

No content

Slide 19

Slide 19 text

No content

Slide 20

Slide 20 text

x y ?

Slide 21

Slide 21 text

No content

Slide 22

Slide 22 text

Wide Long

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

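library(tidyverse)

# Reshape the wide anscombe data (x1..x4, y1..y4) into long format
anscombe %>%
  rowid_to_column("obs") %>%                         # keep the row number as an observation id
  gather(key, val, -obs) %>%                         # wide -> long: one row per (obs, column)
  separate(key, into = c("xy", "No"), sep = 1L) %>%  # split "x1" into "x" and "1"
  spread(xy, val) %>%                                # put x and y back into separate columns
  select(No, obs, x, y) %>%
  arrange(No) -> dat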
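The same wide-to-long reshaping can also be sketched with pivot_longer(), which supersedes gather() and spread() in tidyr 1.0.0 and later. This is an alternative under that version assumption, not the code shown in the talk; the name dat2 is made up here so that it does not overwrite dat.

```r
library(dplyr)
library(tidyr)
library(tibble)

# ".value" tells pivot_longer() that the first captured character ("x" or "y")
# names an output column, while the second ("1".."4") goes into `No`.
dat2 <- anscombe %>%
  rowid_to_column("obs") %>%
  pivot_longer(
    cols = -obs,
    names_to = c(".value", "No"),
    names_pattern = "(.)(.)"
  ) %>%
  select(No, obs, x, y) %>%
  arrange(No, obs)
```

The result has one row per observation and data set, with x and y as ordinary columns, which is the shape the plotting code on the later slides expects.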

Slide 25

Slide 25 text

Spark joy!!

Slide 26

Slide 26 text

No content

Slide 27

Slide 27 text

dat %>%
  group_nest(No) %>%
  # summary statistics for each data set
  mutate(mean_x = map_dbl(data, ~ mean(.$x)),
         mean_y = map_dbl(data, ~ mean(.$y)),
         sd_x = map_dbl(data, ~ sd(.$x)),
         sd_y = map_dbl(data, ~ sd(.$y))) %>%
  # linear model, R squared, and correlation for each data set
  mutate(model_lm = map(data, ~ lm(y ~ x, data = .)),
         rsq = map_dbl(model_lm, ~ summary(.)$r.squared),
         cor = map_dbl(data, ~ cor(.$x, .$y)))
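As a cross-check, the same per-set statistics can be sketched in base R, working directly on the built-in anscombe data frame. The name stats is hypothetical, and rsq is computed as the squared correlation, which equals R squared only because the model has a single predictor.

```r
# Cross-check in base R, using only the built-in anscombe data frame.
# `stats` is a hypothetical name for this sketch.
stats <- do.call(rbind, lapply(1:4, function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  data.frame(
    No     = i,
    mean_x = mean(x),
    mean_y = mean(y),
    sd_x   = sd(x),
    sd_y   = sd(y),
    cor    = cor(x, y),
    rsq    = cor(x, y)^2  # for a one-predictor lm, R squared equals cor squared
  )
}))

print(round(stats, 2))
```

The four rows come out nearly identical, which is the point of the quartet: these numbers alone cannot tell the data sets apart.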

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

No content

Slide 30

Slide 30 text

g <- ggplot(data = dat, mapping = aes(x = x, y = y))
[Diagram labels: data, mapping, x, y]

Slide 31

Slide 31 text

1. Create ggplot object with mapping
g <- ggplot(data = dat, aes(x = x, y = y))
2. Add geom_layers
g <- g +
  geom_smooth(method = "lm", se = FALSE) +
  geom_point()
3. Set options
g <- g +
  facet_wrap(facets = ~ No, ncol = 4)

Slide 32

Slide 32 text

g <- ggplot(data = dat, aes(x = x, y = y)) +
  geom_smooth(method = "lm", se = FALSE) +   # fitted regression line, no confidence band
  geom_point() +
  facet_wrap(facets = ~ No, ncol = 4) +      # one panel per data set
  theme_bw()

ggsave("fig.png", g)
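geom_smooth(method = "lm") draws a least-squares line in each facet. A minimal sketch, using base R only and the hypothetical name coefs, estimates those lines directly with lm() so the numbers can be set against the picture:

```r
# Fit y ~ x separately for each of the four sets (base R only).
coefs <- t(sapply(1:4, function(i) {
  fit <- lm(anscombe[[paste0("y", i)]] ~ anscombe[[paste0("x", i)]])
  setNames(coef(fit), c("intercept", "slope"))
}))
rownames(coefs) <- paste0("No", 1:4)

print(round(coefs, 2))
```

Intercept and slope are almost the same in all four sets, so the fitted lines in the panels look alike while the point clouds around them differ.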

Slide 33

Slide 33 text

Anscombe’s quartet

Slide 34

Slide 34 text

install.packages("datasauRus")
https://github.com/lockedata/datasauRus

Download the Datasaurus: Never trust summary statistics alone; always visualize your data
http://www.thefunctionalart.com/2016/08/download-datasaurus-never-trust-summary.html
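The same check can be sketched for the Datasaurus data. This assumes the datasauRus package is installed and that its datasaurus_dozen data frame has the columns dataset, x, and y; the name dozen_stats is made up for this sketch, and the block is skipped when the package is missing.

```r
# Assumes the datasauRus package is installed; skipped otherwise.
dozen_stats <- NULL
if (requireNamespace("datasauRus", quietly = TRUE)) {
  dozen <- datasauRus::datasaurus_dozen  # assumed columns: dataset, x, y
  # mean and sd of x and y for each data set
  dozen_stats <- aggregate(
    cbind(x, y) ~ dataset,
    data = dozen,
    FUN = function(v) c(mean = mean(v), sd = sd(v))
  )
  print(dozen_stats)
}
```

Comparing this table with a faceted scatter plot of the same data follows the advice in the article title: look at the figure, not only the summary.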

Slide 35

Slide 35 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973
Good statistical analysis ... should be sensitive both to peculiar features in given numbers and also whatever background information is available about the variables.
[Japanese glosses on the slide: peculiar = 特異的な, features = 特徴量, variables = 変数]

Slide 36

Slide 36 text

Summary...

Slide 37

Slide 37 text

Words Beyond words Figures =

Slide 38

Slide 38 text

Text Image Information Intention Data decode encode feedback

Slide 39

Slide 39 text

Text Image First, A. Next, B. Then C. Finally D. time Intention encode "Frozen" structure A B C D time value α β

Slide 40

Slide 40 text

x y ? mapping data

Slide 41

Slide 41 text

Wide Long

Slide 42

Slide 42 text

No content

Slide 43

Slide 43 text

g <- ggplot(data = dat, mapping = aes(x = x, y = y))
[Diagram labels: data, mapping, x, y]

Slide 44

Slide 44 text

1. Create ggplot object with mapping
g <- ggplot(data = dat, aes(x = x, y = y))
2. Add geom_layers
g <- g +
  geom_smooth(method = "lm", se = FALSE) +
  geom_point()
3. Set options
g <- g +
  facet_wrap(facets = ~ No, ncol = 4)

Slide 45

Slide 45 text

g <- ggplot(data = dat, aes(x = x, y = y)) +
  geom_smooth(method = "lm", se = FALSE) +   # fitted regression line, no confidence band
  geom_point() +
  facet_wrap(facets = ~ No, ncol = 4) +      # one panel per data set
  theme_bw()

ggsave("fig.png", g)

Slide 46

Slide 46 text

Anscombe’s quartet

Slide 47

Slide 47 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973
Unfortunately, most persons who have recourse to a computer for statistical analysis of data are not much interested either in computer programming or in statistical method

Slide 48

Slide 48 text

"Graphs in Statistical Analysis" Anscombe, F.J. American Statistician 27 (1), 1973
Unfortunately, most persons who have recourse to a computer for statistical analysis of data are not much interested either in computer programming or in statistical method ... It's time that was changed.

Slide 49

Slide 49 text

Enjoy!!

Slide 50

Slide 50 text

bar dradra