tidyverse tutorial 1

kur0cky
September 20, 2019

An absolute beginner's introduction to the tidyverse
Lecture material

Transcript

  1. Data Analysis and Preprocessing
    Yutaka Kuroki (黒木裕鷹), master's student
    @ed.tus.ac.jp

  2. Contents
    What is data analysis?
    R and RStudio
    R basics
    Modern data frame manipulation

  3. Data Used Today
    starwars
    • Data on the characters of Star Wars (https://swapi.co)

    flights
    • On-time data for every flight that departed LGA, JFK, or EWR in 2013
    weather
    • Hourly weather and wind information for LGA, JFK, and EWR

    airlines
    • A table of airlines

  4. What Is Data Analysis?

  5. What Is Data Analysis?
    Understand
    Import
    Output / Share

  6. What Is Data Analysis?
    Import
    Output / Share
    Visualize
    Modelling
    Interpret
    Transform

  7. Tools
    There are many
    • Calculator
    • Microsoft Excel
    • IBM SPSS
    • SAS
    • Python
    • R

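  8. Tools
    Choose whichever you like
    • Personal taste, popularity, salary, having someone nearby who is good at it, etc.

                 Big Data   Machine learning   Statistical analysis   Data handling
    SQL             ◎              ✕                    ✕                   ○
    R               △              ○                    ◎                   ◎
    Python          △              ◎                    ○                   ○
    Excel           ✕              ✕                    ✕                   ✕
    Calculator      ✕              ✕                    ✕                   ✕
    (◎ excellent, ○ good, △ fair, ✕ poor)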

  9. R and RStudio

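  10. R
    I will introduce R because it is the tool I know best
    • An interpreted language (as opposed to a compiled language)

    Strengths
    • Open-source software (OSS)
    • A wide range of libraries for statistical analysis and machine learning
    • The presence of RStudio, Inc.
    Weaknesses
    • Slow (but fast when the internals are written in C or similar)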
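
The remark about speed can be illustrated with a small sketch. It computes the same sum of squares twice: once with an R-level loop, and once with vectorised calls whose looping happens inside compiled code. The vector `x` and the variable names are made up for this illustration; wrapping each version in `system.time()` is one way to compare them on your own machine.

```r
# Sum of squares computed two ways; both give the same answer.
x <- 1:100000

# (1) Explicit loop: every iteration is evaluated by the R interpreter.
total_loop <- 0
for (v in x) {
  total_loop <- total_loop + v^2
}

# (2) Vectorised calls: `^` and sum() loop internally in compiled code.
total_vec <- sum(x^2)

all.equal(total_loop, total_vec)   # TRUE
```

The vectorised form is usually both shorter and faster, which is why idiomatic R code avoids element-by-element loops where a vectorised function exists.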

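  11. Let's Use RStudio
    • An integrated development environment (IDE) specialised for R

    • Not only an excellent editor, but also offers a variety of extensions
    • Plenty of keyboard shortcuts
    • Git integration
    • Free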

  12. The RStudio Screen
    Editor
    Console
    Objects, history, etc.
    Plots, help, etc.

  13. Let's Start It Up
    • Launch RStudio
        • Open the distributed Exercise.Rproj
    • Try running R
        • In the console: plot(iris)
    • Try installing a package (note that this takes a while)
        • install.packages("tidyverse")
    • Create a script
        • File > New File > R Script

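  14. What Is an R Project?
    • A mechanism for managing analyses project by project
    • Keeps together all the files needed for one series of analyses
    • The project folder becomes the working directory
    • Easy to share with other people
    • Create one with File > New Project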

  15. Configure Your Environment
    • Open the settings from Tools > Global Options
    • General: uncheck everything
        • "Restore previously open source documents at startup" may be left checked
    • Code > Saving: set the encoding to UTF-8
    • Appearance: choose whatever look you like

  16. R Basics

  17. Basics
    • for statements, if statements, functions, and so on
    • These differ little from other languages, so look them up when you need them
    • Indexing starts at 1!
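
As a quick reference for the constructs named above, here is a minimal sketch of a function, an if statement, a for loop, and 1-based indexing. The function name `classify` is made up for this illustration.

```r
# A function containing an if/else; the last evaluated value is returned.
classify <- function(n) {
  if (n %% 2 == 0) {
    "even"
  } else {
    "odd"
  }
}

# A for loop over the integers 1, 2, 3.
for (i in 1:3) {
  print(paste(i, "is", classify(i)))
}

# Indexing starts at 1, not 0.
drink <- c("beer", "sake", "whisky")
drink[1]   # "beer"
drink[0]   # character(0): index 0 selects nothing
```

Note that `drink[0]` does not raise an error; it silently returns an empty vector, which can hide off-by-one mistakes for people coming from 0-based languages.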

  18. Data Types in R
    Single values
    • Logical (Boolean)
    • Integer
    • Double
    • Complex
    • Character
    • Factor
    • etc.
    Multiple values
    • Atomic Vector
    • Matrix
    • Data Frame
    • List
    • etc.
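
The types listed above can be inspected directly in the console. The following is a small sketch using base R only; the example values, including the `size` factor, are invented for illustration.

```r
# Single values: typeof() reports the underlying storage type.
typeof(TRUE)       # "logical"
typeof(1L)         # "integer" (the L suffix makes an integer literal)
typeof(1.5)        # "double"
typeof(1 + 2i)     # "complex"
typeof("beer")     # "character"

# A factor is a categorical value stored as integer codes plus labels.
size <- factor(c("small", "large", "small"))
class(size)        # "factor"
levels(size)       # "large" "small"

# Containers for multiple values.
is.atomic(c(1, 2, 3))                # TRUE
is.matrix(matrix(1:4, nrow = 2))     # TRUE
is.data.frame(data.frame(x = 1:2))   # TRUE
is.list(list(1, "a", TRUE))          # TRUE
```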

  19. Atomic Vector
    • A one-dimensional array
    • All elements have the same type (all integer, all character, and so on)

    • All elements can be processed at once (for example, arithmetic)
    • Created with c()

  20. Atomic Vector
    drink <- c("beer", "sake", "whisky")   # assignment
    drink                                  # print the object
    price <- c(480, 700, 850)              # numeric vector
    favorite <- c(TRUE, TRUE, TRUE)        # logical vector
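
To show what "processing all elements at once" looks like, here is a short sketch that reuses the `price` vector. The operations themselves are arbitrary examples, and `mixed` is a made-up name.

```r
price <- c(480, 700, 850)

price * 2      # 960 1400 1700: every element is doubled
price - 100    # 380 600 750
price > 500    # FALSE TRUE TRUE: comparison is element-wise too
sum(price)     # 2030

# Because all elements must share one type, mixed input is coerced.
mixed <- c(1, "a", TRUE)
mixed          # "1" "a" "TRUE" (everything became character)
```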

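  21. Matrix
    • A two-dimensional array
    • All elements have the same type
    • Supports matrix operations (for example, inner products)
    • All elements can be processed at once (for example, arithmetic)
    • Created with matrix()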
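
A minimal sketch of the points above, using base R only. The matrix `A` and vector `v` are arbitrary examples chosen for this illustration.

```r
# matrix() fills column by column by default: this is a 2 x 3 matrix.
A <- matrix(1:6, nrow = 2)
A
dim(A)         # 2 3

A * 2          # element-wise arithmetic, as with vectors
A %*% t(A)     # matrix product of a 2 x 3 and a 3 x 2 matrix: 2 x 2

# Inner product of a vector with itself, written two ways.
v <- c(1, 2, 3)
sum(v * v)     # 14
drop(v %*% v)  # 14 (drop() turns the 1 x 1 matrix into a plain number)
```

Note the difference between `*` (element-wise) and `%*%` (matrix multiplication); mixing them up is a common source of errors.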

  22. List
    • A one-dimensional array
    • Elements can be anything (a generalisation of the vector)

    • Created with list()
    • Extremely flexible
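
The flexibility of lists is easiest to see in an example. In the sketch below, the list `bar_tab` and all of its contents are hypothetical; the point is only that elements of different types and lengths, including another list, can sit side by side.

```r
bar_tab <- list(
  drink = c("beer", "sake", "whisky"),   # character vector
  price = c(480, 700, 850),              # numeric vector
  paid  = FALSE,                         # single logical value
  note  = list(table = 3, memo = "a nested list")
)

bar_tab$drink          # access an element by name
bar_tab[["price"]][2]  # 700: second element of the price vector
bar_tab$note$memo      # "a nested list"
length(bar_tab)        # 4
str(bar_tab)           # compact overview of the structure
```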

  23. Data Frame
    The familiar rectangular table
    • Each column is an atomic vector, and all columns have the same length
    • Created with data.frame()
        data.frame(drink = drink,
                   price = price,
                   favorite = favorite)
    • Many packages are built around the data frame

  24. Getting Hands-On with a Data Frame
    starwars <- read.csv("data/starwars.csv",
                         stringsAsFactors = FALSE,
                         fileEncoding = "UTF-8")
    head(starwars)      # check the first rows
    tail(starwars)      # check the last rows
    summary(starwars)   # descriptive statistics
    str(starwars)       # check the type of each column

  25. Modern Data Frame Manipulation

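  26. What It Means to "Master a Tool"
    An analogy with cooking
    • Learn how to use the knife and the stove
    • Recipes can be looked up whenever you need them
    What it means to "handle data blazingly fast"
    • You can go through an enormous amount of trial and error
    • Time is finite
    Get the non-essential work out of the way quickly,
    then go out for a drink, or get on with your research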

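  27. Tidyverse
    Concept
    • Wouldn't it be nice if the various operations in R (Import, Export, Transform, Visualization, etc.)
      could all be done through a unified interface?
    Packages
    • A collection of packages that makes the above possible
    • Install with install.packages("tidyverse")
    • Developed mainly by Hadley Wickham (RStudio, Inc.)
    • Extremely handy
    * Hadley Wickham: regarded as the "god" of the R world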

  28. Tidyverse

  29. Tidyverse
    These in particular

  30. library(tidyverse)

  31. Basic Data Frame Operations (dplyr)
    • Extracting variables (columns):      select()
    • Extracting observations (rows):      filter()
    • Sorting observations (rows):         arrange()
    • Creating new variables (columns):    mutate()
    • Aggregating:                         summarise()
    • Grouping:                            group_by()

  32. Usage
    • The first argument is a data frame
    • From the second argument onward, give column names without quotation marks
    • The return value is a new data frame
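
The three rules above can be checked with one call. This sketch assumes the `starwars` dataset bundled with dplyr instead of the CSV file read earlier in the lecture; the variable name `humans` is made up.

```r
library(dplyr)

# First argument: the data frame. Later arguments: unquoted column names.
humans <- filter(starwars, species == "Human")

# The result is a new data frame; the input is left untouched.
is.data.frame(humans)            # TRUE
nrow(humans) < nrow(starwars)    # TRUE: only the human rows were kept
all(humans$species == "Human")   # TRUE
```

Because the input is never modified in place, the result has to be assigned to a name (or piped onward) if it is needed later.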

  33. Try It Out
    select(starwars, name, gender, species)
    filter(starwars, species == "Human", height <= 170)
    mutate(starwars, BMI = mass / (height/100)^2)
    arrange(starwars, gender, height)
    summarise(starwars,
              mean_mass = mean(mass, na.rm = TRUE),
              mean_height = mean(height, na.rm = TRUE))
    grouped <- group_by(starwars, species)
    summarise(grouped,
              mean_mass = mean(mass, na.rm = TRUE),
              mean_height = mean(height, na.rm = TRUE),
              count = n())

  34. %>%

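  35. The Pipe Operator %>%
    Pipe form            Equivalent call
    X %>% f              f(X)
    X %>% f(y)           f(X, y)
    X %>% f %>% g        g(f(X))
    X %>% f(y, .)        f(y, X)
    Passes the output of the preceding function to the next function as its first argument
    Type it with Cmd+Shift+M (Ctrl+Shift+M)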
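
The four patterns in the table can be tried with ordinary base R functions. In this sketch `sqrt()`, `rep()`, and `sum()` stand in for the abstract `f` and `g`, and `x` is an arbitrary example vector.

```r
library(dplyr)   # dplyr re-exports %>% from the magrittr package

x <- c(1, 4, 9)

x %>% sqrt()              # X %>% f         is sqrt(x)
x %>% rep(times = 2)      # X %>% f(y)      is rep(x, times = 2)
x %>% sqrt() %>% sum()    # X %>% f %>% g   is sum(sqrt(x)), which is 6
2 %>% rep(x, times = .)   # X %>% f(y, .)   is rep(x, times = 2)
```

The last line shows the placeholder: when `.` appears as an argument, the left-hand side is put there instead of in the first position.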

  36. Performing Several Operations
    df1 <- filter(starwars, species == "Human")
    df2 <- mutate(df1, BMI = mass / (height/100)^2)
    df3 <- group_by(df2, gender)
    df4 <- summarise(df3,
                     mean_BMI = mean(BMI, na.rm=TRUE),
                     min_BMI = min(BMI, na.rm=TRUE),
                     max_BMI = max(BMI, na.rm=TRUE))
    df4
    # A tibble: 2 x 4
      gender mean_BMI min_BMI max_BMI
    1 female     22.0    16.5    27.5
    2 male       26.0    21.5    37.9

  37. Performing Several Operations
    starwars %>%
      filter(species == "Human") %>%
      mutate(BMI = mass / (height/100)^2) %>%
      group_by(gender) %>%
      summarise(mean_BMI = mean(BMI, na.rm=TRUE),
                min_BMI = min(BMI, na.rm=TRUE),
                max_BMI = max(BMI, na.rm=TRUE))
    Using %>% makes the code easy for anyone to read!

  38. Exercises
    Find the shortest male character
    Which character has the highest BMI?
    Which species has the greatest mean height?

    Search for them online and see what they look like

  39. Homework for Next Time

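  40. Homework
    1. For each departure airport (origin), find the number of flights, the mean flight distance,
       and the mean departure delay

    2. Find the mean departure delay for each date

    3. Count the number of cancelled flights for each date
       (for a cancelled flight, dep_delay and arr_delay are NA)

    4. For flights that actually flew, find the standard deviation of flight distance for each
       departure airport, and sort the results in ascending order

    5. For each date, extract the first and the last flight that departed from LGA

    6. Among flights that actually flew, find the proportion that departed on time

    Hint: calling n() inside a dplyr function gives the number of rows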
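
The hint about `n()` can be tried on a different dataset so that the homework itself stays unsolved. This sketch assumes the `starwars` data bundled with dplyr; the 180 cm threshold and the column names `count` and `share_over_180` are arbitrary choices for illustration.

```r
library(dplyr)

species_counts <- starwars %>%
  group_by(species) %>%
  summarise(count = n(),   # number of rows in each group
            share_over_180 = mean(height > 180, na.rm = TRUE)) %>%
  arrange(desc(count))

species_counts
```

Inside `summarise()`, `n()` counts the rows of the current group, and the mean of a logical condition gives the proportion of rows for which it is TRUE.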