
tidyverse tutorial 1

kur0cky
September 20, 2019

tidyverse: an absolute beginner's introduction
For lecture use


Transcript

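  1. Data used today
    • starwars: data on the characters of Star Wars (https://swapi.co)
    • flights: on-time data for all flights that departed LGA, JFK, or EWR in a given year
    • weather: weather and wind information for LGA, JFK, and EWR (hourly)
    • airlines: table of airlines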
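  2. Tools: pick whichever you like
    • Criteria: personal taste, what is popular, salary, having someone nearby who is good at it, etc.

                  Big Data   Machine learning   Statistical analysis   Data handling
    SQL              ◎              ✕                    ✕                   ○
    R                △              ○                    ◎                   ◎
    Python           △              ◎                    ○                   ○
    Excel            ✕              ✕                    ✕                   ✕
    Calculator       ✕              ✕                    ✕                   ✕

    (◎ excellent, ○ good, △ fair, ✕ poor)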
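  3. R
    I am good at R, so R is what I will introduce.
    • An interpreted language (as opposed to a compiled language)
    Strengths
    • OSS
    • A wide range of statistical analysis and machine learning libraries
    • The presence of RStudio (the company)
    Weaknesses
    • Slow (fast when the internals are written in C or the like)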
  4. Let's start it up
    • Start up
        • Open the distributed Exercise.Rproj
    • Try running R
        • In the console: plot(iris)
    • Try installing a package (note that this takes a while)
        • install.packages("tidyverse")
    • Try creating a script
        • File > New File > R Script
  5. Data types in R
    A single value
    • Logical (Boolean)
    • Integer
    • Double
    • Complex
    • Character
    • Factor
    • etc.
    Multiple values
    • Atomic Vector
    • Matrix
    • Data Frame
    • List
    • etc.
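
As a rough illustration of the types listed above, the following sketch inspects a few literals with `typeof()` and `class()`. It uses base R only; the variable names (`size`, `m`, `l`) are made up for this example.

```r
# Single values in R are length-one vectors; typeof() reports the storage type.
typeof(TRUE)      # "logical"
typeof(1L)        # "integer" (the L suffix marks an integer literal)
typeof(1.5)       # "double"
typeof(2+3i)      # "complex"
typeof("beer")    # "character"

# A factor is stored as integer codes plus a set of levels.
size <- factor(c("small", "large", "small"))
class(size)       # "factor"
typeof(size)      # "integer"
levels(size)      # "large" "small" (sorted alphabetically by default)

# Containers for multiple values.
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix, one type throughout
l <- list(1L, "a", TRUE)     # list: elements may differ in type
dim(m)            # 2 3
length(l)         # 3
```
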
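  6. Atomic Vector
    • A one-dimensional array
    • All elements have the same type (all integer, all character, and so on)
    • All elements can be processed at once (arithmetic operations, for example)
    • Created with c()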
  7. Atomic Vector
    drink <- c("beer", "sake", "whisky")   # assignment
    drink                                  # print the object
    price <- c(480, 700, 850)              # numeric vector
    favorite <- c(TRUE, TRUE, TRUE)        # logical vector
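
The point that all elements of an atomic vector can be processed at once, and that they must share one type, can be sketched as follows. The tax rate and discount amounts are hypothetical values chosen for illustration.

```r
price <- c(480, 700, 850)              # numeric (double) vector

# Arithmetic is applied element by element; no loop is needed.
price_with_tax <- price * 1.1          # hypothetical 10% tax rate
discounted <- price - c(80, 100, 50)   # element-wise subtraction of two vectors

# Summary functions consume the whole vector at once.
total <- sum(price)                    # 2030
average <- mean(price)

# Mixing types coerces every element to one common type.
mixed <- c(480, "sake", TRUE)
typeof(mixed)                          # "character"
```
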
  8. Data Frame
    The familiar rectangular table
    • Each column is an atomic vector of the same length
    • Created with data.frame()
        data.frame(drink = drink,
                   price = price,
                   favorite = favorite)
    • Many packages are built around the Data Frame
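
A minimal sketch of the "each column is an atomic vector of the same length" rule, using base R only. The name `menu` is hypothetical, and `stringsAsFactors = FALSE` is passed explicitly so the result does not depend on the R version.

```r
drink <- c("beer", "sake", "whisky")
price <- c(480, 700, 850)
favorite <- c(TRUE, TRUE, TRUE)

menu <- data.frame(drink = drink, price = price, favorite = favorite,
                   stringsAsFactors = FALSE)

# Each column is an atomic vector with its own type.
menu$price              # the numeric vector 480 700 850
sapply(menu, class)     # character, numeric, logical
dim(menu)               # 3 rows, 3 columns

# Columns whose lengths cannot be recycled to a common length are rejected.
bad <- try(data.frame(drink = drink, price = c(480, 700)), silent = TRUE)
inherits(bad, "try-error")   # TRUE
```
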
  9. Let's work with a Data Frame
    starwars <- read.csv("data/starwars.csv",
                         stringsAsFactors = FALSE,
                         fileEncoding = "UTF-8")
    head(starwars)     # check the first rows
    tail(starwars)     # check the last rows
    summary(starwars)  # descriptive statistics
    str(starwars)      # check the type of each column
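  10. Tidyverse
    Concept
    • Wouldn't it be lovely if the various operations in R (Import, Export, Transform, Visualization, etc.) could all be done through a unified interface?
    Packages
    • A collection of packages that make the above a reality
    • Install with install.packages("tidyverse")
    • Developed mainly by Hadley Wickham (RStudio)
    • Extremely convenient
    Note: Hadley Wickham is the "god" of the R world.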
  11. Basic Data Frame operations (dplyr)
    • Extract variables (columns): select()
    • Extract observations (rows): filter()
    • Reorder observations (rows): arrange()
    • Create new variables (columns): mutate()
    • Aggregate: summarise()
    • Group: group_by()
  12. Let's try it
    select(starwars, name, gender, species)
    filter(starwars, species == "Human", height <= 170)
    mutate(starwars, BMI = mass / (height/100)^2)
    arrange(starwars, gender, height)
    summarise(starwars,
              mean_mass = mean(mass, na.rm = TRUE),
              mean_height = mean(height, na.rm = TRUE))
    grouped <- group_by(starwars, species)
    summarise(grouped,
              mean_mass = mean(mass, na.rm = TRUE),
              mean_height = mean(height, na.rm = TRUE),
              count = n())
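
If the lecture's CSV is not at hand, the same verbs can be tried on the `starwars` tibble bundled with dplyr. This is only a sketch: it assumes the bundled dataset is close to the lecture's data/starwars.csv, which the transcript does not reproduce, and the names `sw`, `humans_short`, `with_bmi`, and `by_species` are made up here.

```r
library(dplyr)

# dplyr's bundled dataset; assumed (not confirmed) to resemble the lecture's CSV.
sw <- dplyr::starwars

# Rows where both conditions hold; rows with NA in a condition are dropped.
humans_short <- filter(sw, species == "Human", height <= 170)

# Adds a BMI column computed from the existing mass and height columns.
with_bmi <- mutate(sw, BMI = mass / (height / 100)^2)

# One summary row per species, with the group size from n().
by_species <- summarise(group_by(sw, species),
                        mean_mass = mean(mass, na.rm = TRUE),
                        count = n())

# select() keeps only the named columns, in the order given.
names(select(sw, name, gender, species))
```
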
  13. %>%

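  14. The pipe operator %>%
    X %>% f          is  f(X)
    X %>% f(y)       is  f(X, y)
    X %>% f %>% g    is  g(f(X))
    X %>% f(y, .)    is  f(y, X)
    • Passes the output of the previous function to the first argument of the next function
    • Type it with Cmd + Shift + M (Ctrl + Shift + M)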
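
The four equivalences above can be checked directly with base functions. This sketch assumes the magrittr package is installed (it provides `%>%`, and dplyr re-exports it); the vector `x` is an arbitrary example.

```r
library(magrittr)   # provides %>%

x <- c(3.14159, 2.71828)

# x %>% f        is  f(x)
identical(x %>% sqrt, sqrt(x))

# x %>% f(y)     is  f(x, y)
identical(x %>% round(2), round(x, 2))

# x %>% f %>% g  is  g(f(x))
identical(x %>% sqrt %>% round, round(sqrt(x)))

# x %>% f(y, .)  is  f(y, x): the dot marks where the left-hand side goes
identical(2 %>% round(x, .), round(x, 2))
```

Each `identical()` call should return TRUE, since the piped and nested forms evaluate the same function call.
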
  15. When chaining several operations
    df1 <- filter(starwars, species == "Human")
    df2 <- mutate(df1, BMI = mass / (height/100)^2)
    df3 <- group_by(df2, gender)
    df4 <- summarise(df3,
                     mean_BMI = mean(BMI, na.rm=TRUE),
                     min_BMI = min(BMI, na.rm=TRUE),
                     max_BMI = max(BMI, na.rm=TRUE))

    # A tibble: 2 x 4
      gender mean_BMI min_BMI max_BMI
      <chr>     <dbl>   <dbl>   <dbl>
    1 female     22.0    16.5    27.5
    2 male       26.0    21.5    37.9
  16. When chaining several operations
    starwars %>%
      filter(species == "Human") %>%
      mutate(BMI = mass / (height/100)^2) %>%
      group_by(gender) %>%
      summarise(mean_BMI = mean(BMI, na.rm=TRUE),
                min_BMI = min(BMI, na.rm=TRUE),
                max_BMI = max(BMI, na.rm=TRUE))
    Using %>% makes the code easy for anyone to read!
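  17. Exercises
    1. For each departure airport (origin), find the number of flights, the mean flight distance, and the mean departure delay.
    2. Find the mean departure delay for each date.
    3. Find the number of cancelled flights for each date. (A cancelled flight has NA for dep_delay and arr_delay.)
    4. For flights that actually flew, find the standard deviation of flight distance for each departure airport, and sort the result in ascending order.
    5. For each date, extract the first and the last flight to depart from LGA.
    6. Among flights that actually flew, find the proportion that departed on time.
    Hint: calling n() inside a dplyr function gives the number of rows.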
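
To illustrate the hint about `n()` without using the lecture's flights table, here is a minimal sketch on a tiny invented data frame. The values in `toy` are made up for illustration only; the column names `origin` and `dep_delay` follow the exercise text, and `per_origin` is a hypothetical name.

```r
library(dplyr)

# Toy data, invented for illustration; not the lecture's flights table.
toy <- data.frame(
  origin    = c("LGA", "LGA", "JFK", "JFK", "JFK"),
  dep_delay = c(5, NA, 0, 12, NA)
)

per_origin <- toy %>%
  group_by(origin) %>%
  summarise(
    n_flights   = n(),                        # number of rows in each group
    n_cancelled = sum(is.na(dep_delay)),      # NA delay treated as a cancellation
    mean_delay  = mean(dep_delay, na.rm = TRUE)
  )
per_origin
```

Inside `summarise()`, `n()` takes no arguments and counts the rows of the current group, so it combines naturally with `group_by()`.
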