
NDS#51

kasacchiful
March 25, 2017

Slides presented as a lightning talk (LT) at NDS#51.

Transcript

  1. R, dplyr, and Pipe @kasacchiful NDS#51 2017-03-25

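  2. Who am I?
     • Hiroshi Kasahara (@kasacchiful)
     • Lives in Niigata City; works at a systems integrator (SIer)
     • Ruby is my favorite language
     • Lately also using Swift, R, and JavaScript (for mobile)
     • Involved with JaSST Niigata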
  3. Announcement

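  4. JaSST'17 Niigata
     • Date: Friday, April 28, 2017, 13:00 to 16:40
     • Venue: Toki Messe, Medium Conference Room 201B
     • Theme: "Usability / UX"
     • Keynote: Tetsuya Tarumoto, "Elements and Processes of User Experience: An Introduction to UX/UCD"
     • Case study: Daisuke Yagyu, "From Function-Centered to Human-Centered: Hitachi Solutions' Efforts"
     • Fee: 2,160 yen (tax included)
     • Registration is open (deadline: Friday, April 14, 18:00)
     • Details: http://jasst.jp (or search for "JaSST")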
  5. Today's talk
     • When you work with data, you reshape it so it is easier to use, right?
     • In R, dplyr is handy (the de facto standard?)
     • Pipes are handy too

  6. R, which everyone loves
     • I won't explain R in detail
     • I'm a beginner who only started using R last year

  7. Data wrangling

  8. Wrangle the data before analyzing it
     • Data analysis in R
     • When analyzing data, you almost never use the raw data as-is; you reshape it into a form that is easy to work with

  9. Examples
     • Splitting a date into year, month, and day
     • Ranking
     • Sorting
     • Grouping and aggregating
     • Joining or unioning data
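The reshaping steps listed on this slide can be sketched with dplyr and tidyr. This is a minimal illustration rather than code from the talk: the `sales` and `shops` tables, their column names, and the choice of `separate()` and `min_rank()` are all assumptions made for the example.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not from the talk)
sales <- tibble(
  date   = c("2017-03-01", "2017-03-01", "2017-04-15", "2017-04-20"),
  shop   = c("A", "B", "A", "B"),
  amount = c(100, 250, 300, 80)
)
shops <- tibble(shop = c("A", "B"), city = c("Niigata", "Nagaoka"))

# Split the date into year / month / day columns
split_dates <- sales %>%
  separate(date, into = c("year", "month", "day"), sep = "-", convert = TRUE)

# Rank by amount (largest first), then sort by that rank
ranked <- split_dates %>%
  mutate(rank = min_rank(desc(amount))) %>%
  arrange(rank)

# Group by month and aggregate
monthly <- split_dates %>%
  group_by(month) %>%
  summarise(total = sum(amount))

# Join a lookup table, and union two tables with identical columns
joined     <- left_join(sales, shops, by = "shop")
more_sales <- tibble(date = "2017-05-01", shop = "A", amount = 75)
all_sales  <- union_all(sales, more_sales)
```

Each step takes a data frame and returns a data frame, which is what makes the steps easy to chain with a pipe.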
  10. A recent hot topic
      • The Cabinet Office's national holidays CSV data
      • The format has since been improved
      • Let's reshape this one too before using it

  11. dplyr
      • A library for working with data frames at high speed
      • The Python counterpart is "pandas"

  12. Installing dplyr
        install.packages('dplyr')      # install
        library(dplyr)                 # load the library

        # Install the handy tools all at once
        install.packages('tidyverse')  # install everything
        library(tidyverse)             # load everything
  13. Libraries installed together
      • ggplot2: for data visualisation.
      • dplyr: for data manipulation.
      • tidyr: for data tidying.
      • readr: for data import.
      • purrr: for functional programming.
      • tibble: for tibbles, a modern re-imagining of data frames.
      Reference: https://github.com/tidyverse/tidyverse
  14. Example
      • Trying out Niigata City's open data
      • Mapping the GIS information for indoor and outdoor evacuation shelters
      • tidyr is used to fill in missing values

  15. library(ggplot2)
      library(ggmap)
      library(dplyr)
      library(tidyr)
      library(readr)

      niigata <- c(139.0255893, 37.8490611)
      data.indoor_shelter  <- read_csv("od_gis_10161_okunaihinanjo.csv")
      data.outdoor_shelter <- read_csv("od_gis_10162_okugaihinanbasho.csv")

      # The outdoor shelter data has fewer columns, so add them, then union with the indoor shelter data
      # (in practice the union works even without the extra columns)
      indoor <- data.indoor_shelter %>%
        rename(lon = longitude, lat = latitude)        # rename columns: dplyr::rename
      outdoor <- data.outdoor_shelter %>%
        rename(lon = longitude, lat = latitude) %>%    # rename columns
        mutate(SAFIELD007 = NA, SAFIELD008 = NA,
               SAFIELD009 = NA, SAFIELD010 = NA)       # add columns: dplyr::mutate
      data.shelter <- dplyr::union_all(indoor, outdoor)

      # Fill missing values in SAFIELD005 (district)
      data.shelter <- data.shelter %>%
        replace_na(list(SAFIELD005 = 'その他'))        # 'その他' means "Other"; tidyr::replace_na

      # Draw the map
      # Labels: "地区" = district, "屋内/屋外" = indoor/outdoor,
      # title = "Indoor/outdoor evacuation shelters in Niigata City"
      get_googlemap(niigata, zoom = 10, maptype = "roadmap", language = "ja-JP") %>%
        ggmap(extent = "device", darken = c(0.2, "black")) +
        geom_point(data = data.shelter, aes(color = SAFIELD005, shape = SAFIELD006)) +
        theme_bw(base_family = "HiraKakuProN-W3") +
        xlab("") + ylab("") +
        labs(color = "地区", shape = "屋内/屋外", title = "新潟市の屋内/屋外避難所") +
        guides(shape = guide_legend(order = 1), colour = guide_legend(order = 2)) +
        theme(axis.ticks = element_blank(), axis.text = element_blank())

      https://github.com/kasacchiful/nds51-sample
  16. (Image-only slide; no text in the transcript)
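The shelter code notes that the union works even when the outdoor table lacks some columns. Whether `union_all()` accepts mismatched columns may depend on the dplyr version, so the sketch below swaps in `dplyr::bind_rows()`, which fills columns missing from one input with `NA`, and then fills a missing district with `tidyr::replace_na()`. The tables and column names here are hypothetical stand-ins, not the Niigata open data.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-ins for the indoor / outdoor shelter tables
indoor <- tibble(
  name     = c("Gym", "Hall"),
  district = c("Chuo", NA),
  capacity = c(200, 150)
)
outdoor <- tibble(name = "Park", district = NA_character_)  # no capacity column

shelters <- bind_rows(indoor, outdoor) %>%   # capacity becomes NA for the outdoor row
  replace_na(list(district = "Other"))       # fill missing districts
```

With `bind_rows()` the explicit `mutate(... = NA)` step is not needed, at the cost of no longer checking that both tables have the same columns.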
  17. dplyr is rich in features
      • A cheat sheet for when you're stuck
      • https://www.rstudio.com/wp-content/uploads/2015/02/data-wrangling-cheatsheet.pdf
      • Wrangle data not only with dplyr but also with tidyr and friends

  18. Pipe

  19. library(ggplot2)
      library(ggmap)
      library(dplyr)
      library(tidyr)
      library(readr)

      niigata <- c(139.0255893, 37.8490611)
      data.indoor_shelter  <- read_csv("od_gis_10161_okunaihinanjo.csv")
      data.outdoor_shelter <- read_csv("od_gis_10162_okugaihinanbasho.csv")

      # The outdoor shelter data has fewer columns, so add them, then union with the indoor shelter data
      # (in practice the union works even without the extra columns)
      indoor <- data.indoor_shelter %>%
        rename(lon = longitude, lat = latitude)        # rename columns
      outdoor <- data.outdoor_shelter %>%
        rename(lon = longitude, lat = latitude) %>%    # rename columns
        mutate(SAFIELD007 = NA, SAFIELD008 = NA,
               SAFIELD009 = NA, SAFIELD010 = NA)       # add columns
      data.shelter <- dplyr::union_all(indoor, outdoor)

      # Fill missing values in SAFIELD005 (district)
      data.shelter <- data.shelter %>%
        replace_na(list(SAFIELD005 = 'その他'))        # 'その他' means "Other"

      # Draw the map
      get_googlemap(niigata, zoom = 10, maptype = "roadmap", language = "ja-JP") %>%
        ggmap(extent = "device", darken = c(0.2, "black")) +
        geom_point(data = data.shelter, aes(color = SAFIELD005, shape = SAFIELD006)) +
        theme_bw(base_family = "HiraKakuProN-W3") +
        xlab("") + ylab("") +
        labs(color = "地区", shape = "屋内/屋外", title = "新潟市の屋内/屋外避難所") +
        guides(shape = guide_legend(order = 1), colour = guide_legend(order = 2)) +
        theme(axis.ticks = element_blank(), axis.text = element_blank())
  20. indoor <- data.indoor_shelter %>%
        rename(lon = longitude, lat = latitude)
      outdoor <- data.outdoor_shelter %>%
        rename(lon = longitude, lat = latitude) %>%
        mutate(SAFIELD007 = NA, SAFIELD008 = NA, SAFIELD009 = NA, SAFIELD010 = NA)
      data.shelter <- dplyr::union_all(indoor, outdoor)
  21. Pipe
      • %>% (dplyr::%>%)
      • Passes the value on the left-hand side as the first argument of the function on the right-hand side
      • x %>% f #=> f(x)
      • x %>% f(y) #=> f(x, y)
      • x %>% f %>% g %>% h #=> h(g(f(x)))
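The three rewriting rules on this slide can be checked directly. The sketch below assumes only that the magrittr package is installed; the input values and the choice of `sqrt`, `round`, `sum`, and `log` are arbitrary.

```r
library(magrittr)

x <- c(1, 4, 9)

piped_f   <- x %>% sqrt                   # x %>% f              is f(x)
piped_fy  <- 3.14159 %>% round(2)         # x %>% f(y)           is f(x, y)
piped_fgh <- x %>% sqrt %>% sum %>% log   # x %>% f %>% g %>% h  is h(g(f(x)))

# Compare each piped result with the equivalent ordinary call
identical(piped_f,   sqrt(x))
identical(piped_fy,  round(3.14159, 2))
identical(piped_fgh, log(sum(sqrt(x))))
```

Each `identical()` call compares the piped form with the nested form it stands for.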
  22. Pipe
        data.library2014 <- read.csv("2014toshokanriyo.csv")
        head(rename(select(data.library2014, 図書館名, 開館日数.日.), name=図書館名, open_days=開館日数.日.), 5)

      is equivalent to:

        data.library2014 <- read.csv("2014toshokanriyo.csv")
        data.library2014 %>%
          select(図書館名, 開館日数.日.) %>%
          rename(name=図書館名, open_days=開館日数.日.) %>%
          head(5)

      Nested calls are hard to read. With a pipe, the code is more readable and easier to write.
      (図書館名 is the library-name column; 開館日数.日. is the days-open column.)
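To try the nested and piped forms without the library-usage CSV, the sketch below uses a small in-memory data frame. The `libraries` table and its column names are hypothetical stand-ins for the real file.

```r
library(dplyr)

# Hypothetical stand-in for the library-usage CSV
libraries <- data.frame(
  library_name = c("Central", "North", "South"),
  days_open    = c(340, 300, 310),
  visitors     = c(9000, 4000, 5000)
)

# Nested calls: read from the inside out
nested <- head(rename(select(libraries, library_name, days_open),
                      name = library_name, open_days = days_open), 2)

# Piped calls: read from top to bottom
piped <- libraries %>%
  select(library_name, days_open) %>%
  rename(name = library_name, open_days = days_open) %>%
  head(2)

identical(nested, piped)
```

Both forms run the same three functions in the same order; only the reading order differs.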
  23. It originally comes from magrittr
      • https://github.com/tidyverse/magrittr
      • R package to bring forward-piping features ala F#'s |> operator. Ceci n'est pas un pipe.
      • Derived from the functionality of F#'s |> operator
  24. Pipes in F#
      Reference: https://msdn.microsoft.com/ja-jp/library/dd233229(v=vs.120).aspx#関数合成とパイプライン処理

  25. Pipes in other languages
      • Elixir
      • Julia
      • In any language where you can define operators and higher-order functions, you may be able to implement your own.
      Reference: http://elixir-lang.org/getting-started/enumerables-and-streams.html#the-pipe-operator
      Reference: http://docs.julialang.org/en/stable/stdlib/base/#Base.|>
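R itself allows user-defined operators and higher-order functions, so a bare-bones pipe can be sketched in base R. The operator name `%then%` is a hypothetical choice for this example, and the sketch handles only plain functions on the right-hand side, not the `f(y)` call form that magrittr supports.

```r
# A user-defined infix operator: apply the right-hand function to the left-hand value.
# `%then%` is a made-up name; any %name% can be defined this way in R.
`%then%` <- function(lhs, rhs) rhs(lhs)

# User-defined %op% operators are left-associative, so this is sum(sqrt(c(1, 4, 9)))
total <- c(1, 4, 9) %then% sqrt %then% sum

# Extra arguments need an explicit wrapper function in this minimal version
rounded <- 3.14159 %then% (function(v) round(v, 2))
```

The wrapper function in the last line shows what magrittr adds over this sketch: it rewrites the right-hand call so the piped value becomes its first argument.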
  26. There is also pipeR
      • %>>%
      • Pipe()
      • pipeline()
      • Can coexist with dplyr.

  27. ## magrittr
      system.time({
        lapply(1:100000, function(i) {
          sample(letters, 6, replace = TRUE) %>%
            paste(collapse = "") %>%
            "=="("rstats")
        })
      })
        user  system elapsed
      13.495   0.064  13.807

      ## pipeR
      system.time({
        lapply(1:100000, function(i) {
          sample(letters, 6, replace = TRUE) %>>%
            paste(collapse = "") %>>%
            "=="("rstats")
        })
      })
        user  system elapsed
       4.922   0.030   5.015

      Reference: https://renkun.me/blog/2014/08/08/difference-between-magrittr-and-pipeR.html
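  28. Summary
      • Before analyzing data, it is important to wrangle it with dplyr
      • Pipes in R let you write in a method-chain style, which is convenient
      • Other languages also implement pipes, so knowing about them may make you happier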