Slide 1

Slide 1 text

R Interface to Python Tokyo.R #80 2019.07.27 kilometer00

Slide 2

Slide 2 text


Slide 3

Slide 3 text

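Who!? Name: Mimura (三村) @kilometer Occupation: Postdoc (Ph.D. in Engineering) Specialty: behavioral neuroscience (primates), brain imaging, medical systems engineering R experience: about 10 years Current craze: turtles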

Slide 4

Slide 4 text

2018.07.15 Tokyo.R #71 Landscape with R – the Japanese R community
2018.10.20 Tokyo.R #73 BeginneR Session – Visualization & Plot
2019.01.19 Tokyo.R #75 BeginneR Session – Data pipeline
2019.03.02 Tokyo.R #76 BeginneR Session – Data pipeline
2019.04.13 Tokyo.R #77 BeginneR Session – Data analysis
2019.05.25 Tokyo.R #78 BeginneR Session – Data analysis
2019.06.29 Tokyo.R #79 BeginneR Session – Basics of probability

Slide 5

Slide 5 text

Before After BeginneR Session BeginneR BeginneR

Slide 6

Slide 6 text

BeginneR Advanced Hoxo_m If I have seen further it is by standing on the shoulders of Giants. -- Sir Isaac Newton, 1676

Slide 7

Slide 7 text

R Interface to Python

Slide 8

Slide 8 text

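A while back: "I've been thinking it's about time I got started with 'computers', too. You can do 'computers', right? Teach me." "Sure, but what do you want to do with a 'computer'?" "'Computers' can do anything, right? But I don't know what I should do."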

Slide 9

Slide 9 text

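Recently: "I've been thinking it's about time I got started with 'AI', too. You can do 'AI', right? Teach me." "Sure, but what do you want to do with 'AI'?" "'AI' can do anything, right? But I don't know what I should do."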

Slide 10

Slide 10 text

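"I've been thinking it's about time I got started with it, too. You can do it, right? Teach me." "Sure, but what do you want to do with it?" "It can do anything, right? But I don't know what I should do."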

Slide 11

Slide 11 text

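"I've been thinking it's about time I got started with it, too. You can do it, right? Teach me." "Sure, but what do you want to do with it?" "It can do anything, right? But I don't know what I should do."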

Slide 12

Slide 12 text

Input Output Do NOT start from here

Slide 13

Slide 13 text

Input Output Do NOT start from here Whatever …

Slide 14

Slide 14 text

Input Output Do NOT start from here Whatever … 1 2 3

Slide 15

Slide 15 text

Input Output

Slide 16

Slide 16 text

Integrated Development Environment RStudio

Slide 17

Slide 17 text

Integrated Development Environment RStudio

Slide 18

Slide 18 text


Slide 19

Slide 19 text

Projects RStudio

Slide 20

Slide 20 text

File > New Project… > New Directory > New Project hogehoge

Slide 21

Slide 21 text

hogehoge ~/Documents/R hogehoge.Rproj .Rproj.user Project Root Directory Double click!! .RData .Rhistory Auto saved project information Open project New!!

Slide 22

Slide 22 text

~/Documents/R project1 project2 project3 project4

Slide 23

Slide 23 text

No content

Slide 24

Slide 24 text

No content

Slide 25

Slide 25 text

Variable assignment
Python: var = 1
R: var <- 1, 1 -> var, var = 1

Slide 26

Slide 26 text

Variable naming
Python: 1var = 1, _var = 1, list = 1, var.1 = 1, .var = 1
R: _var <- 1, 1var <- 1, var.1 <- 1, list <- 1, .var <- 1

Slide 27

Slide 27 text

Variable naming
Python: 1var = 1, _var = 1, list = 1, var.1 = 1, .var = 1 (reserved)
R: _var <- 1, 1var <- 1, var.1 <- 1, list <- 1, .var <- 1 (reserved)
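The Python half of this naming comparison can be checked mechanically. The sketch below is an illustration and not part of the original slides: the helper `classify` is a hypothetical name, and it only uses the standard library (`str.isidentifier`, `keyword.iskeyword`, `builtins`) to ask whether each candidate from the slide could be assigned to. On the R side, `var.1` and `.var` are legal names while `_var` and `1var` are not.

```python
import builtins
import keyword

# Candidate names taken from the slide.
candidates = ["1var", "_var", "list", "var.1", ".var"]


def classify(name: str) -> str:
    """Return a short verdict on whether `name` can be assigned to in Python.

    `classify` is a hypothetical helper written for this illustration.
    """
    if not name.isidentifier():
        return "invalid"
    if keyword.iskeyword(name):
        return "keyword"
    if hasattr(builtins, name):
        return "valid, but shadows a built-in"
    return "valid"


verdicts = {name: classify(name) for name in candidates}
for name, verdict in verdicts.items():
    print(f"{name!r}: {verdict}")
```

Note that `list = 1` runs without error in Python, but it hides the built-in `list` type for the rest of the session.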

Slide 28

Slide 28 text

Variable type
Python: var = "1" (String), var = 1.0 (Float), var = 1 (Integer)
R: var <- "1" (Character), var <- 1 and var <- 1.0 (Double), var <- 1L (Integer)

Slide 29

Slide 29 text

Variable type
Python: var = "1" (String), var = 1.0 (Float), var = 1 (Integer)
R: var <- "1" (Character), var <- 1 and var <- 1.0 (Double), var <- 1L (Integer)

Slide 30

Slide 30 text

Packages packages you

Slide 31

Slide 31 text

Packages
R: library(dplyr); filter(dat, ...)
Python: import numpy; numpy.array([1, 2, 3])

Slide 32

Slide 32 text

Packages
R: library(dplyr); filter(dat, ...)
R: dplyr::filter(dat, ...)
Python: import numpy; numpy.array([1, 2, 3])
Python: import numpy as np; np.array([1, 2, 3])

Slide 33

Slide 33 text

Packages
R: library(dplyr); filter(dat, ...)
R: dplyr::filter(dat, ...)
Python: import numpy; numpy.array([1, 2, 3])
Python: from numpy import array; array([1, 2, 3])
Python: import numpy as np; np.array([1, 2, 3])
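The three Python import styles on this slide can be tried without installing anything. The sketch below substitutes the standard-library `math` module for NumPy (an assumption made so that it runs in any interpreter) and shows that all three forms reach the same function.

```python
import math              # whole module, accessed as math.sqrt
import math as m         # same module under a shorter alias
from math import sqrt    # a single name pulled into the current namespace

values = [1, 2, 3]

roots_full = [math.sqrt(v) for v in values]
roots_alias = [m.sqrt(v) for v in values]
roots_direct = [sqrt(v) for v in values]

# All three spellings call the same underlying function.
all_same = roots_full == roots_alias == roots_direct
print(all_same)
```

The prefixed forms (`math.sqrt`, `m.sqrt`) play a role similar to `dplyr::filter` in R: the origin of the function stays visible at the call site.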

Slide 34

Slide 34 text

Loop
R:
for(i in 1:10){
  i = i + 1
}
R (one-liner case):
for(i in 1:10) i = i + 1
Python (body indented):
for i in range(10):
    i = i + 1
Python (body not indented: INDENT ERROR):
for i in range(10):
i = i + 1
Python (one-liner case):
for i in range(10): i += 1
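The indentation rule can be demonstrated without crashing the session by compiling each variant from a string. This is a minimal sketch: the helper name `check` and the dictionary keys are made up for the illustration.

```python
# Three spellings of the same loop, stored as source text.
sources = {
    "indented": "for i in range(10):\n    i = i + 1\n",
    "unindented": "for i in range(10):\ni = i + 1\n",
    "one_liner": "for i in range(10): i += 1\n",
}


def check(source: str) -> str:
    """Compile `source` and report whether Python accepts its layout."""
    try:
        compile(source, "<slide>", "exec")
    except IndentationError as err:
        return f"IndentationError: {err.msg}"
    return "ok"


results = {label: check(src) for label, src in sources.items()}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

Only the unindented variant is rejected; R accepts any layout because the loop body is delimited by braces rather than whitespace.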

Slide 35

Slide 35 text

Function definition
R:
f <- function(x, y = 1){
  z = x + y
  return(z)
}
R (auto-returns the final expression, z):
f <- function(x, y = 1){
  z = x + y
  # return(z)
}
Python (CHECK YOUR INDENT):
def f(x, y = 1):
    z = x + y
    return z
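One difference worth seeing in action: R returns the value of the final expression automatically, whereas a Python function without `return` yields `None`. The sketch below reuses `f` from the slide; `g` is a hypothetical variant with the `return` line left out.

```python
def f(x, y=1):
    """Add x and y; y defaults to 1, as on the slide."""
    z = x + y
    return z


def g(x, y=1):
    """Same body without `return` (hypothetical variant for comparison)."""
    z = x + y  # computed, then discarded when the function ends


print(f(2))        # uses the default y = 1
print(f(2, y=5))   # overrides the default
print(g(2))        # no return statement, so the result is None
```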

Slide 36

Slide 36 text

No content

Slide 37

Slide 37 text

Slide 38

Slide 38 text

reticulate package URL:

Slide 39

Slide 39 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. • Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays). reticulate package URL:

Slide 40

Slide 40 text

Environment setup for Python (in macOS Mojave 10.14.3)

Slide 41

Slide 41 text


Slide 42

Slide 42 text


Slide 43

Slide 43 text

Sandbox A Sandbox B

Slide 44

Slide 44 text

Sandbox A Sandbox B Isolated & Independent

Slide 45

Slide 45 text

"Sandboxed" Python: isolated & independent virtual environment for security & reproducibility

Slide 46

Slide 46 text

RStudio Projects

Slide 47

Slide 47 text

"Sandboxed" Python Isolated & Independent virtual environment for security & reproducibility [python] version = "3.7" [packages] cycler==0.10.0 kiwisolver==1.1.0 matplotlib==3.1.1 numpy==1.16.4 opencv-python== pandas==0.25.0 pyparsing==2.4.0 PypeR==1.1.2 ... [python] version = "2.7" [packages] numpy==1.16.4 ...

Slide 48

Slide 48 text

Pipenv → "Sandboxed" Python manager
Install Pipenv (on macOS): $ brew install pipenv
Install Python

Slide 49

Slide 49 text

Pipenv → "Sandboxed" Python manager
Create virtualenv:
$ cd
$ pipenv --python 3.7
.venv ← interpreter, env info
Pipfile ← package info

Slide 50

Slide 50 text

Pipenv → "Sandboxed" Python manager
Activate virtualenv: $ pipenv shell
Deactivate virtualenv: (prj) $ exit
Delete virtualenv: $ pipenv --rm

Slide 51

Slide 51 text

Pipenv → "Sandboxed" Python manager
Activate virtualenv: $ pipenv shell
Install packages: (prj) $ pipenv install ~=
Uninstall packages: (prj) $ pipenv uninstall

Slide 52

Slide 52 text

For {reticulate}, you need the NumPy package in your virtualenv.

Slide 53

Slide 53 text

Pipenv → "Sandboxed" Python manager
Install NumPy
$ cd
$ pipenv shell                    # activate
(prj) $ pipenv install numpy      # install
(prj) $ pipenv run pip freeze     # check
(prj) $ python
>>> import numpy                  # check
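The final check (`import numpy` at the Python prompt) can also be done without triggering an ImportError. The sketch below is one possible way, using only the standard library; `is_installed` is a hypothetical helper name, and it assumes the script is run with the virtualenv's interpreter.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


numpy_available = is_installed("numpy")
if numpy_available:
    print("numpy is importable from this interpreter")
else:
    print("numpy is missing: run `pipenv install numpy` inside the project")
```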

Slide 54

Slide 54 text

Pipenv → "Sandboxed" Python manager
Path of the virtualenv
$ cd
$ pipenv shell            # activate
(prj) $ pipenv --venv     # check
/.venv
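From inside Python, the interpreter itself can report where it lives, which is a useful cross-check against the path printed by `pipenv --venv`. A minimal sketch, assuming it is run with the interpreter you intend to hand to reticulate; `in_virtualenv` is a hypothetical helper.

```python
import sys


def in_virtualenv() -> bool:
    """Best-effort test for whether this interpreter runs inside a virtualenv."""
    # Older versions of the `virtualenv` tool set sys.real_prefix;
    # the built-in venv mechanism makes sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("prefix:     ", sys.prefix)
print("virtualenv: ", in_virtualenv())
```

The value printed for `interpreter` is the kind of path that `use_python()` expects on the R side.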

Slide 55

Slide 55 text

back to

Slide 56

Slide 56 text

Use Python in R
Install reticulate from CRAN:
install.packages("reticulate")
Attach Python virtualenv:
library(reticulate)
pyenv <- "/.venv/bin/python"
use_python(python = pyenv, required = TRUE)

Slide 57

Slide 57 text

Use Python in R
Check your Python
py_config()
## python: /.venv/bin/python
## libpython: /Library/Frameworks/Python.framework...
## pythonhome: /Library/Frameworks/Python.fram...
## virtualenv: /.venv/bin/
## version: 3.7.4 (v3.7.4:e09359112e, Jul 8 2019...
## numpy: /.venv/lib/python3.7/site-packages/...
## numpy_version: 1.16.4
##
## NOTE: Python version was forced by use_python...

Slide 58

Slide 58 text

Use Python in R
Import Python pkg in R:
os <- import("os")
Use Python pkg in R:
os$listdir()
## [1] ".Rhistory" ".DS_Store"
## [2] ".gitignore" ".RData"
## ...
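For readers coming from the Python side: `import("os")` in R corresponds roughly to importing a module by name, and `os$listdir()` to the usual dot access. The sketch below shows that Python-side equivalent with `importlib`; the variable names are illustrative only.

```python
import importlib
import os as os_direct

# Import a module from its name given as a string,
# much as the R call passes "os" as a character value.
os_module = importlib.import_module("os")

# Dot access in Python corresponds to `$` access from R.
entries = os_module.listdir(".")
print(entries[:5])
```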

Slide 59

Slide 59 text

Use Python in R
Import Python source file
File > New File > Python script
import pandas as pd

def pd_load_csv(path):
    df = pd.read_csv(path)
    return df

def pd_head(df, n = 3):
    return df.head(n)

source_python("")

Slide 60

Slide 60 text

Use Python in R
Import Python source
source_python("")
dat <- pd_load_csv("hoge.csv")
pd_head(dat)
iris %>% pd_head

Slide 61

Slide 61 text

Benchmark test
source_python("")
f_pd <- function(path) pd_load_csv(path)
f_base <- function(path) read.csv(path)
f_readr <- function(path) readr::read_csv(path)
f_fread <- function(path) data.table::fread(path)

Slide 62

Slide 62 text

Benchmark test
source_python("")
f_pd <- function(path) pd_load_csv(path)
f_base <- function(path) read.csv(path)
f_readr <- function(path) readr::read_csv(path)
f_fread <- function(path) data.table::fread(path)
microbenchmark::microbenchmark(
  pd_load_csv = f_pd(path),
  read.csv = f_base(path),
  read_csv = f_readr(path),
  fread = f_fread(path)) -> mbm
ggplot2::autoplot(mbm)

Slide 63

Slide 63 text

Benchmark test

Slide 64

Slide 64 text

Use Python in R
Import Python source
source_python("")
iris %>% pd_head(5)
## Error in py_call_impl(callable, dots$args, dots$keywords) :
## TypeError: cannot do slice indexing on with
## these indexers [5.0] of

Slide 65

Slide 65 text

Variable type
Python: var = "1" (String), var = 1.0 (Float), var = 1 (Integer)
R: var <- "1" (Character), var <- 1 and var <- 1.0 (Double), var <- 1L (Integer)

Slide 66

Slide 66 text

Use Python in R
Import Python source
source_python("")
iris %>% pd_head(5)     # type ERROR
iris %>% pd_head(5L)    # set as integer
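The cause of the error is that a plain `5` in R is a double, so it reaches Python as the float `5.0`, while `5L` arrives as an int. The sketch below reproduces the same kind of failure with an ordinary Python list standing in for the data frame (an assumption made so that it runs without pandas); `head` and `try_head` are hypothetical helpers.

```python
rows = ["a", "b", "c", "d", "e", "f"]


def head(seq, n=3):
    """Return the first n items; slicing requires n to be an integer."""
    return seq[:n]


def try_head(n):
    """Call head() and report the outcome instead of raising."""
    try:
        return head(rows, n)
    except TypeError as err:
        return f"TypeError: {err}"


print(try_head(5))     # int, as sent by R's 5L
print(try_head(5.0))   # float, as sent by R's 5
```

If a function may be called from R with plain numerics, converting the argument with `int(n)` on the Python side is one way to tolerate both forms.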

Slide 67

Slide 67 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. • Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays). reticulate package URL:

Slide 68

Slide 68 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. • Translation between R and Python objects (for example, between R and Pandas data frames, or between R matrices and NumPy arrays). reticulate package URL:

Slide 69

Slide 69 text

Use Python in Rmd
Create .Rmd file: File > New File > R Markdown
Import Python virtualenv in R chunk
```{r}
library(reticulate)
pyenv <- "/.venv/bin/python"
use_python(python = pyenv, required = TRUE)
```

Slide 70

Slide 70 text

Use Python in Rmd
Use Python in python chunk
```{python}
import pandas as pd
path = "/sample.csv"
df = pd.read_csv(path)
df.head(3)
```

Slide 71

Slide 71 text

Use Python in Rmd
Use Python in python chunk
```{python}
import pandas as pd
path = "/sample.csv"
df = pd.read_csv(path)
df.head(3)
```
preview

Slide 72

Slide 72 text

Use Python in Rmd
Share Python objects between Python chunks
```{python}
import pandas as pd
path = "/sample.csv"
df = pd.read_csv(path)
```
```{python}
df.head(3)
```

Slide 73

Slide 73 text

Use Python in Rmd
Import R object to python chunk
```{python}
import pandas as pd
df = r.iris
df.head(3)
```

Slide 74

Slide 74 text

Use Python in Rmd
Import R object to python chunk
```{python}
import pandas as pd
df = r.iris
```
Import python object to R chunk
```{r}
py <- import_main()
py$df
```

Slide 75

Slide 75 text

Benchmark test in python chunk
```{python}
import pandas as pd
from time import time

path = "/sample.csv"
result = []
for i in range(100):
    start = time()
    df = pd.read_csv(path)
    time_i = time() - start
    result.append(time_i)  # append in place; list.append returns None
```
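A timing loop of this kind can be sketched with the standard library alone. In the version below, `workload` is a stand-in for `pd.read_csv(path)` (an assumption, so that the example runs without pandas or a CSV file), `benchmark` is a hypothetical helper, and `perf_counter` replaces `time` because it is intended for measuring short intervals. Note that `list.append` modifies the list in place and returns `None`, so its result should not be assigned back to the list.

```python
from time import perf_counter


def workload():
    """Stand-in for the CSV read; any callable can be timed the same way."""
    return sum(i * i for i in range(10_000))


def benchmark(func, repeats=100):
    """Run func `repeats` times and return the elapsed seconds of each run."""
    timings = []
    for _ in range(repeats):
        start = perf_counter()
        func()
        timings.append(perf_counter() - start)  # in-place append
    return timings


result = benchmark(workload)
print(f"runs: {len(result)}, fastest: {min(result):.6f} s")
```

Keeping the timings in a plain list of floats means they convert to an R numeric vector when read through `py$result`.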

Slide 76

Slide 76 text

Benchmark visualization in R chunk
```{r}
py <- import_main()
py$result %>%
  data.frame(expr = "py_pd", time = .) %>%
  rbind(data.frame(mbm) %>% mutate(time = time/10^9)) %>%
  ggplot(aes(expr, log10(time))) +
  geom_violin() +
  coord_flip()
```

Slide 77

Slide 77 text

Benchmark visualization in R chunk

Slide 78

Slide 78 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. reticulate package

Slide 79

Slide 79 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. reticulate package

Slide 80

Slide 80 text

Run Python in RStudio
1. Attach Python virtualenv in R
library(reticulate)
pyenv <- "/.venv/bin/python"
use_python(python = pyenv, required = TRUE)
2. Create .py file: File > New File > Python script
3. Write in .py file: a = 1

Slide 81

Slide 81 text

Slide 82

Slide 82 text

No content

Slide 83

Slide 83 text

No content

Slide 84

Slide 84 text

(escape key)

Slide 85

Slide 85 text

No content

Slide 86

Slide 86 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. reticulate package

Slide 87

Slide 87 text


Slide 88

Slide 88 text

Projects RStudio

Slide 89

Slide 89 text

"Sandboxed" Python Isolated & Independent virtual environment for security & reproducibility [python] version = "3.7" [packages] cycler==0.10.0 kiwisolver==1.1.0 matplotlib==3.1.1 numpy==1.16.4 opencv-python== pandas==0.25.0 pyparsing==2.4.0 PypeR==1.1.2 ... [python] version = "2.7" [packages] numpy==1.16.4 ...

Slide 90

Slide 90 text

Pipenv → "Sandboxed" Python manager
Create virtualenv:
$ cd
$ pipenv --python 3.7
.venv ← interpreter, env info
Pipfile ← package info

Slide 91

Slide 91 text

Use Python in R
Install reticulate from CRAN:
install.packages("reticulate")
Attach Python virtualenv:
library(reticulate)
pyenv <- "/.venv/bin/python"
use_python(python = pyenv, required = TRUE)

Slide 92

Slide 92 text

Use Python in R
Import Python source
pd <- import("pandas")
source_python("")
Share Python objects between Python chunks in Rmd
```{python}
import pandas as pd
path = "/sample.csv"
df = pd.read_csv(path)
```

Slide 93

Slide 93 text

Run Python in RStudio
1. Attach Python virtualenv in R
library(reticulate)
pyenv <- "/.venv/bin/python"
use_python(python = pyenv, required = TRUE)
2. Create .py file: File > New File > Python script
3. Write in .py file: a = 1

Slide 94

Slide 94 text

• Calling Python from R in a variety of ways including R Markdown, sourcing Python scripts, importing Python modules, and using Python interactively within an R session. reticulate package

Slide 95

Slide 95 text

Input Output Do NOT start from here Whatever … 1 2 3

Slide 96

Slide 96 text

Enjoy!!! KTM

Slide 97

Slide 97 text

Bar dradra KTM

Slide 98

Slide 98 text

No content

Slide 99

Slide 99 text

No content