Slide 1

Slide 1 text

Interoperability between Bioconductor and Python for scRNA-seq analysis Luke Zappia @_lazappi_

Slide 2

Slide 2 text

What is interoperability? “Ability to quickly and easily switch between languages/platforms as required”

Slide 3

Slide 3 text

Why interoperability?
1. Take advantage of each language's strengths
2. Make use of existing packages
3. Avoid unnecessary reimplementation

Slide 4

Slide 4 text

Bulk RNA-seq analysis

Slide 5

Slide 5 text

scRNA-seq analysis Seurat CRAN

Slide 6

Slide 6 text

Ecosystems

Slide 7

Slide 7 text

How?

Slide 8

Slide 8 text

{reticulate} {basilisk} B Python environments R/Python interface scRNA-seq objects Velocity analysis

Slide 9

Slide 9 text

Disclaimer Most (almost all) of this is not my work Package Developer @GitHub Python Alternative

Slide 10

Slide 10 text

{reticulate} Kevin Ushey @kevinushey J.J. Allaire @jjallaire Yuan Tang @terrytangyuan RStudio rstudio.org rpy2 install.packages("reticulate")

Slide 11

Slide 11 text

{reticulate} in R
library(reticulate)
# Set Python environment
> use_python("/path/to/my/python")
# use_virtualenv("my_venv")
# use_condaenv("my_conda_env")
# Import Python libraries
> pandas <- import("pandas")
# Implicitly convert between R and Python
> pandas$DataFrame(data = list("Col1" = 1:2, "Col2" = 3:4))
  Col1 Col2
1    1    3
2    2    4
# Explicitly convert between R and Python
> vec <- 1:4
> vec
[1] 1 2 3 4
> py_list <- r_to_py(vec)
> py_list
[1, 2, 3, 4]
> py_to_r(py_list)
[1] 1 2 3 4

Slide 12

Slide 12 text

{reticulate} in R Markdown
```{r}
# A normal R chunk
vec <- 1:4
vec
```
[1] 1 2 3 4
```{python}
# A native Python chunk
ls = [5, 6, 7, 8]
ls
```
[5, 6, 7, 8]
```{r}
# Access Python from R
mean(py$ls)
```
6.5
```{python}
# Access R from Python
sum(r.vec) / len(r.vec)
```
2.5

Slide 13

Slide 13 text

Conversion
R                         Python
Single-element vector     Scalar
Multi-element vector      List
List of multiple types    Tuple
Named list                Dict
Matrix/Array              NumPy array
data.frame                Pandas DataFrame
Function                  Python function
NULL, TRUE, FALSE         None, True, False
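
To make the table concrete, the sketch below encodes it as a small Python lookup that names the R type each built-in Python value corresponds to. It only illustrates the table and is not {reticulate}'s actual conversion logic. The function name `r_equivalent` is hypothetical, and NumPy arrays and pandas DataFrames are left out so that the snippet needs only the standard library.

```python
from collections.abc import Mapping


def r_equivalent(value):
    """Return the R type (per the conversion table) for a Python value.

    Hypothetical helper for illustration only; {reticulate} does the real
    conversion and handles more cases (NumPy arrays, pandas DataFrames).
    """
    if value is None:
        return "NULL"
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float, str)):
        return "single-element vector"
    if isinstance(value, list):
        return "multi-element vector"
    if isinstance(value, tuple):
        return "list of multiple types"
    if isinstance(value, Mapping):
        return "named list"
    if callable(value):
        return "function"
    return "not covered by the table"


# Print the R counterpart for one example of each row in the table.
for example in (3.14, [1, 2, 3, 4], (1, "a"), {"Col1": [1, 2]}, len, None, True):
    print(f"{example!r:>25} -> {r_equivalent(example)}")
```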

Slide 14

Slide 14 text

Limitations
Manage Python environment
Familiarity with Python syntax
Only supports common data structures

Slide 15

Slide 15 text

{basilisk} Aaron Lun @LTLA Image from Ipipipourax via Wikimedia Commons (CC BY-SA 3.0) https://commons.wikimedia.org/wiki/File:Basilik_color%C3%A9.jpg BiocManager::install("basilisk")

Slide 16

Slide 16 text

{basilisk}
Developer
1. Define an environment
my_env <- basilisk::BasiliskEnvironment(
  envname = "my_env",
  pkgname = "myPkg",
  packages = c("pandas==1.1.2", ...)
)
2. Create a {reticulate} function
my_py_fun <- function(...) {
  pandas <- import("pandas")
  ...
  return(output)
}
3. Wrap the function in the environment
my_r_fun <- function(...) {
  output <- basilisk::basiliskRun(
    env = my_env,
    fun = my_py_fun,
    ...
  )
}
User
library(myPkg)
output <- my_r_fun(...)
Set up Python (Conda) environment (first time)...
Run my_py_fun() in the environment...
Return output

Slide 17

Slide 17 text

Advantages
User doesn’t require Python code
Automatic environment creation
Different environments/dependencies for each package

Slide 18

Slide 18 text

{zellkonverter} Aaron Lun @LTLA Luke Zappia @lazappi anndata2ri BiocManager::install("zellkonverter")

Slide 19

Slide 19 text

{basilisk} .h5ad file readH5AD() AnnData2SCE() SingleCellExperiment ... .h5ad file AnnData {basilisk} writeH5AD() SCE2AnnData() AnnData AnnData2SCE() SingleCellExperiment SCE2AnnData() AnnData AnnData
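
On the Python side, a file written by writeH5AD() can be opened with the anndata package, and a modified object can be saved again for readH5AD(). The following is a minimal sketch assuming anndata is installed. The helper name `round_trip_h5ad` and the file paths in the usage note are hypothetical.

```python
def round_trip_h5ad(in_path, out_path):
    """Read an .h5ad file, report its size, and write it back out.

    Hypothetical helper: in_path would be a file written from R with
    zellkonverter's writeH5AD(); out_path can be read back with readH5AD().
    """
    # Imported here so the module loads even where anndata is not installed.
    import anndata

    adata = anndata.read_h5ad(in_path)
    # AnnData stores cells as rows (obs) and genes as columns (var),
    # the transpose of a SingleCellExperiment.
    print(f"{adata.n_obs} cells x {adata.n_vars} genes")
    adata.write_h5ad(out_path)
    return adata
```

A call might look like `round_trip_h5ad("from_r.h5ad", "to_r.h5ad")`, with any Python-side analysis added between the read and the write.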

Slide 20

Slide 20 text

No content

Slide 21

Slide 21 text

{anndata} Robrecht Cannoodt @rcannood {sceasy} Vladimir Kiselev @wikiselev Ni Huang @nh3 install.packages("anndata") anndata remotes::install_github("sceasy")

Slide 22

Slide 22 text

{velociraptor} Kevin Rue-Albrecht @kevinrue Aaron Lun @LTLA Charlotte Soneson @csoneson scvelo BiocManager::install("velociraptor")

Slide 23

Slide 23 text

{basilisk} SingleCellExperiment scvelo() AnnData2SCE() AnnData scv.tl.velocity(...) scv.tl.latent_time(...) ... AnnData X SingleCellExperiment
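
The Python calls named on this slide can also be run directly with scVelo. The sketch below follows the usual order for scVelo's dynamical model, assuming scvelo is installed. The helper name `run_dynamical_velocity` and the parameter values are illustrative assumptions, and {velociraptor} itself may use different steps or defaults.

```python
def run_dynamical_velocity(adata):
    """Run a basic scVelo dynamical-model workflow on an AnnData object.

    Hypothetical helper; adata needs 'spliced' and 'unspliced' layers.
    Parameter values are illustrative, not {velociraptor}'s settings.
    """
    # Imported here so the module loads even where scvelo is not installed.
    import scvelo as scv

    # Preprocessing: select and normalise genes, then compute moments.
    scv.pp.filter_and_normalize(adata, min_shared_counts=20, n_top_genes=2000)
    scv.pp.moments(adata, n_pcs=30, n_neighbors=30)

    # Fit the dynamical model, then estimate velocities from it.
    scv.tl.recover_dynamics(adata)
    scv.tl.velocity(adata, mode="dynamical")
    scv.tl.velocity_graph(adata)

    # Latent time is derived from the fitted dynamics and the velocity
    # graph; it is stored in adata.obs["latent_time"].
    scv.tl.latent_time(adata)
    return adata
```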

Slide 24

Slide 24 text

scVelo Volker Bergen @Volker Bergen pip install scvelo

Slide 25

Slide 25 text

RNA velocity

Slide 26

Slide 26 text

Dynamical RNA velocity

Slide 27

Slide 27 text

No content

Slide 28

Slide 28 text

No content

Slide 29

Slide 29 text

CellRank Marius Lange @Marius1311 pip install cellrank

Slide 30

Slide 30 text

No content

Slide 31

Slide 31 text

No content

Slide 32

Slide 32 text

No content

Slide 33

Slide 33 text

Pancreas development

Slide 34

Slide 34 text

Summary
Interoperability between Bioconductor and Python is already possible
{zellkonverter} converts between SingleCellExperiment and AnnData objects
scVelo and CellRank for analysis of dynamic processes

Slide 35

Slide 35 text

Thanks! Luke Zappia @_lazappi_ @lazappi lazappi.id.au scvelo.org cellrank.org Theis Lab @fabian_theis @ICBmunich www.comp.bio