Team RStats Presentation

#R programming #sc-RNAseq #leiden #clustering #BPPARAM #parallelisation #Multicore #PCA

Manisha Barse

April 24, 2026

Transcript

  1. Optimizing Runtime in Single-Cell Workflow Through Parallelization of Key Steps

    Manisha Barse, April 24, 2026. Team RStats Presentation
  2. Workflow: QC to Spatial Registration in sc-RNAseq

    - Alignment to reference genome (cellranger)
    - Quality control at cell level
        - Identify and exclude empty droplets (DropletUtils)
        - Identify and exclude outliers if a cell meets any 1 of 3 criteria: high mitochondrial expression ratio, low total gene counts, low total genes detected
        - Identify and drop doublets
    - Dimensionality reduction
    - Batch correction (Harmony)
    - Clustering (bluster package)
    - Cluster annotation (DeconvoBuddies::findMarkers_1vAll)
    - Cluster registration to reference (layer_stat_cor())
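
The "any 1 of 3 criteria" rule can be illustrated with a minimal base-R sketch. The deck does not say how the thresholds are set, so this assumes a median plus or minus 3 MAD cutoff. `is_outlier` is a hypothetical helper written for this example and the metric values are invented toy data.

```r
# Toy per-cell QC metrics (hypothetical values, not from the deck).
qc <- data.frame(
    sum = c(5000, 4800, 5200, 300, 5100, 4900),         # total counts per cell
    detected = c(2000, 1900, 2100, 1950, 150, 2050),    # genes detected per cell
    mito_ratio = c(0.05, 0.04, 0.06, 0.05, 0.05, 0.60)  # mitochondrial fraction
)

# Hypothetical helper: flag values more than `nmads` MADs below ("lower")
# or above ("higher") the median. The MAD rule is an assumption.
is_outlier <- function(x, nmads = 3, type = c("lower", "higher")) {
    type <- match.arg(type)
    med <- median(x)
    dev <- nmads * mad(x)
    if (type == "lower") x < med - dev else x > med + dev
}

low_sum <- is_outlier(log10(qc$sum), type = "lower")
low_detected <- is_outlier(log10(qc$detected), type = "lower")
high_mito <- is_outlier(qc$mito_ratio, type = "higher")

# A cell is discarded if it fails ANY of the three criteria.
discard <- low_sum | low_detected | high_mito
qc_kept <- qc[!discard, ]
```

Combining the flags with `|` rather than `&` is what implements "any 1 of 3": a cell failing a single metric is enough to drop it.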
  3. Dimensionality Reduction

    - QC'd sce object
    - Run Deviance Feature Selection
    - Run nullResiduals
    - Get the HDGs
    - runPCA(), runUMAP(), runTSNE()

        library("BiocParallel")
        library("scater") # provides runPCA(), runTSNE(), runUMAP()

        # One worker per CPU allocated by SLURM
        ncores <- Sys.getenv('SLURM_CPUS_ON_NODE') |> as.numeric()
        bp <- MulticoreParam(workers = ncores)

        message(Sys.time(), " - running PCA - ")
        sce <- runPCA(sce,
            exprs_values = "binomial_deviance_residuals",
            subset_row = hdgs,
            ncomponents = 100, # default=50
            name = "GLMPCA_approx",
            BSPARAM = BiocSingular::IrlbaParam(), # Fast approx SVD algo
            BPPARAM = bp
        )

        message(Sys.time(), " - running TSNE")
        sce <- runTSNE(sce, dimred = "GLMPCA_approx", BPPARAM = bp)

        message(Sys.time(), " - running UMAP")
        sce <- runUMAP(sce, dimred = "GLMPCA_approx", BPPARAM = bp)
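
One caveat with reading the worker count from `SLURM_CPUS_ON_NODE`: outside a SLURM job the variable is unset, and `as.numeric("")` yields `NA`. The following is a minimal sketch of a fallback, assuming a single worker is an acceptable default. `get_ncores` is a hypothetical helper, not part of BiocParallel.

```r
# Hypothetical helper: read the worker count from an environment variable,
# falling back to `default` when the variable is unset, empty, or not a
# positive integer.
get_ncores <- function(var = "SLURM_CPUS_ON_NODE", default = 1L) {
    n <- suppressWarnings(as.integer(Sys.getenv(var, unset = NA)))
    if (is.na(n) || n < 1L) default else n
}

ncores <- get_ncores()
```

The result can then be passed to `MulticoreParam(workers = ncores)` as in the slides, so the same script runs both inside and outside a SLURM allocation.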
  4. Batch Correction

    Harmony: RunHarmony(), then runUMAP(), runTSNE()

        library("BiocParallel")
        library("harmony") # provides RunHarmony()
        library("scater")  # provides runTSNE(), runUMAP()

        ncores <- Sys.getenv('SLURM_CPUS_ON_NODE') |> as.numeric()
        bp <- MulticoreParam(workers = ncores)

        message("Running Harmony - ", Sys.time())
        sce <- RunHarmony(sce, group.by.vars = correction, verbose = TRUE)

        message(Sys.time(), " - running TSNE")
        sce <- runTSNE(sce, dimred = "HARMONY", BPPARAM = bp)

        message(Sys.time(), " - running UMAP")
        sce <- runUMAP(sce, dimred = "HARMONY", BPPARAM = bp)
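
The slides log a timestamp with `message(Sys.time(), ...)` before each step. A small wrapper that also reports elapsed time makes before/after runtime comparisons easier. This is only a sketch in base R: `timed` is a hypothetical helper and the toy step stands in for a real call such as `runTSNE()`.

```r
# Hypothetical helper: log start and end timestamps around an expression
# and return the expression's value.
timed <- function(label, expr) {
    message(Sys.time(), " - running ", label)
    # `expr` is evaluated lazily, so the work happens inside system.time().
    elapsed <- system.time(result <- expr)[["elapsed"]]
    message(Sys.time(), " - finished ", label, " in ", round(elapsed, 2), " s")
    result
}

# Toy step standing in for an expensive call.
x <- timed("toy step", {
    Sys.sleep(0.1)
    42
})
```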
  5. Leiden Clustering

    IMPORTANT: Balance speed with memory overhead

        library("bluster")
        library("BiocParallel")
        library("SingleCellExperiment") # provides reducedDim()

        # Grid of k and resolution values to sweep
        params <- expand.grid(
            k = c(20, 25, 30),
            resolution = seq(0.1, 1.0, by = 0.1)
        )

        # Extract the HARMONY matrix before parallelising
        X <- as.matrix(reducedDim(sce, "HARMONY")[, 1:50])
        rm(sce) # only X is needed below; dropping sce reduces memory use

        sweep.k.resolution <- bplapply(seq_len(nrow(params)), function(i) {
            clusterRows(
                X,
                NNGraphParam(
                    k = params$k[i],
                    cluster.fun = "leiden", # other options: walktrap, louvain
                    cluster.args = list(resolution = params$resolution[i])
                )
            )
        }, BPPARAM = BiocParallel::MulticoreParam(4))
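
Because the sweep iterates over `seq_len(nrow(params))`, element `i` of the result corresponds to row `i` of `params`. A minimal sketch of summarising the sweep follows, assuming each element is a factor of cluster labels. The stand-in results are fabricated toy factors, not real clustering output, and `sweep_results` and `n_clusters` are names chosen for this example.

```r
params <- expand.grid(
    k = c(20, 25, 30),
    resolution = seq(0.1, 1.0, by = 0.1)
)

# Stand-in for the parallel sweep output: one factor of cluster labels
# per parameter combination (toy labels for 100 cells).
sweep_results <- lapply(seq_len(nrow(params)), function(i) {
    n <- i %% 5 + 2  # arbitrary toy number of clusters
    factor(rep_len(seq_len(n), 100))
})

# Attach the number of clusters found to each (k, resolution) pair.
params$n_clusters <- vapply(sweep_results, nlevels, integer(1))
head(params)
```

Keeping the summary in the same data frame as the grid makes it straightforward to pick a `k` and `resolution` pair by inspecting how the cluster count changes across the sweep.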
  6. DeconvoBuddies::findMarkers_1vAll

    Calculates the 1 vs. All standard fold change for each gene x cell type.

    raw_logFC (default FALSE): whether to also return non-standardized logFC values in addition to the standardized values. Note: setting this to TRUE roughly doubles run time.

        library("BiocParallel")
        library("DeconvoBuddies")

        ncores <- Sys.getenv('SLURM_CPUS_ON_NODE') |> as.numeric()
        bp <- MulticoreParam(workers = ncores)

        annotation <- findMarkers_1vAll(
            sce,
            assay_name = "logcounts",
            cellType_col = "cellType", # input_k_resolution
            BPPARAM = bp
        )