1 Load packages
2 Seurat object
- 2.1 Download one 10X Genomics Visium dataset and load it into Seurat
- 2.2 Explore the Seurat object
3 Analysis pipeline in Seurat
4 Convert data formats between R and Python
5 Acknowledgements

1 Load packages

Besides Seurat, we need to import some useful packages.

dplyr and ggplot2 are part of tidyverse, a readable and flexible R language for solving data science challenges. I personally prefer the coding style with tidyverse, but you may use base R too.
patchwork combines separate ggplots into the same graphic with easy access to control layouts.
limma is a Bioconductor package to analyze microarray data. It is not essential for our analysis, but may provide a more efficient implementation of the Wilcoxon rank sum test in differential expression analysis.

library(Seurat)
library(dplyr) # data manipulation
library(ggplot2)
library(patchwork)
# install.packages('BiocManager')
# BiocManager::install('limma')
library(limma) # optional

2 Seurat object

We first load one spatial transcriptomics dataset into Seurat, and then explore the Seurat object a bit for single-cell data storage and manipulation. One 10X Genomics Visium dataset will be analyzed with Seurat in this tutorial, and you may explore other dataset sources from various sequencing technologies, and other computational toolkits listed in this (non-exhaustive) resource spreadsheet.

2.1 Download one 10X Genomics Visium dataset and load it into `Seurat`

A spatial gene expression dataset of mouse brain serial section 2 (Sagittal-Posterior) collected by Space Ranger 1.1.0. will be analyzed throughout the tutorial. Both the gene expression matrix and spatial imaging data are necessary for the computational analysis.

For illustration purposes, I have downloaded them from 10x Genomics, and uploaded them to Google Drive. Alternatively, you can download the same files directly from 10x Genomics, or explore other spatial gene expression datasets out there.

The data files we will be using today include:

a (filtered) feature / cell matrix HDF5 file (.h5)
a spatial imaging data folder (.tar.gz)

# specify your working directory
root_dir <- "~/Downloads/" 
setwd(root_dir)
# url prefix to download files
url_prefix <- "https://drive.google.com/uc?export=download&id="
# url IDs are extracted from Google Drive
spatial_folder_id <- "13B6ulZwz9XmWBD1RsnTHTMn3ySgIpFCY"
h5_mat_id <- "1nnch2ctksJ4rXYG5FzlPh3bIgnylTicV"
# file names
spatial_folder_name <- "./V1_Mouse_Brain_Sagittal_Posterior_Section_2_spatial.tar.gz"
h5_mat_name <- "./V1_Mouse_Brain_Sagittal_Posterior_Section_2_filtered_feature_bc_matrix.h5"
# download files from the resource URL to a destination path where the file is saved 
download.file(url = paste0(url_prefix, spatial_folder_id), 
              destfile = file.path(spatial_folder_name))
download.file(url = paste0(url_prefix, h5_mat_id), 
              destfile = file.path(h5_mat_name))
# extract (or list) contents from the tar archives
untar(spatial_folder_name)
untar(spatial_folder_name, list = TRUE)  # list contents

## [1] "spatial/"                          "spatial/tissue_lowres_image.png"  
## [3] "spatial/tissue_positions_list.csv" "spatial/aligned_fiducials.jpg"    
## [5] "spatial/tissue_hires_image.png"    "spatial/scalefactors_json.json"   
## [7] "spatial/detected_tissue_image.jpg"

# Load a 10x Genomics Visium Spatial Experiment into a Seurat object
brain_data <- Seurat::Load10X_Spatial(
  # The directory contains the read count matrix H5 file and the image data in a subdirectory called `spatial`. 
  data.dir = root_dir, 
  filename = h5_mat_name,
  assay = "Spatial", # specify name of the initial assay
  slice = "slice1", # specify name of the stored image
  filter.matrix = TRUE, 
  to.upper = FALSE
)

brain_data

## An object of class Seurat 
## 32285 features across 3289 samples within 1 assay 
## Active assay: Spatial (32285 features, 0 variable features)

2.2 Explore the `Seurat` object

A Seurat object serves as a container that contains both data (like the count matrix) and analysis (like dimension reduction or clustering results) for a single-cell dataset.

2.2.1 Dimension of the active assay

Extract the dimension of the active assay using dim, nrow, or ncol.

dim(x = brain_data) # the number of features (genes) by samples (spots)

## [1] 32285  3289

nrow(x = brain_data) # the number of features
ncol(x = brain_data) # the number of samples

2.2.2 Feature and sample names

Extract the feature and sample names using rownames or colnames.

head(x = rownames(brain_data), n = 5)

## [1] "Xkr4"    "Gm1992"  "Gm19938" "Gm37381" "Rp1"

tail(x = colnames(brain_data), n = 5)

2.2.3 Sample-level metadata

Sample-level metadata is stored as a data.frame, where each row correspond to one sample (e.g. cell or spot) and each column correspond to one sample-level metadata field. It can be accessed via [[ extract operator, the meta.data object, or the $ sigil ($ extracts one single column at a time). Row names in the metadata need to match the column names of the counts matrix.

class(brain_data[[]])

## [1] "data.frame"

colnames(brain_data[[]]) # automatically calculated while creating the Seurat object

## [1] "orig.ident"       "nCount_Spatial"   "nFeature_Spatial"

head(brain_data@meta.data)

##                       orig.ident nCount_Spatial nFeature_Spatial
## AAACAAGTATCTCCCA-1 SeuratProject          14786             4811
## AAACACCAATAACTGC-1 SeuratProject           9009             2748
## AAACAGCTTTCAGAAG-1 SeuratProject           7853             2665
## AAACAGGGTCTATATT-1 SeuratProject          20932             5334
## AAACAGTGTTCCTGGG-1 SeuratProject          42740             8115
## AAACATTTCCCGGATT-1 SeuratProject          10320             3834

brain_data$nCount_Spatial[1:3]

## AAACAAGTATCTCCCA-1 AAACACCAATAACTGC-1 AAACAGCTTTCAGAAG-1 
##              14786               9009               7853

# nFeature_Spatial: the number of unique genes in each sample
sum(brain_data$nFeature_Spatial ==  colSums(brain_data@assays$Spatial@counts > 0))

## [1] 3289

# nCount_Spatial: the total number of detected molecules in each sample
sum(brain_data$nCount_Spatial ==  colSums(brain_data@assays$Spatial@counts))

## [1] 3289

2.2.4 objects(e.g. `Assay`) together with feature-level metadata

# A vector of names of associated objects can be had with the names function
# These can be passed to the double [[ extract operator to pull them from the Seurat object
names(x = brain_data)

## [1] "Spatial" "slice1"

brain_data[['Spatial']] # equivalent to: brain_data@assays$Spatial

## Assay data with 32285 features for 3289 cells
## First 10 features:
##  Xkr4, Gm1992, Gm19938, Gm37381, Rp1, Sox17, Gm37587, Gm37323, Mrpl15,
## Lypla1

brain_data[['slice1']] # equivalent to: brain_data@images$slice1

## Spatial data from the VisiumV1 technology for 3289 samples
## Associated assay: Spatial 
## Image key: slice1_

Each Seurat object consists of one or more Assay objects (basis unit of Seurat) as individual representations of single-cell expression data. Examples of the Assay objects include RNA, ADT in CITE-seq, or Spatial. Each Assay stores multiple slots, including raw (counts), normalized (data) and scaled data (scaled.data) as well as a vector of variable features (var.features) and feature-level metadata (meta.features).

brain_data@assays$Spatial@counts[5:10, 1:3]

## 6 x 3 sparse Matrix of class "dgCMatrix"
##         AAACAAGTATCTCCCA-1 AAACACCAATAACTGC-1 AAACAGCTTTCAGAAG-1
## Rp1                      .                  .                  .
## Sox17                    .                  .                  .
## Gm37587                  .                  .                  .
## Gm37323                  .                  .                  .
## Mrpl15                   1                  .                  1
## Lypla1                   1                  .                  .

Feature-level metadata is associated with each individual assay. Feature-level metadata can be accessed through the double bracket [[ extract operator on the Assay objects, or the meta.features slot.

brain_data[['Spatial']]@meta.features
head(brain_data[['Spatial']][[]])

Other objects containing the single-cell data analysis results will be discussed later after going through the analysis pipeline.

3 Analysis pipeline in Seurat

The general steps to perform preprocessing, dimensiona reduction and clustering for spatial transcriptomics data are quite similar to the Seurat workflow analyzing single-cell RNA sequencing data.

3.1 Preprocess data

A standard preprocessing workflow includes the selection and filtration of cells based on quality control (QC) metrics, data normalization and (optional) scaling, and the detection of highly variable features.

3.1.1 Quality control

A few common QC metrics include

The number of unique genes detected in each sample (nFeature_Spatial).
The total number of molecules detected within a sample (nCount_Spatial).
The percentage of reads that map to the mitochondrial genome.

brain_data[["percent.mt"]] <- PercentageFeatureSet(brain_data, 
                                                   pattern = "^mt-")
VlnPlot(
  brain_data, features = c("nFeature_Spatial", "nCount_Spatial", "percent.mt"), 
  pt.size = 0.1, ncol = 3) & 
  theme(axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

# Jointly (rather than separately) consider the QC metrics when filtering
plot1 <- FeatureScatter(
  brain_data, feature1 = "nCount_Spatial", feature2 = "percent.mt") + NoLegend()
plot2 <- FeatureScatter(
  brain_data, feature1 = "nCount_Spatial", feature2 = "nFeature_Spatial") +
  NoLegend()
plot1 + plot2

brain_subset <- subset(
  brain_data, 
  subset = nFeature_Spatial < 8000 & nFeature_Spatial > 1000 & 
    nCount_Spatial < 50000 & percent.mt < 30)

print(paste("Filter out", ncol(brain_data) - ncol(brain_subset), 
            "samples because of the outlier QC metrics, with", ncol(brain_subset),
            "samples left."))

## [1] "Filter out 96 samples because of the outlier QC metrics, with 3193 samples left."

3.1.2 Normalization

The variance of molecular counts expresses spatial heterogeneity which cannot be solely explained by technical noise. Satija Lab and Collaborators recommends normalization using SCTransform (Hafemeister and Satija, 2019) in order to account for technical bias while preserving true biological differences.

SpatialFeaturePlot(
  brain_subset, features = c("nFeature_Spatial", "nCount_Spatial", "percent.mt")) &
  theme(legend.position = "bottom")

brain_norm <- SCTransform(brain_subset, assay = "Spatial", verbose = FALSE)

names(brain_norm)

## [1] "Spatial" "SCT"     "slice1"

dim(brain_norm@assays$SCT@counts)

## [1] 17980  3193

dim(brain_norm@assays$SCT@data)

## [1] 17980  3193

dim(brain_norm@assays$SCT@scale.data)

## [1] 3000 3193

SCTransform returns the Seurat object with a new assay called SCT, where its counts slot stores the corrected UMI counts, the data slot stores the log-normalized version of the corrected UMI counts, and scale.data slot stores the pearson residuals (normalized values) and is used as PCA input.

The scale.data slot in output assay are subset to contain only the variable genes with return.only.var.genes = TRUE by default.

3.2 Downstream tasks (dimension reduction, data visualization, cluster annotation, differential expression)

brain_obj <- RunPCA(brain_norm, assay = "SCT", verbose = FALSE)
# compute K nearest neighbors (KNN)
brain_obj <- FindNeighbors(brain_obj, reduction = "pca", dims = 1:30)
# Leiden algorithm for community detection
brain_obj <- FindClusters(brain_obj, verbose = FALSE)
# PCA result is the default UMAP input, use dimensions 1:30 as input features
brain_obj <- RunUMAP(brain_obj, reduction = "pca", dims = 1:30)

plot3 <- DimPlot(brain_obj, reduction = "umap", label = TRUE) + NoLegend()
plot4 <- SpatialDimPlot(brain_obj, label = TRUE, label.size = 3) + NoLegend()
plot3 + plot4

The reductions object stores a dimensionality reduction taken out in Seurat; each slot in reductions consists of a cell embeddings matrix, a feature loadings matrix, and a projected feature loadings matrix.

brain_obj@reductions

## $pca
## A dimensional reduction object with key PC_ 
##  Number of dimensions: 50 
##  Projected dimensional reduction calculated:  FALSE 
##  Jackstraw run: FALSE 
##  Computed using assay: SCT 
## 
## $umap
## A dimensional reduction object with key UMAP_ 
##  Number of dimensions: 2 
##  Projected dimensional reduction calculated:  FALSE 
##  Jackstraw run: FALSE 
##  Computed using assay: SCT

Ordinary differential expression analysis does not take spatial information into account, and thus it is generally applicable for both scRNA-seq data and spatial transcriptomics data to explore cellular heterogeneity of cell types and states.

# identity class of each sample
table(brain_obj@active.ident)

## 
##   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16 
## 414 404 301 282 227 210 210 195 191 160 140 129 113  74  70  43  30

# find all markers of cluster 10
cluster10_markers <- FindMarkers(brain_obj, ident.1 = 10, min.pct = 0.25)
head(cluster10_markers, n = 5)

##               p_val avg_log2FC pct.1 pct.2     p_val_adj
## Esr1  4.302916e-154   0.645829 0.471 0.021 7.736643e-150
## Trh   1.336079e-140   2.235272 0.557 0.041 2.402270e-136
## Ccn3  8.237450e-118   2.262304 1.000 0.253 1.481094e-113
## Pcdh8 8.330506e-115   1.562598 0.979 0.237 1.497825e-110
## Trhr  1.723452e-114   0.752990 0.543 0.048 3.098767e-110

VlnPlot(brain_obj, features = c("Esr1", "Trh", "Ccn3"))

SpatialFeaturePlot(object = brain_obj, 
                   features = rownames(cluster10_markers)[1:3], 
                   alpha = c(0.1, 1), ncol = 3)

The following code chunk is not evaluated due to computational time constraints, but you are encouraged to give it a try if interested.

# find markers for every cluster compared to all remaining cells, 
# report only the positive ones
# this code chunk is not evaluated for now because of time constraints
brain_obj_markers <- FindAllMarkers(brain_obj, only.pos = TRUE, min.pct = 0.25, 
                                    logfc.threshold = 0.25)
brain_obj_markers %>%
    group_by(cluster) %>%
    slice_max(n = 2, order_by = avg_log2FC)
top10 <- brain_obj_markers %>%
  group_by(cluster) %>%
  top_n(n = 10, wt = avg_log2FC)
DoHeatmap(brain_obj, features = top10$gene) + NoLegend()

3.3 Identify spatially variable genes

Seurat is able to calculate two metrics in order for the users to identify spatially variable genes, namely Moran’s I statistic and marker-variogram. Spatial coordinates of the samples are incorporated to identify features with spatial heterogeneity in its expression.

3.3.1 Moran’s I

Moran’s I is a global metric measuring the correlation of gene expression values between local observed values and the average of neighboring values. We can interpret Moran’s I as a spatial autocorrelation metric similar to the Pearson correlation coefficient in the context of spatial statistics. We can either calculate the z-score under normality assumption or perform permutation test to see whether we can reject the null hypothesis of zero spatial autocorrelation.

\[\begin{align*} \text{Moran's I} = \frac{N}{\sum_{i, j}W_{ij}}\frac{\sum_{i}\sum_{j}W_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i}(x_i - \bar{x})^2} \end{align*}\] , where $N$ is the total number of spatial location units indexed by $(i, j)$, and $W$ is a weight matrix to be discussed below. Recall that the Pearson correlation coefficient is \[\begin{align*} r= \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}} \end{align*}\]

Fig: Interpretation of Moran’s I statistic, image source.

Our assumptions about the spatial neighboring relationship can be encoded with a weight matrix $W$, where $W_{i, j}$ can be either contiguity-based, or distance-based (inverse distance). For instance, the following figure illustrates potential cases for higher-order contiguity, when a first-order contiguity is determined based on the existence of a shared boundary or border.

Fig: Examples of contiguity-based spatial neighbors, image source.

brain_moransi <- FindSpatiallyVariableFeatures(
  brain_obj, assay = "SCT", 
  features = VariableFeatures(brain_obj)[1:10],
    selection.method = "moransi")

## Computing Moran's I

## For a more efficient implementation of the Morans I calculation,
## (selection.method = 'moransi') please install the Rfast2 package
## --------------------------------------------
## install.packages('Rfast2')
## --------------------------------------------
## After installation of Rfast2, Seurat will automatically use the more 
## efficient implementation (no further action necessary).
## This message will be shown once per session

moransi_output_df <- brain_moransi@assays$SCT@meta.features %>%
  na.exclude
head(moransi_output_df[order(moransi_output_df$MoransI_observed, decreasing = T), ])

##       MoransI_observed MoransI_p.value moransi.spatially.variable
## Pcp2         0.6926961               0                       TRUE
## Car8         0.5745839               0                       TRUE
## Ttr          0.5635015               0                       TRUE
## Mbp          0.4965477               0                       TRUE
## Plp1         0.4491421               0                       TRUE
## Ptgds        0.2900585               0                       TRUE
##       moransi.spatially.variable.rank
## Pcp2                                1
## Car8                                2
## Ttr                                 3
## Mbp                                 4
## Plp1                                5
## Ptgds                               6

top_features_moransi <- head(
  SpatiallyVariableFeatures(brain_moransi, 
                            selection.method = "moransi"), 3)
SpatialFeaturePlot(brain_moransi, 
                   features = top_features_moransi, ncol = 3, alpha = c(0.1, 1)) + 
  plot_annotation(
  title = "Top 3 genes with the largest Moran's I",
  subtitle = "among 10 top variable genes for illustration purposes")

bottom_features_moransi <- tail(
  SpatiallyVariableFeatures(brain_moransi, 
                            selection.method = "moransi"), 3)
SpatialFeaturePlot(brain_moransi, 
                   features = bottom_features_moransi, ncol = 3, alpha = c(0.1, 1)) + 
  plot_annotation(
  title = "Bottom 3 genes with the smallest Moran's I",
  subtitle = "among 10 top variable genes for illustration purposes")

3.3.2 marker-variogram

A gene-specific empirical variogram $v$ for gene $g$ can be calculated as a function of pair-wise distance $r$ between any two samples $i$ and $j$. A permutation test can be performed to see whether such variogram $v(r)$ is associated with the spatial distance $r$ in trendsceek.

\[\begin{align*} v(r) = \mathbb{E}_{\forall i, j, \text{ s.t. dist}(i, j) = r}\left[(x_i - x_j)^2\right] \end{align*}\] , where $x_i, x_j$ stand for the gene expression of gene $g$ in sample $i, j$ respectively.

Fig: Common theoretical variogram models, image source.

brain_variogram <- FindSpatiallyVariableFeatures(
  brain_obj, assay = "SCT", 
  features = VariableFeatures(brain_obj)[1:10],
    selection.method = "markvariogram")  
variogram_output_df <- brain_variogram@assays$SCT@meta.features %>%
  na.exclude # there are NA rows b/c we only calculated the variogram for 10 genes
head(variogram_output_df[order(variogram_output_df$r.metric.5), ])

##       r.metric.5 markvariogram.spatially.variable
## Ttr   0.08242632                             TRUE
## Pcp2  0.08359531                             TRUE
## Mbp   0.15537902                             TRUE
## Car8  0.16478140                             TRUE
## Plp1  0.25534753                             TRUE
## Ptgds 0.35968268                             TRUE
##       markvariogram.spatially.variable.rank
## Ttr                                       1
## Pcp2                                      2
## Mbp                                       3
## Car8                                      4
## Plp1                                      5
## Ptgds                                     6

The column named r.metric.5 stores the expected values of mark-variogram at a given r (radius) value ($r = 5$ by default) when the marks attached to different points are independent (i.e., there is no spatial autocorrelation).

top_features_variogram <- head(
  SpatiallyVariableFeatures(brain_variogram, 
                            selection.method = "markvariogram"), 3)
SpatialFeaturePlot(brain_variogram, 
                   features = top_features_variogram, ncol = 3, alpha = c(0.1, 1)) + 
  plot_annotation(
  title = "3 genes with the top spatially variable rank (by mark-variogram)",
  subtitle = "among 10 top variable genes for illustration purposes")

bottom_features_variogram <- tail(
  SpatiallyVariableFeatures(brain_variogram, 
                            selection.method = "markvariogram"), 3)
SpatialFeaturePlot(brain_variogram, 
                   features = bottom_features_variogram, ncol = 3, alpha = c(0.1, 1)) + 
  plot_annotation(
  title = "3 genes with the bottom spatially variale rank (by mark-variogram)",
  subtitle = "among 10 top variable genes for illustration purposes")

3.3.3 Other R packages to identify spatially variable genes

spatialDE
- Bioconductor package: https://www.bioconductor.org/packages/release/bioc/html/spatialDE.html
- vignette: https://www.bioconductor.org/packages/release/bioc/vignettes/spatialDE/inst/doc/spatialDE.html
trendsceek
- GitHub repository with a demo: https://github.com/edsgard/trendsceek
SPARK together with its updated version SPARK-X
- GitHub repository: https://xzhoulab.github.io/SPARK/
- an analysis example: https://xzhoulab.github.io/SPARK/02_SPARK_Example/
MERINGUE
- GitHub repository: https://github.com/JEFworks-Lab/MERINGUE
- a vignette analyzing mouse olfactory bulb data: https://github.com/JEFworks-Lab/MERINGUE/blob/master/docs/mOB_analysis.md

4 Convert data formats between R and Python

To convert single-cell data objects between R (e.g. Seurat or SingleCellExperiment) and Python (e.g. AnnData), you can use either SeuratDisk or sceasy. Another option is to use reticulate, which could be a bit more involved and take more memory.

SeuratDisk vignette to convert between Seurat and AnnData: https://mojaveazure.github.io/seurat-disk/articles/convert-anndata.html
sceasy to convert single-cell datasets across different formats: https://github.com/cellgeni/sceasy
reticulate vignette to call Python from R and convert data formats: https://github.com/cellgeni/sceasy

5 Acknowledgements

The tutorial was inspired by the computational assignments I created together with Dr. Yun S. Song in a graduate course (CMPBIO 290: Algorithms for single-cell genomics) at University of California, Berkeley in Fall 2021, and it was largely based on many open-source materials, especially the Seurat tutorials from the Satija Lab. I would also like to thank Salwan Butrus for helpful feedback and suggestions.

References

Seurat essential commands list: https://satijalab.org/seurat/articles/essential_commands.html
Seurat GitHub Wiki: https://github.com/satijalab/seurat/wiki
Luecken M. K. and Theis F. J., Current best practices in single-cell RNA-seq analysis: a tutorial. Mol Syst Biol (2019) (doi: <10.15252/msb.20188746>)
Analysis, visualization, and integration of spatial datasets with Seurat: https://satijalab.org/seurat/articles/spatial_vignette.html
Seurat guided clustering tutorial for scRNA-seq data: https://satijalab.org/seurat/articles/pbmc3k_tutorial.html

sessionInfo()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] limma_3.48.3       patchwork_1.1.1    ggplot2_3.3.5      dplyr_1.0.7       
## [5] SeuratObject_4.0.2 Seurat_4.0.3      
## 
## loaded via a namespace (and not attached):
##   [1] Rtsne_0.15            colorspace_2.0-2      deldir_0.2-10        
##   [4] ellipsis_0.3.2        ggridges_0.5.3        spatstat.data_2.1-0  
##   [7] farver_2.1.0          leiden_0.3.9          listenv_0.8.0        
##  [10] bit64_4.0.5           ggrepel_0.9.1         RSpectra_0.16-0      
##  [13] fansi_0.5.0           codetools_0.2-18      splines_4.1.0        
##  [16] knitr_1.33            polyclip_1.10-0       jsonlite_1.7.2       
##  [19] ica_1.0-2             cluster_2.1.2         png_0.1-7            
##  [22] uwot_0.1.10           shiny_1.6.0           sctransform_0.3.2    
##  [25] spatstat.sparse_2.0-0 compiler_4.1.0        httr_1.4.2           
##  [28] assertthat_0.2.1      Matrix_1.3-4          fastmap_1.1.0        
##  [31] lazyeval_0.2.2        later_1.2.0           htmltools_0.5.1.1    
##  [34] tools_4.1.0           igraph_1.2.6          gtable_0.3.0         
##  [37] glue_1.4.2            RANN_2.6.1            reshape2_1.4.4       
##  [40] Rcpp_1.0.7            scattermore_0.7       jquerylib_0.1.4      
##  [43] vctrs_0.3.8           ape_5.5               nlme_3.1-152         
##  [46] lmtest_0.9-38         xfun_0.24             stringr_1.4.0        
##  [49] globals_0.14.0        mime_0.11             miniUI_0.1.1.1       
##  [52] lifecycle_1.0.1       irlba_2.3.3           goftest_1.2-2        
##  [55] future_1.21.0         MASS_7.3-54           zoo_1.8-9            
##  [58] scales_1.1.1          spatstat.core_2.3-0   promises_1.2.0.1     
##  [61] spatstat.utils_2.2-0  parallel_4.1.0        RColorBrewer_1.1-2   
##  [64] yaml_2.2.1            reticulate_1.20       pbapply_1.4-3        
##  [67] gridExtra_2.3         sass_0.4.0            rpart_4.1-15         
##  [70] stringi_1.7.3         highr_0.9             rlang_0.4.11         
##  [73] pkgconfig_2.0.3       matrixStats_0.60.0    evaluate_0.14        
##  [76] lattice_0.20-44       ROCR_1.0-11           purrr_0.3.4          
##  [79] tensor_1.5            labeling_0.4.2        htmlwidgets_1.5.3    
##  [82] bit_4.0.4             cowplot_1.1.1         tidyselect_1.1.1     
##  [85] parallelly_1.27.0     RcppAnnoy_0.0.19      plyr_1.8.6           
##  [88] magrittr_2.0.1        R6_2.5.1              generics_0.1.0       
##  [91] DBI_1.1.1             pillar_1.6.3          withr_2.4.2          
##  [94] mgcv_1.8-36           fitdistrplus_1.1-5    survival_3.2-11      
##  [97] abind_1.4-5           tibble_3.1.4          future.apply_1.7.0   
## [100] crayon_1.4.1          hdf5r_1.3.3           KernSmooth_2.23-20   
## [103] utf8_1.2.2            spatstat.geom_2.2-2   plotly_4.9.4.1       
## [106] rmarkdown_2.9         grid_4.1.0            data.table_1.14.0    
## [109] digest_0.6.28         xtable_1.8-4          tidyr_1.1.3          
## [112] httpuv_1.6.1          munsell_0.5.0         viridisLite_0.4.0    
## [115] bslib_0.2.5.1

Single-cell and spatial transcriptomics data analysis with Seurat in R

Yutong Wang, Graduate Group in Biostatistics, University of California, Berkeley

11/10/2021

1 Load packages

2 Seurat object

2.1 Download one 10X Genomics Visium dataset and load it into `Seurat`

2.2 Explore the `Seurat` object

2.2.1 Dimension of the active assay

2.2.2 Feature and sample names

2.2.3 Sample-level metadata

2.2.4 objects(e.g. `Assay`) together with feature-level metadata

3 Analysis pipeline in Seurat

3.1 Preprocess data

3.1.1 Quality control

3.1.2 Normalization

3.2 Downstream tasks (dimension reduction, data visualization, cluster annotation, differential expression)

3.3 Identify spatially variable genes

3.3.1 Moran’s I

3.3.2 marker-variogram

3.3.3 Other R packages to identify spatially variable genes

4 Convert data formats between R and Python

5 Acknowledgements

Single-cell and spatial transcriptomics data analysis with Seurat in R

Yutong Wang, Graduate Group in Biostatistics, University of California, Berkeley

11/10/2021

1 Load packages

2 Seurat object

2.1 Download one 10X Genomics Visium dataset and load it into Seurat

2.2 Explore the Seurat object

2.2.1 Dimension of the active assay

2.2.2 Feature and sample names

2.2.3 Sample-level metadata

2.2.4 objects(e.g. Assay) together with feature-level metadata

3 Analysis pipeline in Seurat

3.1 Preprocess data

3.1.1 Quality control

3.1.2 Normalization

3.2 Downstream tasks (dimension reduction, data visualization, cluster annotation, differential expression)

3.3 Identify spatially variable genes

3.3.1 Moran’s I

3.3.2 marker-variogram

3.3.3 Other R packages to identify spatially variable genes

4 Convert data formats between R and Python

5 Acknowledgements

2.1 Download one 10X Genomics Visium dataset and load it into `Seurat`

2.2 Explore the `Seurat` object

2.2.4 objects(e.g. `Assay`) together with feature-level metadata