Workflow Utilities

spacedeconv offers a variety of workflow helper functions that streamline the overall analysis process. The following sections give an overview of the available functions:

preprocess
normalize
print_info
available_results
aggregate_results
addCustomAnnotation
annotate_spots
scale_cell_counts
subsetSCE
subsetSPE

library(spacedeconv)

## → checking spacedeconv environment and dependencies

1. `preprocess`

The preprocess function prepares single-cell or spatial datasets for deconvolution. It cuts off observations with low or high UMI counts, removes noisy expression and performs additional checks to streamline the deconvolution analysis. The function takes a SingleCellExperiment, AnnData or Seurat object and returns a processed SingleCellExperiment. The min_umi and max_umi parameters can be set to improve data quality, and the assay can be selected with the assay parameter. Additionally, mitochondrial genes can be removed by setting remove_mito=TRUE.

data("single_cell_data_3")
sce <- spacedeconv::preprocess(single_cell_data_3, min_umi = 500, assay = "counts", remove_mito = TRUE)
## ── spacedeconv ─────────────────────────────────────────────────────────────────
## ℹ testing parameter
## ✔ parameter OK [64ms]
## 
## ℹ Removing 8 observations with umi count below threshold
## ✔ Removed 8 observations with umi count below threshold [859ms]
## 
## ℹ Removing 5862 variables with all zero expression
## ✔ Removed 5862 variables with all zero expression [828ms]
## 
## ℹ Removing 13 mitochondria genes
## ✔ Removed 13 mitochondria genes [792ms]
## 
## ℹ Checking for ENSEMBL Identifiers
## ! Warning: ENSEMBL identifiers detected in gene names
## ℹ Consider using Gene Names for first-generation deconvolution tools
## ✔ Finished Preprocessing [7ms]
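
The filtering steps reported in the log above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy matrix, the threshold of 200 and the "MT-" gene name prefix are assumptions made for illustration only.

```r
# Toy count matrix (hypothetical data): genes in rows, cells in columns
counts <- matrix(
  c(300, 400, 100, 250,
      0,   0,   0,   0,
     50, 100,   5,  40,
     10,  20,   1,  60),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("GeneA", "GeneB", "MT-CO1", "GeneC"),
                  c("cell1", "cell2", "cell3", "cell4"))
)

min_umi <- 200  # assumed threshold for this toy example

# 1. Drop cells whose total UMI count is below the threshold
filtered <- counts[, colSums(counts) >= min_umi, drop = FALSE]

# 2. Drop genes that have zero expression in all remaining cells
filtered <- filtered[rowSums(filtered) > 0, , drop = FALSE]

# 3. Drop mitochondrial genes (assumes gene symbols starting with "MT-")
filtered <- filtered[!grepl("^MT-", rownames(filtered)), , drop = FALSE]

dim(filtered)
```

In this toy example one cell, one all-zero gene and one mitochondrial gene are removed, in the same order as the steps in the log.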

2. `normalize`

You can scale and normalize your single-cell or spatial data by calling the normalize function. The function takes a method parameter, which can be set to “cpm” or “logcpm”. The normalized data is stored as an additional assay in the object.

sce <- spacedeconv::normalize(sce, method = "cpm", assay = "counts")
## ── spacedeconv ─────────────────────────────────────────────────────────────────
## ℹ testing parameter
## ✔ parameter OK [12ms]
## 
## ℹ Normalizing using cpm
## Warning in asMethod(object): sparse->dense coercion: allocating vector of size
## 1.4 GiB
## ✔ Finished normalization using cpm [4.5s]
## 
## ℹ Please note the normalization is stored in an additional assay
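
As a rough illustration of what counts-per-million scaling does, the following base R sketch normalizes a toy matrix. It is not the package's code. The data are made up, and the log transform shown, log2(cpm + 1), is an assumption; the transform used by the logcpm method may differ.

```r
# Toy count matrix (hypothetical data): genes in rows, cells in columns
counts <- matrix(
  c(10, 30, 60,
     0,  5,  5),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("cellA", "cellB"))
)

# CPM: divide each column by its library size, then scale to one million
cpm <- sweep(counts, 2, colSums(counts), "/") * 1e6

# Assumed log transform for illustration only
logcpm <- log2(cpm + 1)

colSums(cpm)
```

After CPM scaling every column sums to one million, which makes cells with different sequencing depths comparable.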

3. `print_info`

You can obtain additional info about your dataset by calling print_info.

print_info(sce)
## 
## ── Single Cell
## Assays: "counts" and "cpm"
## Genes: 23858
## → without expression: 0 (0%)
## Cells: 7978
## → without expression: 0 (0%)
## Umi count range: 447 - 74244
## ✔ Rownames set
## ✔ Colnames set

4. `available_results`

You can check which deconvolution results and additional annotations are available in your data by calling available_results. You can set the method parameter to the name of a deconvolution tool to further filter the results if many quantifications were performed.

# "deconv" contains DWLS results
available_results(deconv)

## [1] "dwls_B.cells"           "dwls_CAFs"              "dwls_Cancer.Epithelial"
## [4] "dwls_Endothelial"       "dwls_Myeloid"           "dwls_Normal.Epithelial"
## [7] "dwls_Plasmablasts"      "dwls_PVL"               "dwls_T.cells"
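
The result names above carry the method name as a prefix. Filtering by method can therefore be pictured as a prefix match, as in this base R sketch. The column names are hypothetical and the matching rule is an assumption, not the package's documented behavior.

```r
# Hypothetical column names of an annotated object
col_names <- c("in_tissue", "array_row",
               "dwls_B.cells", "dwls_T.cells", "rctd_B.cells")

method <- "dwls"

# Keep only the columns that start with "<method>_"
dwls_results <- col_names[startsWith(col_names, paste0(method, "_"))]
dwls_results
```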

5. `aggregate_results`

You can aggregate fine-grained deconvolution results into a single value by passing a vector of deconvolution result names to the cell_types parameter. You can additionally set a name for the aggregated result, and you have the option to remove the original fine-grained columns and keep only the aggregation.

aggregate_results(deconv, cell_types = c("dwls_Cancer.Epithelial", "dwls_Normal.Epithelial"), name = "dwls_Epithelial", remove = TRUE)
## ── spacedeconv ─────────────────────────────────────────────────────────────────
## ℹ testing parameter
## ✔ parameter OK [4ms]
## 
## ℹ Aggregating cell types
## ✔ Aggregated cell types [6ms]
## 
## class: SpatialExperiment 
## dim: 23542 1185 
## metadata(0):
## assays(2): counts cpm
## rownames(23542): AL627309.1 AL627309.5 ... AC007325.4 AC007325.2
## rowData names(2): symbol ensembl
## colnames(1185): AAACAATCTACTAGCA-1 AAACACCAATAACTGC-1 ...
##   TTGTTTCATTAGTCTA-1 TTGTTTGTGTAAATTC-1
## colData names(12): in_tissue array_row ... dwls_T.cells dwls_Epithelial
## reducedDimNames(0):
## mainExpName: NULL
## altExpNames(0):
## spatialCoords names(2) : pxl_col_in_fullres pxl_row_in_fullres
## imgData names(4): sample_id image_id data scaleFactor
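
Conceptually, the aggregation can be pictured as combining several columns into one. The base R sketch below assumes that aggregation means summing the fractions per spot, which the text above does not state explicitly. The data frame and its values are hypothetical.

```r
# Hypothetical per-spot deconvolution fractions
col_data <- data.frame(
  dwls_Cancer.Epithelial = c(0.20, 0.05),
  dwls_Normal.Epithelial = c(0.10, 0.15),
  dwls_T.cells           = c(0.30, 0.40),
  row.names = c("spot1", "spot2")
)

cell_types <- c("dwls_Cancer.Epithelial", "dwls_Normal.Epithelial")

# Assumed aggregation rule: sum the selected columns for each spot
col_data$dwls_Epithelial <- unname(rowSums(col_data[, cell_types]))

# Equivalent of remove = TRUE: drop the fine-grained columns
col_data <- col_data[, setdiff(names(col_data), cell_types), drop = FALSE]

col_data
```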

6. `addCustomAnnotation`

This function adds a custom annotation vector to a SpatialExperiment object.

# new_annotation is a vector containing custom annotation for each spot
spe <- addCustomAnnotation(spe, columnName = "ManualAnnotation", values = new_annotation)

7. `annotate_spots`

This function annotates spots with TRUE / FALSE, which is useful if you want to classify a specific subgroup of spots. It takes a list of spots that should be classified as TRUE and sets all other spots to FALSE.

# spots is a list of spot names.
spe <- annotate_spots(spe, spots, value_pos = TRUE, value_neg = FALSE, name = "customAnnotation")
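
The labeling logic can be sketched in base R as a membership test. This is an illustration under assumptions, not the package's implementation, and the spot names are only examples.

```r
# Example spot barcodes; "spots" is the subgroup to flag
all_spots <- c("AAACAATCTACTAGCA-1", "AAACACCAATAACTGC-1",
               "TTGTTTCATTAGTCTA-1", "TTGTTTGTGTAAATTC-1")
spots <- c("AAACACCAATAACTGC-1", "TTGTTTGTGTAAATTC-1")

value_pos <- TRUE
value_neg <- FALSE

# Spots in the list get value_pos, all others get value_neg
custom_annotation <- ifelse(all_spots %in% spots, value_pos, value_neg)
names(custom_annotation) <- all_spots
custom_annotation
```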

8. `scale_cell_counts`

Most deconvolution tools compute relative cell fractions for spots. If you have cell counts for each spot, you can scale the relative values to absolute cell counts using this function. The input parameters are the name of the column that should be scaled (value) and a vector of absolute cell counts for each spot (cell_counts). You can also set a name for the result with resName.

# cell_counts_per_spot contains spot level absolute cell counts
spe_absolute <- scale_cell_counts(spe, value = "dwls_B.cells", cell_counts = cell_counts_per_spot, resName = "BCellsAbsolute")
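
The arithmetic behind this scaling can be shown with a small base R sketch. It assumes the absolute value is the fraction multiplied by the total number of cells in the spot. The values are hypothetical.

```r
# Hypothetical relative B cell fractions and total cell counts per spot
b_cell_fraction      <- c(spot1 = 0.10, spot2 = 0.25, spot3 = 0.00)
cell_counts_per_spot <- c(spot1 = 20,   spot2 = 8,    spot3 = 15)

# Assumed scaling rule: absolute count = fraction * total cells in the spot
b_cells_absolute <- b_cell_fraction * cell_counts_per_spot
b_cells_absolute
```

A spot with a fraction of 0.10 and 20 cells would thus be assigned 2 B cells.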

9. `subsetSCE`

To reduce the resource requirements of the deconvolution computation, you can shrink your scRNA-seq reference by subsetting. The function requires your input sce object and the name of the column containing the cell-type annotation (cell_type_col). You can specify the subsetting scenario with the scenario parameter as either “mirror” or “even”. The mirror scenario keeps the same cell-type proportions as in the input data but reduces the overall cell number. The even scenario selects the same number of cells for each cell type. Specify the number of cells you want after subsetting using the ncells parameter. If not enough cells are available for a cell type to match the number required by the scenario, you can set the notEnough parameter to “asis” to keep all remaining cells or to “remove” to drop the cell type completely.

subset <- subsetSCE(sce, cell_type_col = "celltype_major", scenario = "mirror", ncells = 500)
## ── spacedeconv ─────────────────────────────────────────────────────────────────
## ℹ testing parameter
## ℹ Set seed to 12345
## ✔ parameter OK [10ms]
## 
## ℹ extracting up to 500 cells
## ✔ extracting up to 500 cells [60ms]
## 
## ℹ extracted 501 cells
## ✔ extracted 501 cells [12ms]
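
The difference between the two scenarios can be sketched in base R as follows. This is a simplified illustration, not the package's algorithm: the cell-type labels, the rounding rule and the helper sample_per_type are assumptions. The “asis” behavior is approximated by taking all available cells when a type has too few.

```r
set.seed(12345)

# Hypothetical annotation: 600 T cells, 300 B cells, 100 myeloid cells
cell_types <- rep(c("T.cells", "B.cells", "Myeloid"), times = c(600, 300, 100))
ncells <- 450

type_counts <- table(cell_types)
types <- names(type_counts)

# "mirror": keep the original proportions (rounded per cell type)
n_mirror <- setNames(
  round(as.vector(type_counts) * ncells / length(cell_types)), types
)

# "even": request the same number of cells for every cell type
n_even <- setNames(rep(floor(ncells / length(types)), length(types)), types)

# Hypothetical helper: sample the requested number of cells per type,
# falling back to all available cells if there are too few ("asis")
sample_per_type <- function(labels, n_per_type) {
  unlist(lapply(names(n_per_type), function(ct) {
    idx <- which(labels == ct)
    idx[sample.int(length(idx), min(n_per_type[[ct]], length(idx)))]
  }))
}

mirror_idx <- sample_per_type(cell_types, n_mirror)
even_idx   <- sample_per_type(cell_types, n_even)

table(cell_types[mirror_idx])
table(cell_types[even_idx])
```

In this sketch the even scenario requests 150 cells per type but only 100 myeloid cells exist, so the subset ends up smaller than ncells. Per-type rounding in a mirror-style selection can likewise shift the total slightly, which may be why a run can report a cell number close to, but not exactly, the requested value.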

10. `subsetSPE`

If you do not want to work with the full spatial data, you can opt for a subset of the spatial slide. This function takes a colRange and a rowRange vector containing the pixel coordinates of the fullres image and cuts the corresponding subset out of the provided spatial data.

subset <- subsetSPE(spe, colRange = c(0, 1000), rowRange = c(0, 1000))
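
The selection can be pictured as a rectangular filter on the spot coordinates. The base R sketch below uses hypothetical coordinates and assumes that both range bounds are inclusive. It is not the package's implementation.

```r
# Hypothetical spot coordinates in full-resolution pixel space
coords <- data.frame(
  pxl_col_in_fullres = c(120, 950, 1500, 400),
  pxl_row_in_fullres = c(300, 1200, 800, 999),
  row.names = c("spotA", "spotB", "spotC", "spotD")
)

colRange <- c(0, 1000)
rowRange <- c(0, 1000)

# Keep spots whose coordinates fall inside both ranges (bounds assumed inclusive)
keep <- coords$pxl_col_in_fullres >= colRange[1] &
  coords$pxl_col_in_fullres <= colRange[2] &
  coords$pxl_row_in_fullres >= rowRange[1] &
  coords$pxl_row_in_fullres <= rowRange[2]

kept_spots <- rownames(coords)[keep]
kept_spots
```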
