Generates a reference profile based on single-cell data. Learns a transformation of bulk expression based on observed single-cell proportions and performs NNLS regression on these transformed values to estimate cell proportions.

deconvolute_bisque(
  bulk_gene_expression,
  single_cell_object,
  cell_type_annotations,
  batch_ids,
  markers = NULL,
  cell_types = "cellType",
  subject_names = "batchId",
  use_overlap = FALSE,
  verbose = FALSE,
  old_cpm = TRUE
)

Arguments

bulk_gene_expression

A matrix of bulk data. Rows are genes, columns are samples. Row and column names need to be set.

single_cell_object

A matrix with the single-cell data. Rows are genes, columns are samples (cells). Row and column names need to be set.

cell_type_annotations

A vector of the cell type annotations. Has to be in the same order as the samples in single_cell_object.

batch_ids

A vector of the ids of the samples or individuals.
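The input layout described above can be sketched with toy data. This is a minimal illustration only: the gene, cell, and donor names are invented, and it assumes batch_ids holds one entry per cell, in the same order as the columns of single_cell_object.

```r
set.seed(1)

# Hypothetical names, chosen only for illustration
genes <- paste0("gene", 1:5)
cells <- paste0("cell", 1:6)
bulk_samples <- paste0("bulk", 1:3)

# Single-cell counts: genes in rows, cells in columns, both dimensions named
single_cell_object <- matrix(
  rpois(length(genes) * length(cells), lambda = 5),
  nrow = length(genes),
  dimnames = list(genes, cells)
)

# Bulk counts: genes in rows, bulk samples in columns, both dimensions named
bulk_gene_expression <- matrix(
  rpois(length(genes) * length(bulk_samples), lambda = 50),
  nrow = length(genes),
  dimnames = list(genes, bulk_samples)
)

# One cell type label per cell, in column order of single_cell_object
cell_type_annotations <- c("T", "B", "T", "B", "T", "B")

# Assumption: one individual id per cell, in the same order
batch_ids <- c("donor1", "donor1", "donor1", "donor2", "donor2", "donor2")

# Basic consistency checks before calling deconvolute_bisque()
stopifnot(
  length(cell_type_annotations) == ncol(single_cell_object),
  length(batch_ids) == ncol(single_cell_object),
  !is.null(rownames(bulk_gene_expression)),
  !is.null(colnames(bulk_gene_expression))
)
```

Checks like these catch annotation vectors that have drifted out of step with the expression matrix before the deconvolution is run.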

markers

Structure, such as character vector, containing marker genes to be used in decomposition. unique(unlist(markers)) should return a simple vector containing each gene name. If no argument or NULL provided, the method will use all available genes for decomposition.
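As a small sketch of the flattening described above, a named list of marker genes per cell type collapses to a single vector of unique gene names. The list layout and the gene symbols are illustrative assumptions; any structure for which unique(unlist(markers)) yields the gene names should behave the same way.

```r
# Hypothetical marker structure: one character vector per group
markers <- list(
  T_cell = c("CD3D", "CD3E"),
  B_cell = c("CD79A", "MS4A1"),
  shared = c("CD3D", "PTPRC")  # CD3D appears twice across the list
)

# Flatten to a simple vector with each gene name listed once
marker_genes <- unique(unlist(markers))
print(marker_genes)
```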

cell_types

Character string. Name of the phenoData attribute in sc.eset (the single-cell ExpressionSet used by BisqueRNA) that holds the cell type label for each cell.

subject_names

Character string. Name of the phenoData attribute in sc.eset (the single-cell ExpressionSet used by BisqueRNA) that holds the individual label for each cell.

use_overlap

Boolean. Whether to use and expect overlapping samples in decomposition.

verbose

Boolean. Whether to print output to the console.

old_cpm

Boolean. Prior to version 1.0.4 (updated in July 2020), BisqueRNA converted counts to CPM after subsetting the marker genes. GitHub user randel pointed out that the order of these operations should be switched. Thanks randel! This option is provided for replication of older BisqueRNA but should be enabled, especially for small marker gene sets. We briefly tested this change on the cortex and adipose datasets. Within each cell type, the estimates from the original and the new order of operations have an average correlation of 0.87 for the cortex and 0.84 for the adipose.
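To see why the order of operations matters, the following sketch compares the two orders on a toy count matrix. The to_cpm helper and the gene names are hypothetical and only mimic a counts-per-million conversion; this is not BisqueRNA's own code.

```r
# Toy counts: 3 genes x 2 cells (filled column by column)
counts <- matrix(
  c(10, 90, 900,
    20, 80, 100),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)
marker_genes <- c("geneA", "geneB")

# Hypothetical helper: scale each column to sum to one million
to_cpm <- function(m) sweep(m, 2, colSums(m), "/") * 1e6

# Pre-1.0.4 order: subset to markers first, then convert to CPM
subset_then_cpm <- to_cpm(counts[marker_genes, , drop = FALSE])

# Switched order: convert to CPM on all genes, then subset to markers
cpm_then_subset <- to_cpm(counts)[marker_genes, , drop = FALSE]

print(subset_then_cpm)
print(cpm_then_subset)
```

With only two marker genes, the library size used for normalization changes drastically once the non-marker gene is dropped, which is why the difference is most pronounced for small marker sets.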

Value

A list including:

bulk_props

A matrix of cell type proportion estimates with cell types as rows and individuals as columns.

sc_props

A matrix of cell type proportions estimated directly from counting single-cell data.
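Counting-based proportions of this kind can be sketched in base R as follows. The labels are invented, and the orientation (cell types as rows, individuals as columns) is an assumption mirroring bulk_props; this is not the package's internal code.

```r
# Hypothetical per-cell labels, in matching order
cell_type_annotations <- c("T", "B", "T", "B", "B", "T")
batch_ids <- c("donor1", "donor1", "donor1", "donor2", "donor2", "donor2")

# Count cells per cell type (rows) and individual (columns)
cell_counts <- table(cell_type = cell_type_annotations, individual = batch_ids)

# Divide each column by its total so proportions sum to 1 per individual
sc_props <- prop.table(cell_counts, margin = 2)
print(sc_props)
```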

rnorm

Euclidean norm of the residuals for each individual's proportion estimates.
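For one individual, a residual norm of this kind is the length of the difference between the observed expression and the expression reconstructed from the estimated proportions. The sketch below uses a made-up reference matrix and hand-picked coefficients rather than an actual NNLS fit.

```r
# Hypothetical reference profile: genes x cell types
reference <- matrix(
  c(1, 0, 2,
    0, 1, 1),
  nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("T", "B"))
)

# Hypothetical expression for one individual and assumed proportion estimates
bulk <- c(g1 = 0.6, g2 = 0.4, g3 = 1.5)
estimates <- c(T = 0.6, B = 0.4)

# Expression reconstructed from the estimates, and the residuals
fitted <- drop(reference %*% estimates)
residuals <- bulk - fitted

# Euclidean norm of the residual vector
rnorm_value <- sqrt(sum(residuals^2))
print(rnorm_value)
```

A smaller value indicates that the estimated proportions reproduce that individual's expression more closely.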

genes_used

A vector of genes used in decomposition.

transformed_bulk

The transformed bulk expression used for decomposition. These values are generated by applying a linear transformation to the CPM expression.