Make a random expression set from a single-cell dataset

make_bulk_eset(eset, cell_fractions, n_cells = 500, combine = mean)

Arguments

eset: Biobase::ExpressionSet with a cell_type column in pData.
cell_fractions: n x n_cell_types dataframe with the fraction for each sample in each row.
n_cells: number of single cells to use in each sample
combine: callback function used to aggregate the counts.

Value

matrix with genes in rows and samples in columns. The column sum is scaled to 1 million (TPM).

Examples

suppressPackageStartupMessages(library(Biobase))
suppressPackageStartupMessages(library(tibble))

# generate toy matrix with three marker genes and three cell types
expr <- matrix(c(
  rep(c(1, 0, 0), 300),
  rep(c(0, 1, 0), 300),
  rep(c(0, 0, 1), 300)
), nrow = 3)

# generate a featureData and phenoData data-frame.
# row names must be consistent between expr and featureData.
gene_names <- c("CD8A", "CD4", "CD19")
rownames(expr) <- gene_names
cell_types <- c(rep("T cell CD8+", 300), rep("T cell CD4+", 300), rep("B cell", 300))
pdata <- data.frame(cell_type = cell_types)
fdata <- data.frame(gene_symbol = gene_names)
rownames(fdata) <- gene_names

# tie expr, fdata and pdata together in expression set
eset <- ExpressionSet(expr,
  phenoData = as(pdata, "AnnotatedDataFrame"),
  featureData = as(fdata, "AnnotatedDataFrame")
)

# create data frame with cell fractions (one row per sample)
desired_cell_fractions <- tibble(
  "T cell CD8+" = c(0.1, 0.2, 0.3),
  "T cell CD4+" = c(0.9, 0.7, 0.5),
  "B cell" = c(0, 0.1, 0.2)
)
new_eset <- make_bulk_eset(eset, desired_cell_fractions, n_cells = 500)
#> New names:
#> • `value` -> `value...1`
#> • `value` -> `value...2`
#> • `value` -> `value...3`
exprs(new_eset)
#>      value...1 value...2 value...3
#> CD8A     1e+05     2e+05     3e+05
#> CD4      9e+05     7e+05     5e+05
#> CD19     0e+00     1e+05     2e+05

Make a random expression set from a single-cell dataset

Arguments

Value

See also

Examples