Build SummarizedExperiment using multiple sfaira entries
Source: R/dataset.R
You can apply different filters to the whole data zoo of sfaira; the resulting single-cell datasets will be combined into a single dataset, which you can use for simulation. Note: only datasets in sfaira with annotation are considered!
Usage
dataset_sfaira_multiple(
organisms = NULL,
tissues = NULL,
assays = NULL,
sfaira_setup,
name = "SimBu_dataset",
spike_in_col = NULL,
additional_cols = NULL,
filter_genes = TRUE,
variance_cutoff = 0,
type_abundance_cutoff = 0,
scale_tpm = TRUE
)
Arguments
- organisms
(mandatory) list of organisms (only human and mouse available)
- tissues
(mandatory) list of tissues
- assays
(mandatory) list of assays
- sfaira_setup
(mandatory) the sfaira setup; given by setup_sfaira
- name
name of the dataset; will be used for new unique IDs of cells
- spike_in_col
which column in annotation contains information on spike_in counts, which can be used to re-scale counts
- additional_cols
list of column names in annotation that should also be stored in the dataset object
- filter_genes
boolean, if TRUE, removes all genes with 0 expression over all samples & genes with variance below variance_cutoff
- variance_cutoff
numeric, is only applied if filter_genes is TRUE: removes all genes with variance below the chosen cutoff
- type_abundance_cutoff
numeric, removes all cells whose cell type appears less often than the given value. This removes low-abundance cell types
- scale_tpm
boolean, if TRUE (default) the cells in tpm_matrix will be scaled to sum up to 1e6
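The filtering and scaling arguments above can be illustrated on a toy matrix. The following is a minimal base R sketch of what the descriptions say, not SimBu's internal implementation: the helper names (filter_genes_sketch, filter_types_sketch, scale_tpm_sketch) and the toy data are hypothetical, and it assumes that type_abundance_cutoff is an absolute number of cells per cell type.

```r
# Toy count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c(0, 0, 0, 0,    # geneA: zero expression in every cell
    5, 5, 5, 5,    # geneB: expressed, but zero variance
    1, 9, 2, 8,    # geneC: expressed and variable
    10, 0, 3, 7),  # geneD: expressed and variable
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("gene", LETTERS[1:4]), paste0("cell", 1:4))
)
cell_type <- c("T cell", "T cell", "B cell", "T cell")

# Hypothetical helper following the description of filter_genes and
# variance_cutoff: drop genes with zero total expression and genes whose
# variance across cells lies below the cutoff.
filter_genes_sketch <- function(m, variance_cutoff = 0) {
  expressed <- rowSums(m) > 0
  gene_var <- apply(m, 1, var)
  m[expressed & gene_var >= variance_cutoff, , drop = FALSE]
}

# Hypothetical helper following the description of type_abundance_cutoff.
# Assumption: the cutoff is compared against the number of cells per type.
filter_types_sketch <- function(m, cell_type, type_abundance_cutoff = 0) {
  n_per_type <- table(cell_type)
  keep <- as.vector(n_per_type[cell_type]) >= type_abundance_cutoff
  list(counts = m[, keep, drop = FALSE], cell_type = cell_type[keep])
}

# Hypothetical helper following the description of scale_tpm: rescale every
# cell (column) so that its values sum to 1e6. Cells with a total of zero
# would yield NaN and are not handled here.
scale_tpm_sketch <- function(m) {
  sweep(m, 2, colSums(m), "/") * 1e6
}

filtered <- filter_genes_sketch(counts, variance_cutoff = 1)  # drops geneA and geneB
typed <- filter_types_sketch(filtered, cell_type, type_abundance_cutoff = 2)  # drops cell3
tpm <- scale_tpm_sketch(typed$counts)
colSums(tpm)
```

With the default variance_cutoff = 0, this sketch would keep geneB, because its variance of 0 is not below the cutoff; only the all-zero geneA would be removed.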
Examples
# \donttest{
setup_list <- SimBu::setup_sfaira(tempdir())
#> AttributeError: module 'pandas.arrays' has no attribute 'ArrowStringArray'
ds_human_lung <- SimBu::dataset_sfaira_multiple(
sfaira_setup = setup_list,
organisms = "Homo sapiens",
tissues = "lung parenchyma",
assays = "10x 3' v2",
name = "human_lung"
)
#> Warning: You need to setup sfaira first; please use setup_sfaira() to do so.
# }