R/bmind_func.r
get_prior.Rd
It calculates prior CTS profile and covariance matrix from single-cell data. The output can serve as hyper-parameters in bMIND. Only genes with positive definite covariance matrix are outputted.
get_prior(sc, sample = NULL, cell_type = NULL, meta_sc = NULL, filter_pd = T)
single-cell count matrix, gene x cell.
variable for sample ID. It is used to generate pseudo-bulk data and calculate cell-type covariance matrix across samples for each gene.
variable for cell type labels.
data.frame for meta of cells (cell x features, including columns `sample` (sample ID), `cell_type`). Not required if sample and cell_type are provided.
option to filer out genes without positive-definite cell-type covariance matrix, default is TRUE.
A list containing
CTS profile matrix (gene x cell type), in log2(CPM + 1) scale.
CTS covariance matrix (gene x cell type x cell type).
suppressMessages(library(scRNAseq))
suppressMessages(dar <- DarmanisBrainData())
# focus on adult cells with non-hybrid lables
dar = dar[, !dar$cell.type %in% c('fetal_quiescent', 'fetal_replicating', 'hybrid')]
dar = dar[rowSums(counts(dar))!=0,]
table(dar$cell.type, dar$experiment_sample_name)
#>
#> AB_S1 AB_S11 AB_S2 AB_S3 AB_S4 AB_S5 AB_S7 AB_S8
#> OPC 0 0 0 0 0 0 0 18
#> astrocytes 11 7 2 0 38 0 3 1
#> endothelial 2 1 0 1 0 16 0 0
#> microglia 5 1 2 0 0 1 0 7
#> neurons 6 50 1 2 19 18 35 0
#> oligodendrocytes 0 1 0 0 1 9 12 15
# the error is caused by OPC is only from one sample
prior = get_prior(sc = counts(dar), sample = dar$experiment_sample_name, cell_type = dar$cell.type)
#> Error in if (!is.symmetric.matrix(x)) stop("argument x is not a symmetric matrix"): missing value where TRUE/FALSE needed
# drop cells only from one sample
dar = dar[, !dar$cell.type %in% c('OPC')]
dar = dar[rowSums(counts(dar))!=0,]
prior = get_prior(sc = counts(dar), sample = dar$experiment_sample_name, cell_type = dar$cell.type)
#> [1] "20976 genes are filtered out because cell-type covariance matrix is not positive-definite (PD);"
#> [1] "The filtering can be disabled by setting filter_pd = FALSE. Note the prior cell-type covariance matrix for each gene is required to be PD."
dim(prior$profile)
#> [1] 328 5
head(prior$profile)
#> astrocytes endothelial microglia neurons oligodendrocytes
#> ABL1 3.9444509 6.685126 1.425422 4.350492 5.68671293
#> ACBD5 6.5678615 3.946924 4.355075 5.738703 6.81941440
#> ADARB1 4.3358691 5.254009 1.505483 4.783873 0.05764407
#> AFTPH 5.1665212 7.679581 7.632321 7.114015 4.65149718
#> AGPAT4 0.5202005 2.990566 1.051626 6.240137 6.80423368
#> AGT 10.0188773 6.699810 2.403666 4.833541 4.00746237
dim(prior$covariance)
#> [1] 328 5 5
prior$covariance[1,,]
#> astrocytes endothelial microglia neurons oligodendrocytes
#> astrocytes 5.027003 -2.017456 -1.0778568 -2.120282 -2.0876742
#> endothelial -2.017456 19.984533 0.0000000 7.774008 -10.2333931
#> microglia -1.077857 0.000000 0.9088396 0.000000 -0.2422067
#> neurons -2.120282 7.774008 0.0000000 5.120625 -3.3872491
#> oligodendrocytes -2.087674 -10.233393 -0.2422067 -3.387249 10.5114827
# Note bMIND requires positive-definite prior covariance matrix
prior_all = get_prior(sc = counts(dar), sample = dar$experiment_sample_name, cell_type = dar$cell.type, filter_pd = F)