Pouyabahar Delaram, Andrews Tallulah, Bader Gary D
Department of Molecular Genetics, University of Toronto, Toronto, Ontario, Canada.
The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada.
Res Sq. 2024 Aug 7:rs.3.rs-4819117. doi: 10.21203/rs.3.rs-4819117/v1.
Single-cell RNA sequencing (scRNA-seq) maps gene expression heterogeneity within a tissue. However, identifying biological signals in these data is challenging because of confounding technical factors, sparsity, and high dimensionality. Data factorization methods address this by separating the data into distinct signals, such as gene expression programs, but the resulting factors must be interpreted manually. We developed Single-Cell Interpretable Residual Decomposition (sciRED) to improve the interpretation of scRNA-seq factor analysis. sciRED removes known confounding effects, uses rotations to improve factor interpretability, maps factors to known covariates, identifies unexplained factors that may capture hidden biological phenomena, and determines the genes and biological processes represented by the resulting factors. We apply sciRED to multiple scRNA-seq datasets and identify sex-specific variation in a kidney map, discern strong and weak immune stimulation signals in a peripheral blood mononuclear cell (PBMC) dataset, reduce ambient RNA contamination in a rat liver atlas to help identify strain variation, and reveal rare cell type signatures and anatomical zonation gene programs in a healthy human liver map. These applications demonstrate that sciRED is useful for characterizing diverse biological signals within scRNA-seq data.
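The workflow summarized in the abstract (remove known confounders, factorize the residuals, rotate the factors, then match factors to known covariates) can be illustrated with a small NumPy sketch on simulated data. This is not the sciRED implementation or its API. It assumes ordinary least squares for confounder removal, PCA via SVD for the factorization, varimax as the rotation, and a plain correlation for factor-covariate matching; the package may use different models at each step. All names (`remove_confounders`, `varimax`, the simulated `batch` and `sex` covariates) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_genes, k = 300, 50, 5

# Simulated covariates: "batch" is a known confounder, "sex" is a known
# biological covariate that we hope to recover as a factor.
batch = rng.integers(0, 2, n_cells).astype(float)
sex = rng.integers(0, 2, n_cells).astype(float)

# Simulated expression: noise + batch effect on all genes + sex effect on 10 genes.
batch_loading = rng.normal(0.0, 2.0, n_genes)
sex_loading = np.zeros(n_genes)
sex_loading[:10] = 3.0
Y = (rng.normal(size=(n_cells, n_genes))
     + np.outer(batch, batch_loading)
     + np.outer(sex, sex_loading))


def remove_confounders(Y, covariates):
    """Regress covariates (plus an intercept) out of every gene; return residuals."""
    design = np.column_stack([np.ones(len(Y)), covariates])
    beta, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return Y - design @ beta


def varimax(loadings, max_iter=50, tol=1e-6):
    """Orthogonal varimax rotation; returns rotated loadings and the rotation matrix."""
    p, m = loadings.shape
    R = np.eye(m)
    d = 0.0
    for _ in range(max_iter):
        d_old = d
        L = loadings @ R
        grad = loadings.T @ (L**3 - (1.0 / p) * L @ np.diag(np.sum(L**2, axis=0)))
        u, s, vt = np.linalg.svd(grad)
        R = u @ vt
        d = s.sum()
        if d_old != 0.0 and d / d_old < 1.0 + tol:
            break
    return loadings @ R, R


# Step 1: remove the known confounder.
resid = remove_confounders(Y, batch)

# Step 2: factorize the residuals (PCA via SVD), keeping k factors.
U, S, Vt = np.linalg.svd(resid, full_matrices=False)
scores = U[:, :k] * S[:k]      # cells x k
loadings = Vt[:k].T            # genes x k

# Step 3: rotate loadings toward a sparser, more interpretable structure.
rot_loadings, R = varimax(loadings)
rot_scores = scores @ R        # the same rotation keeps the reconstruction unchanged

# Step 4: match factors to the known covariate by absolute correlation.
assoc = np.array([abs(np.corrcoef(rot_scores[:, j], sex)[0, 1]) for j in range(k)])
best = int(np.argmax(assoc))
top_genes = np.argsort(-np.abs(rot_loadings[:, best]))[:10]
print("factor most associated with sex:", best)
print("top-loaded genes for that factor:", sorted(top_genes.tolist()))
```

Factors with a weak association to every known covariate would be the candidates for unexplained signals in this toy setup. The correlation score used here is only a stand-in for whatever association measure an actual analysis would use.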