Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, USA.
Department of Statistics, Columbia University, New York, NY, USA.
Nat Biotechnol. 2024 Jul;42(7):1084-1095. doi: 10.1038/s41587-023-01940-3. Epub 2023 Sep 21.
Factor analysis decomposes single-cell gene expression data into a minimal set of gene programs that correspond to processes executed by cells in a sample. However, matrix factorization methods are prone to technical artifacts and poor factor interpretability. We address these concerns with Spectra, an algorithm that combines user-provided gene programs with the detection of novel programs that together best explain expression covariation. Spectra incorporates existing gene sets and cell-type labels as prior biological information, explicitly models cell type, and represents input gene sets as a gene-gene knowledge graph, using a penalty function to guide factorization toward the input graph. We show that Spectra outperforms existing approaches in challenging tumor immune contexts: it finds factors that change under immune checkpoint therapy, disentangles the highly correlated features of CD8 T cell tumor reactivity and exhaustion, finds a program that explains continuous macrophage state changes under therapy, and identifies cell-type-specific immune metabolic programs.
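The idea of steering a factorization toward a gene-gene graph can be illustrated with a much simpler stand-in. The sketch below is not Spectra's model or its API: it uses graph-regularized non-negative matrix factorization with a Laplacian smoothness penalty and multiplicative updates, and it ignores cell-type labels entirely. The function names (`gene_set_graph`, `graph_regularized_nmf`), the penalty weight `lam`, and the toy data are all assumptions made for illustration.

```python
import numpy as np


def gene_set_graph(gene_sets, n_genes):
    """Build an adjacency matrix linking every pair of genes that share a set."""
    A = np.zeros((n_genes, n_genes))
    for genes in gene_sets:
        idx = np.asarray(genes)
        A[np.ix_(idx, idx)] = 1.0
    np.fill_diagonal(A, 0.0)  # no self-loops
    return A


def graph_regularized_nmf(X, A, n_factors, lam=1.0, n_iter=200, seed=0, eps=1e-10):
    """Factor X (cells x genes) as W @ H with a graph penalty on H.

    Objective: ||X - W H||_F^2 + lam * trace(H L H^T), where L = D - A is the
    Laplacian of the gene-gene graph. The penalty pulls the loadings of
    connected genes toward each other within every factor.
    """
    rng = np.random.default_rng(seed)
    n_cells, n_genes = X.shape
    W = rng.random((n_cells, n_factors))   # cell-by-factor scores
    H = rng.random((n_factors, n_genes))   # factor-by-gene loadings
    D = np.diag(A.sum(axis=1))             # degree matrix
    L = D - A                              # graph Laplacian
    history = []
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X + lam * H @ A) / (W.T @ W @ H + lam * H @ D + eps)
        loss = np.sum((X - W @ H) ** 2) + lam * np.trace(H @ L @ H.T)
        history.append(loss)
    return W, H, history


# Toy example: 30 cells, 8 genes, two hypothetical input gene sets.
rng = np.random.default_rng(1)
X = rng.poisson(2.0, size=(30, 8)).astype(float)
gene_sets = [[0, 1, 2], [5, 6, 7]]
A = gene_set_graph(gene_sets, n_genes=8)
W, H, history = graph_regularized_nmf(X, A, n_factors=3, lam=1.0)
```

Raising `lam` pulls the loadings of genes in the same input set closer together. Setting it to zero recovers plain NMF, in which factors are free to capture structure unrelated to the supplied gene sets.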