Department of Biostatistics, Yale University, New Haven, CT, USA.
Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, USA.
Nat Commun. 2023 Aug 10;14(1):4846. doi: 10.1038/s41467-023-40503-7.
Advances in single-cell RNA sequencing (scRNA-seq) technology have enabled the direct inference of co-expression within specific cell types, improving our understanding of cell-type-specific biological functions. For this task, large variations in sequencing depth and measurement errors in scRNA-seq data pose two significant challenges that existing methods have not adequately addressed. We propose a statistical approach, CS-CORE, for estimating and testing cell-type-specific co-expressions that explicitly models sequencing depth variations and measurement errors in scRNA-seq data. Systematic evaluations show that most existing methods suffered from inflated false positives as well as biased co-expression estimates and clustering results, whereas CS-CORE gave accurate estimates in these experiments. When applied to scRNA-seq data from postmortem brain samples from Alzheimer's disease patients and controls and from blood samples from COVID-19 patients and controls, CS-CORE identified cell-type-specific co-expressions and differential co-expressions that were more reproducible and/or more enriched for relevant biological pathways than those inferred by existing methods.
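To illustrate why sequencing depth variation and measurement error matter for co-expression estimation, the following is a minimal simulation sketch. It is not the CS-CORE implementation. It assumes, purely for illustration, that observed counts are Poisson draws whose mean is the cell's sequencing depth times a latent expression level, and it uses a simple unweighted moment estimator; the function names, the lognormal latent model, and all parameter values are hypothetical choices not taken from the abstract.

```python
import numpy as np


def simulate_counts(n_cells=50_000, latent_corr=0.6, seed=0):
    """Simulate two genes under an ASSUMED model: counts ~ Poisson(depth * expression)."""
    rng = np.random.default_rng(seed)
    # Sequencing depth varies tenfold across cells (hypothetical range).
    depth = rng.uniform(500, 5000, size=n_cells)
    # Latent expression levels: correlated lognormal (hypothetical choice).
    cov = 0.25 * np.array([[1.0, latent_corr], [latent_corr, 1.0]])
    log_expr = rng.multivariate_normal(np.zeros(2), cov, size=n_cells)
    expr = 1e-3 * np.exp(log_expr)  # shape (n_cells, 2)
    # Measurement error: Poisson sampling around depth * expression.
    counts = rng.poisson(depth[:, None] * expr)
    return depth, expr, counts


def naive_corr(depth, counts):
    """Pearson correlation of depth-normalized counts (ignores measurement error)."""
    norm = counts / depth[:, None]
    return np.corrcoef(norm[:, 0], norm[:, 1])[0, 1]


def moment_corr(depth, counts):
    """Unweighted moment estimator under the assumed Poisson model.

    With s = depth and latent mean mu, variance var, covariance cov:
      E[x_j]                           = s * mu_j
      E[(x_j - s*mu_j)^2]              = s * mu_j + s^2 * var_j
      E[(x_j - s*mu_j)(x_k - s*mu_k)]  = s^2 * cov_jk
    Each parameter is fitted by least squares on these moment equations.
    """
    s = depth[:, None]
    mu = (s * counts).sum(axis=0) / (s**2).sum()
    resid = counts - s * mu
    s2 = depth**2
    s4_sum = (s2**2).sum()
    # Subtract the Poisson noise term s * mu before fitting the variance.
    var = (s2[:, None] * (resid**2 - s * mu)).sum(axis=0) / s4_sum
    cov = (s2 * resid[:, 0] * resid[:, 1]).sum() / s4_sum
    # Note: var can be non-positive in small or noisy samples; not handled here.
    return cov / np.sqrt(var[0] * var[1])


if __name__ == "__main__":
    depth, expr, counts = simulate_counts()
    print("latent correlation:", np.corrcoef(expr[:, 0], expr[:, 1])[0, 1])
    print("naive estimate:    ", naive_corr(depth, counts))
    print("moment estimate:   ", moment_corr(depth, counts))
```

In this sketch the naive estimate is pulled toward zero because Poisson noise inflates the variance of each normalized gene without adding to their covariance, while the moment estimator removes the noise term before forming the correlation. The unweighted least-squares fit is chosen for brevity and becomes unstable when depth varies much more widely than in this simulation.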