Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, UK.
Theory of Condensed Matter Group, The Cavendish Laboratory, University of Cambridge, Cambridge, UK.
Nat Genet. 2023 Nov;55(11):1998-2008. doi: 10.1038/s41588-023-01523-7. Epub 2023 Oct 12.
Joint analysis of single-cell genomics data from diseased tissues and a healthy reference can reveal altered cell states. We investigate whether integrated collections of data from healthy individuals (cell atlases) are suitable references for disease-state identification and whether matched control samples are needed to minimize false discoveries. We demonstrate that using a reference atlas for latent space learning followed by differential analysis against matched controls leads to improved identification of disease-associated cells, especially with multiple perturbed cell types. Additionally, when an atlas is available, reducing control sample numbers does not increase false discovery rates. Jointly analyzing data from a COVID-19 cohort and a blood cell atlas, we improve detection of infection-related cell states linked to distinct clinical severities. Similarly, we studied disease states in pulmonary fibrosis using a healthy lung atlas, characterizing two distinct aberrant basal states. Our analysis provides guidelines for designing disease cohort studies and optimizing cell atlas use.
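The design described in the abstract (learn the latent space on the atlas, then compare disease cells against matched controls) can be illustrated with a toy sketch. This is not the authors' pipeline. It assumes PCA as a stand-in for the latent model and a simple k-nearest-neighbour disease fraction as a stand-in for a proper differential abundance test. All function names, the simulated data, and the parameter values are hypothetical.

```python
import numpy as np

def fit_latent(atlas, n_components=2):
    """Fit a PCA embedding on the atlas only (stand-in for a learned latent space)."""
    mean = atlas.mean(axis=0)
    _, _, vt = np.linalg.svd(atlas - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(x, mean, components):
    """Map new cells into the atlas-derived latent space."""
    return (x - mean) @ components.T

def disease_neighbour_fraction(z_control, z_disease, k=10):
    """For each disease cell, the fraction of its k nearest neighbours
    (among control and disease cells, excluding itself) that are disease cells."""
    z = np.vstack([z_control, z_disease])
    is_disease = np.r_[np.zeros(len(z_control), bool), np.ones(len(z_disease), bool)]
    d = np.linalg.norm(z_disease[:, None, :] - z[None, :, :], axis=2)
    rows = np.arange(len(z_disease))
    d[rows, len(z_control) + rows] = np.inf  # exclude each cell from its own neighbours
    nn = np.argsort(d, axis=1)[:, :k]
    return is_disease[nn].mean(axis=1)

def simulate(rng, n, centres, sd=0.5, n_genes=5):
    """Toy cells: Gaussian noise on every gene, plus a state-specific shift on gene 0."""
    x = rng.normal(0.0, sd, size=(n, n_genes))
    x[:, 0] += rng.choice(centres, size=n)
    return x

rng = np.random.default_rng(0)
atlas = simulate(rng, 300, [0.0, 6.0])    # healthy reference with two cell states
control = simulate(rng, 100, [0.0, 6.0])  # matched controls, same two states
disease = np.vstack([
    simulate(rng, 80, [0.0, 6.0]),        # disease cells in healthy-like states
    simulate(rng, 20, [3.0]),             # disease-specific (perturbed) state
])
perturbed = np.r_[np.zeros(80, bool), np.ones(20, bool)]

mean, comps = fit_latent(atlas)
scores = disease_neighbour_fraction(
    project(control, mean, comps), project(disease, mean, comps)
)
print(scores[perturbed].mean(), scores[~perturbed].mean())
```

In this toy setting the atlas is used only to define the embedding, while the enrichment score is computed only against the matched controls, which mirrors the division of roles the abstract recommends. The simulated perturbed state is placed along an axis of variation that the atlas already captures. A state that is orthogonal to the atlas-derived components would not be visible in this simplified embedding.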