Benchmarking single cell transcriptome matching methods for incremental growth of reference atlases.

Authors

Hu Joyce, Peng Beverly, Pankajam Ajith V, Xu Bingfang, Deshpande Vikrant Anil, Bueckle Andreas, Herr Bruce W, Börner Katy, Dupont Christopher, Scheuermann Richard H, Zhang Yun

Publication information

bioRxiv. 2025 Apr 16:2025.04.10.648034. doi: 10.1101/2025.04.10.648034.

DOI:10.1101/2025.04.10.648034

PMID:40568082

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12190912/

Abstract

BACKGROUND

Advances in single cell technologies have driven significant progress in constructing a multiscale, pan-organ Human Reference Atlas (HRA) of healthy human cells, though challenges remain in harmonizing cell types and unifying nomenclature. Multiple machine learning and artificial intelligence methods, including models pre-trained and fine-tuned on large-scale atlas data, are publicly available to the single cell community for computationally annotating cell clusters and matching them to the reference atlas.

RESULTS

This study benchmarks four computational tools for cell type annotation and matching (Azimuth, CellTypist, scArches, and FR-Match) using two lung atlas datasets: the Human Lung Cell Atlas (HLCA) and the LungMAP single-cell reference (CellRef). Although overall performance was high when algorithmic cell type annotations were compared with expert-annotated data, accuracy varied, especially for rare cell types, underlining the need for improved consistency across cell type prediction methods. The benchmarked methods were used to cross-compare and incrementally integrate 61 cell types from HLCA and 48 cell types from CellRef, resulting in a meta-atlas of 41 matched cell types, 20 HLCA-specific cell types, and 7 CellRef-specific cell types.

CONCLUSION

This study reveals complementary strengths of the benchmarked methods and presents a framework for incremental growth of the cell type inventory in reference atlases, leading to 68 unique cell types in the meta-atlas across CellRef and HLCA. The benchmarking analysis contributes to improving the coverage and quality of HRA construction by assessing the reliability and performance of cell type annotation approaches for single cell transcriptomics datasets.
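
The counts reported above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' method: it assumes each matched cell type pairs exactly one HLCA type with one CellRef type (the abstract does not state this, although the reported numbers are consistent with it), and the function name `meta_atlas_counts` is hypothetical.

```python
# Illustrative sketch only. The input counts come from the abstract;
# the one-to-one matching assumption and the function name are ours.

def meta_atlas_counts(n_hlca: int, n_cellref: int, n_matched: int) -> dict:
    """Count cell types in a merged inventory, counting each matched pair once."""
    if n_matched > min(n_hlca, n_cellref):
        raise ValueError("matched types cannot exceed either inventory")
    hlca_specific = n_hlca - n_matched        # HLCA types with no CellRef match
    cellref_specific = n_cellref - n_matched  # CellRef types with no HLCA match
    return {
        "matched": n_matched,
        "hlca_specific": hlca_specific,
        "cellref_specific": cellref_specific,
        "total": n_matched + hlca_specific + cellref_specific,
    }

counts = meta_atlas_counts(n_hlca=61, n_cellref=48, n_matched=41)
print(counts)
# {'matched': 41, 'hlca_specific': 20, 'cellref_specific': 7, 'total': 68}
```

Under that assumption, 61 - 41 = 20 and 48 - 41 = 7, so the meta-atlas holds 41 + 20 + 7 = 68 cell types, in agreement with the figures in the abstract.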
