scDOT: enhancing single-cell RNA-Seq data annotation and uncovering novel cell types through multi-reference integration.

Author information

Xiong Yi-Xuan, Zhang Xiao-Fei

Affiliations

School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China.

Key Laboratory of Nonlinear Analysis & Applications (Ministry of Education), Central China Normal University, Wuhan 430079, China.

Publication information

Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae072.

Abstract

The proliferation of single-cell RNA-seq data has greatly enhanced our ability to comprehend the intricate nature of diverse tissues. However, accurately annotating cell types in such data, especially when handling multiple reference datasets and identifying novel cell types, remains a significant challenge. To address these issues, we introduce Single Cell annotation based on Distance metric learning and Optimal Transport (scDOT), a cell-type annotation method that integrates multiple reference datasets and uncovers previously unseen cell types. scDOT introduces two key innovations. First, by combining distance metric learning with optimal transport, it formulates a novel optimization framework. This framework learns the predictive power of each reference dataset for new query data and simultaneously establishes a probabilistic mapping between cells in the query data and reference-defined cell types. Second, scDOT derives an interpretable scoring system from the learned probabilistic mapping, enabling the identification of previously unseen cell types within the query data. To assess scDOT's capabilities, we systematically evaluate its performance on two collections of benchmark datasets spanning multiple tissues, sequencing technologies and cell types. Our experimental results consistently show that scDOT outperforms the compared methods in cell-type annotation and in the identification of previously unseen cell types. These advancements provide researchers with a tool for precise cell-type annotation, ultimately enriching our understanding of complex biological tissues.
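To make the idea of a probabilistic mapping between query cells and reference-defined cell types more concrete, the following is a minimal sketch and not the scDOT implementation. It assumes a fixed squared Euclidean cost to hypothetical cell-type centroids, uniform masses, and plain entropy-regularised optimal transport solved with Sinkhorn iterations. The abstract says that scDOT additionally learns the distance metric and the weight of each reference dataset, which this sketch omits. The `uncertainty` score and the 0.3 threshold are illustrative assumptions, not the paper's scoring system, and all data are toy values.

```python
import numpy as np

def sinkhorn_plan(cost, row_mass, col_mass, eps=0.05, n_iter=200):
    """Entropy-regularised optimal transport plan via Sinkhorn iterations."""
    K = np.exp(-cost / eps)              # Gibbs kernel of the cost matrix
    u = np.ones_like(row_mass)
    v = np.ones_like(col_mass)
    for _ in range(n_iter):
        u = row_mass / (K @ v)           # rescale rows to match row_mass
        v = col_mass / (K.T @ u)         # rescale columns to match col_mass
    return u[:, None] * K * v[None, :]

# Toy 2-D "expression" space: two hypothetical reference cell types (A, B).
centroids = np.array([[0.0, 0.0], [4.0, 0.0]])

# Toy query cells: three near A, three near B, one halfway between them.
query = np.array([
    [0.0, 0.0], [0.2, 0.1], [-0.1, 0.2],
    [4.0, 0.0], [3.8, 0.1], [4.1, 0.2],
    [2.0, 0.0],
])

# Squared Euclidean cost, scaled to [0, 1] so exp(-cost / eps) stays stable.
diff = query[:, None, :] - centroids[None, :, :]
cost = (diff ** 2).sum(axis=2)
cost = cost / cost.max()

row_mass = np.full(len(query), 1.0 / len(query))          # uniform over cells
col_mass = np.full(len(centroids), 1.0 / len(centroids))  # uniform over types

plan = sinkhorn_plan(cost, row_mass, col_mass)

# Row-normalising the plan gives each cell a distribution over cell types.
probs = plan / plan.sum(axis=1, keepdims=True)
labels = probs.argmax(axis=1)

# Hypothetical score: cells without a dominant type are flagged for review.
uncertainty = 1.0 - probs.max(axis=1)
flagged = uncertainty > 0.3
```

In this toy setup the cells close to a centroid receive almost all of their mass from that cell type, while the halfway cell splits its mass between the two types and is the only one flagged. A balanced transport plan forces every cell to be assigned somewhere, so a practical unseen-type score needs more care than this threshold suggests.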

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3372/10939303/4c187a055c09/bbae072f1.jpg
