LLM-based cell type annotation harmonization across single-cell studies using GCTHarmony.

Authors

Zhang Xingyuan, Ji Zhicheng

Affiliations

Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.

Computational Biology and Bioinformatics Program, Duke University School of Medicine, Durham, NC, USA.

Publication information

Res Sq. 2025 Aug 12:rs.3.rs-7151095. doi: 10.21203/rs.3.rs-7151095/v1.

Abstract

A major challenge in integrating previously analyzed single-cell RNA-seq studies is the inconsistency of cell type annotations. To address this, we developed GCTHarmony, an LLM-based method for harmonizing cell type annotations across single-cell studies. Utilizing OpenAI's text embedding model, GCTHarmony accurately maps arbitrary cell type annotations to standardized cell ontology terms and reconciles discrepancies in annotation hierarchies across studies. In a real data example, we show that GCTHarmony substantially improves the consistency of cell type annotations across single-cell studies.
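The mapping step described in the abstract can be illustrated with a minimal sketch. This is not GCTHarmony's implementation: the function names (`toy_embed`, `cosine`, `map_to_ontology`) are hypothetical, the term list is a small hand-picked subset of Cell Ontology names, and a character-trigram count vector stands in for OpenAI's text embedding model so that the example runs offline. The assumed idea is only that each free-text annotation is embedded and assigned to the ontology term with the most similar embedding.

```python
import math
from collections import Counter

# Hand-picked subset of Cell Ontology term names, for illustration only.
ONTOLOGY_TERMS = ["T cell", "B cell", "natural killer cell", "monocyte", "macrophage"]

def toy_embed(text: str, n: int = 3) -> Counter:
    """Stand-in embedding: character n-gram counts (NOT the OpenAI model)."""
    padded = f" {text.lower().strip()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def map_to_ontology(label: str, terms=ONTOLOGY_TERMS, embed=toy_embed) -> str:
    """Return the ontology term whose embedding is closest to the label's."""
    label_vec = embed(label)
    return max(terms, key=lambda term: cosine(label_vec, embed(term)))

# Two hypothetical studies that label the same populations differently.
study_a = ["T cells", "Monocytes"]
study_b = ["t cell", "monocyte"]

harmonized_a = [map_to_ontology(label) for label in study_a]
harmonized_b = [map_to_ontology(label) for label in study_b]
print(harmonized_a, harmonized_b)
```

In this sketch the `embed` argument is the swap point: a real text embedding model would replace `toy_embed`, which only captures spelling overlap and would not relate synonyms such as an abbreviation and its full name. Reconciling differences in annotation hierarchy, which the abstract also mentions, is not covered here.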

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30df/12363920/2bfe68f20964/nihpp-rs7151095v1-f0001.jpg