Hou Wenpin, Ji Zhicheng
Department of Biostatistics, Columbia University Mailman School of Public Health, New York City, NY, USA.
Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC, USA.
Nat Methods. 2024 Aug;21(8):1462-1465. doi: 10.1038/s41592-024-02235-4. Epub 2024 Mar 25.
Here we demonstrate that the large language model GPT-4 can accurately annotate cell types using marker gene information in single-cell RNA sequencing analysis. When evaluated across hundreds of tissue and cell types, GPT-4 generates cell type annotations exhibiting strong concordance with manual annotations. This capability can considerably reduce the effort and expertise required for cell type annotation. Additionally, we have developed an R software package, GPTCelltype, for automated cell type annotation with GPT-4.
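The abstract describes annotation driven by marker genes, which in practice means turning each cluster's top marker genes into a text prompt and mapping the model's reply back to clusters. The following is a minimal Python sketch of that idea under stated assumptions: the function names `build_annotation_prompt` and `parse_annotation_reply`, the prompt wording, and the one-line-per-cluster reply format are all hypothetical and are not taken from the GPTCelltype package. No model is called; the reply string is a hand-written stand-in.

```python
from typing import Dict, List


def build_annotation_prompt(markers: Dict[str, List[str]], tissue: str, top_n: int = 10) -> str:
    """Build one prompt listing the top marker genes of every cluster.

    Hypothetical helper: the wording below is an assumption, not the
    prompt used by GPTCelltype.
    """
    lines = [
        f"Identify cell types of {tissue} cells using the following markers.",
        "Give exactly one cell type name per row, in the same order.",
    ]
    for cluster, genes in markers.items():
        # Keep only the first top_n genes for each cluster.
        lines.append(f"{cluster}: {', '.join(genes[:top_n])}")
    return "\n".join(lines)


def parse_annotation_reply(reply: str, clusters: List[str]) -> Dict[str, str]:
    """Pair each non-empty reply line with a cluster, in order.

    Assumes the model answered with one cell type per line; raises if
    the number of lines does not match the number of clusters.
    """
    names = [line.strip() for line in reply.splitlines() if line.strip()]
    if len(names) != len(clusters):
        raise ValueError(
            f"expected {len(clusters)} annotations, got {len(names)}"
        )
    return dict(zip(clusters, names))


if __name__ == "__main__":
    # Well-known marker genes, used purely as illustration.
    markers = {
        "cluster0": ["CD3D", "CD3E", "IL7R"],
        "cluster1": ["MS4A1", "CD79A", "CD79B"],
    }
    prompt = build_annotation_prompt(markers, tissue="human PBMC", top_n=2)
    print(prompt)

    # Hand-written stand-in for a model reply; no API call is made here.
    fake_reply = "T cells\nB cells\n"
    print(parse_annotation_reply(fake_reply, list(markers)))
```

Sending the prompt to a model and handling malformed replies are left out on purpose; the length check in `parse_annotation_reply` is one simple guard against a reply that skips or merges clusters.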