GeneAgent: self-verification language agent for gene-set analysis using domain databases.

Authors

Wang Zhizheng, Jin Qiao, Wei Chih-Hsuan, Tian Shubo, Lai Po-Ting, Zhu Qingqing, Day Chi-Ping, Ross Christina, Leaman Robert, Lu Zhiyong

Affiliations

Division of Intramural Research, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.

Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.

Publication information

Nat Methods. 2025 Jul 28. doi: 10.1038/s41592-025-02748-6.

Abstract

Gene-set analysis seeks to identify the biological mechanisms underlying groups of genes with shared functions. Large language models (LLMs) have recently shown promise in generating functional descriptions for input gene sets but may produce factually incorrect statements, commonly referred to as hallucinations in LLMs. Here we present GeneAgent, an LLM-based AI agent for gene-set analysis that reduces hallucinations by autonomously interacting with biological databases to verify its own output. Evaluation of 1,106 gene sets collected from different sources demonstrates that GeneAgent is consistently more accurate than GPT-4 by a significant margin. We further applied GeneAgent to seven novel gene sets derived from mouse B2905 melanoma cell lines. Expert review confirmed that GeneAgent produces more relevant and comprehensive functional descriptions than GPT-4, providing valuable insights into gene functions and expediting knowledge discovery.
