Improving Automated Deep Phenotyping Through Large Language Models Using Retrieval Augmented Generation.

Authors

Garcia Brandon T, Westerfield Lauren, Yelemali Priya, Gogate Nikhita, Andres Rivera-Munoz E, Du Haowei, Dawood Moez, Jolly Angad, Lupski James R, Posey Jennifer E

Affiliations

Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, 77030, USA.

Medical Scientist Training Program, Baylor College of Medicine, Houston, TX, 77030, USA.

Publication information

medRxiv. 2024 Dec 2:2024.12.01.24318253. doi: 10.1101/2024.12.01.24318253.

Abstract

BACKGROUND

Diagnosing rare genetic disorders relies on precise phenotypic and genotypic analysis, with the Human Phenotype Ontology (HPO) providing a standardized language for capturing clinical phenotypes. Traditional HPO tools, such as Doc2HPO and ClinPhen, employ concept recognition to automate phenotype extraction but struggle with incomplete phenotype assignment, often requiring intensive manual review. While large language models (LLMs) hold promise for more context-driven phenotype extraction, they are prone to errors and "hallucinations," making them less reliable without further refinement. We present RAG-HPO, a Python-based tool that leverages Retrieval-Augmented Generation (RAG) to improve LLM accuracy in HPO term assignment, bypassing the limitations of baseline models while avoiding the time- and resource-intensive process of fine-tuning. RAG-HPO integrates a dynamic vector database, allowing real-time retrieval and contextual matching.

METHODS

The high-dimensional vector database used by RAG-HPO includes >54,000 phenotypic phrases mapped to HPO IDs, derived from the HPO database and supplemented with additional validated phrases. In the RAG-HPO workflow, an LLM first extracts phenotypic phrases, which are then matched by semantic similarity to entries in the vector database; the best term matches are returned to the LLM as context for final HPO term assignment. A benchmarking dataset of 120 published case reports with 1,792 manually assigned HPO terms was developed, and the performance of RAG-HPO was measured against the existing published tools Doc2HPO, ClinPhen, and FastHPOCR.
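
The retrieval step of this workflow can be illustrated with a minimal sketch. It is not the RAG-HPO implementation: the four-entry `PHRASE_DB`, the character-trigram `embed` function (a stand-in for a real embedding model), and the helper names `retrieve` and `build_context` are all assumptions made for illustration. The LLM calls that extract phrases and make the final assignment are omitted.

```python
import math
from collections import Counter

# Toy stand-in for the phrase database (assumption): phrase -> HPO ID.
# The real database is described as holding >54,000 phrases.
PHRASE_DB = {
    "seizure": "HP:0001250",
    "microcephaly": "HP:0000252",
    "global developmental delay": "HP:0001263",
    "intellectual disability": "HP:0001249",
}


def embed(text, n=3):
    """Character n-gram counts; a placeholder for a real embedding model."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(phrase, k=2):
    """Return the k database entries most similar to an extracted phrase."""
    query = embed(phrase)
    scored = [(cosine(query, embed(p)), p, hpo_id) for p, hpo_id in PHRASE_DB.items()]
    scored.sort(reverse=True)
    return scored[:k]


def build_context(phrases, k=2):
    """Format candidate terms as context for the final LLM assignment step."""
    lines = []
    for phrase in phrases:
        candidates = ", ".join(f"{p} ({hpo_id})" for _, p, hpo_id in retrieve(phrase, k))
        lines.append(f"{phrase}: {candidates}")
    return "\n".join(lines)


# Phrases an LLM might have extracted from a case report (hypothetical input).
print(build_context(["seizures", "delayed development"]))
```

In a real pipeline the trigram vectors would be replaced by dense embeddings and the dictionary by a vector store; the structure of the step (embed, rank by similarity, pass the top candidates back as context) is the part this sketch is meant to show.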

RESULTS

In evaluations, RAG-HPO, powered by Llama-3 70B and applied to a set of 120 case reports, achieved a mean precision of 0.84, recall of 0.78, and F1 score of 0.80, significantly surpassing conventional tools (p<0.00001). False-positive HPO term identification occurred for 15.8% (256/1,624) of terms; of these, only 2.7% (7/256) were hallucinations and 33.6% (86/256) were unrelated terms, while the remaining 63.7% (163/256) were relatives of the target term.
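
As a sketch of how such figures are typically computed, the snippet below scores each case report by exact HPO ID match and averages the per-report scores. The abstract does not specify the matching rule or the averaging scheme, so both are assumptions, and the function names `prf` and `macro_average` are hypothetical. The last part rechecks the false-positive breakdown from the counts reported above.

```python
def prf(predicted, gold):
    """Precision, recall, and F1 for one report, using exact HPO ID matches."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(cases):
    """Mean of per-report scores (assumed averaging scheme)."""
    scores = [prf(predicted, gold) for predicted, gold in cases]
    return tuple(sum(s[i] for s in scores) / len(scores) for i in range(3))


# False-positive counts as reported in the abstract.
false_positives = {"hallucination": 7, "unrelated": 86, "related": 163}
total_fp = sum(false_positives.values())
shares = {kind: round(100 * n / total_fp, 1) for kind, n in false_positives.items()}

print(total_fp)                          # 256
print(round(100 * total_fp / 1624, 1))   # 15.8
print(shares)  # {'hallucination': 2.7, 'unrelated': 33.6, 'related': 63.7}
```

Averaging per report would also be consistent with the reported mean F1 of 0.80 differing slightly from the harmonic mean of the mean precision and mean recall (about 0.81), since the two calculations generally give different values.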

CONCLUSIONS

RAG-HPO is a user-friendly, adaptable tool designed for secure evaluation of clinical text and outperforms standard HPO-matching tools in precision, recall, and F1. Its enhanced precision and recall represent a substantial advancement in phenotypic analysis, accelerating the identification of genetic mechanisms underlying rare diseases and driving progress in genetic research and clinical genomics.
