Enhancing Phenotype Recognition in Clinical Notes Using Large Language Models: PhenoBCBERT and PhenoGPT.

Author information

Yang Jingye, Liu Cong, Deng Wendy, Wu Da, Weng Chunhua, Zhou Yunyun, Wang Kai

Affiliations

Raymond G. Perelman Center for Cellular and Molecular Therapeutics, Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA.

Department of Mathematics, University of Pennsylvania, Philadelphia, PA 19104, USA.

Publication information

ArXiv. 2023 Nov 9:arXiv:2308.06294v2.

Abstract

To enhance phenotype recognition in clinical notes of genetic diseases, we developed two models, PhenoBCBERT and PhenoGPT, for expanding the vocabulary of Human Phenotype Ontology (HPO) terms. While HPO offers a standardized vocabulary for phenotypes, existing tools often fail to capture the full scope of phenotypes because of the limitations of traditional heuristic or rule-based approaches. Our models leverage large language models (LLMs) to automate the detection of phenotype terms, including those not in the current HPO. We compared these models to PhenoTagger, another HPO recognition tool, and found that our models identify a wider range of phenotype concepts, including previously uncharacterized ones. Our models also showed strong performance in case studies on biomedical literature. We evaluated the strengths and weaknesses of BERT-based and GPT-based models in aspects such as architecture and accuracy. Overall, our models enhance automated phenotype detection from clinical texts, improving downstream analyses of human diseases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5ee0/10659449/9cb066155896/nihpp-2308.06294v2-f0001.jpg
