Large Language Model-Supported Identification of Intellectual Disabilities in Clinical Free-Text Summaries: Mixed Methods Study.

Authors

Edwards Aleksandra, Pardiñas Antonio F, Kirov George, Rees Elliott, Camacho-Collados Jose

Affiliations

School of Computer Science and Informatics, Cardiff University, Cathays, Cardiff, CF24 4AG, United Kingdom, 1 029 2087 4812.

School of Medicine, Centre for Neuropsychiatric Genetics and Genomics, Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, United Kingdom.

Publication information

JMIR AI. 2025 Sep 18;4:e72256. doi: 10.2196/72256.

DOI:10.2196/72256
PMID:40966546
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12445779/
Abstract

BACKGROUND

Free-text clinical data are unstructured and narrative in nature, providing a rich source of patient information, but extracting research-quality clinical phenotypes from these data remains a challenge. Manually reviewing and extracting clinical phenotypes from free-text patient notes is a time-consuming process and not suitable for large-scale datasets. On the other hand, automatically extracting clinical phenotypes can be challenging because medical researchers lack gold-standard annotated references and other purpose-built resources, including software. Recent large language models (LLMs) can understand natural language instructions, which help them adapt to different domains and tasks without the need for specific training data. This makes them suitable for clinical applications, though their use in this field is limited.

OBJECTIVE

We aimed to develop an LLM pipeline based on the few-shot learning framework that could extract clinical information from free-text clinical summaries. We assessed the performance of this pipeline for classifying individuals with confirmed or suspected comorbid intellectual disability (ID) from clinical summaries of patients with severe mental illness and performed genetic validation of the results by testing whether individuals with LLM-defined ID carried more genetic variants known to confer risk of ID when compared with individuals without LLM-defined ID.
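The genetic validation described above compares how often risk variants occur in two groups. As a rough illustration only, the sketch below computes an odds ratio from a 2x2 table, adding 0.5 to every cell (the Haldane-Anscombe correction) and using a Woolf log-scale confidence interval. This is a simpler stand-in, not the Firth penalized regression the authors report using, and the counts are hypothetical, not taken from the study.

```python
import math


def corrected_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio for a 2x2 table with 0.5 added to every cell.

    a: LLM-defined ID, variant carrier      b: LLM-defined ID, non-carrier
    c: no LLM-defined ID, variant carrier   d: no LLM-defined ID, non-carrier
    Returns (odds_ratio, ci_low, ci_high); the interval is Woolf's, on the log scale.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical counts for illustration only (NOT the study's data).
odds_ratio, ci_low, ci_high = corrected_odds_ratio(a=4, b=16, c=5, d=175)
```

The 0.5 correction keeps the estimate finite when a cell is zero, which is common when the variants being counted are rare.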

METHODS

We developed novel approaches for performing classification, based on an intermediate information extraction (IE) step and human-in-the-loop techniques. We evaluated two models: Fine-Tuned Language Text-To-Text Transfer Transformer (Flan-T5) and Large Language Model Architecture (LLaMA). The dataset comprised 1144 free-text clinical summaries, of which 314 were manually annotated and used as a gold standard for evaluating automated methods. We also used published genetic data from 547 individuals to perform a genetic validation of the classification results; Firth's penalized logistic regression framework was used to test whether individuals with LLM-defined ID carry significantly more de novo variants in known developmental disorder risk genes than individuals without LLM-defined ID.
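One way to picture a classification that runs through an intermediate IE step with a human in the loop is sketched below. The abstract does not give the study's prompts or code, so everything here is an assumption made for illustration: the prompt wording, the demonstration texts, and the `llm` and `review` callables (the first stands in for a Flan-T5 or LLaMA call, the second for a human reviewer).

```python
from typing import Callable

# Hypothetical demonstrations (one per label); these are not the study's prompts.
IE_DEMO = (
    "Summary: Attended a special needs school and needed support with daily tasks.\n"
    "Evidence: attended a special needs school"
)
LABEL_DEMOS = (
    "Evidence: attended a special needs school\nLabel: ID\n\n"
    "Evidence: none\nLabel: no ID"
)


def build_ie_prompt(summary: str) -> str:
    """Stage 1: ask the model to quote evidence rather than decide directly."""
    return (
        "Quote any text that suggests intellectual disability (ID), "
        "or write 'none'.\n\n"
        f"{IE_DEMO}\n\nSummary: {summary}\nEvidence:"
    )


def build_label_prompt(evidence: str) -> str:
    """Stage 2: classify using only the extracted evidence."""
    return (
        "Label the evidence as 'ID' or 'no ID'.\n\n"
        f"{LABEL_DEMOS}\n\nEvidence: {evidence}\nLabel:"
    )


def classify(
    summary: str,
    llm: Callable[[str], str],
    review: Callable[[str, str, str], str],
) -> str:
    """Run IE, then classification, then hand the proposal to a reviewer."""
    evidence = llm(build_ie_prompt(summary)).strip()
    proposed = llm(build_label_prompt(evidence)).strip()
    # The reviewer sees the summary, the quoted evidence, and the proposed
    # label, and returns the final label (confirming or overriding it).
    return review(summary, evidence, proposed)
```

Separating extraction from labeling gives the reviewer a short quoted span to check instead of the full summary.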

RESULTS

The results demonstrate that a 2-stage approach, combining IE with manual validation, can effectively identify individuals with suspected IDs from free-text patient records, requiring only a single training example per classification label. The best-performing method based on the Flan-T5 model and incorporating the IE step achieved an F1-score of 0.867. Individuals classified as having ID by the best performing model were significantly enriched for de novo variants in known developmental disorder risk genes (odds ratio 29.1, 95% CI 7.36-107; P=2.1×10⁻⁵).
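For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it for one positive class against a gold standard. The label strings are assumptions, and the abstract does not say whether the reported 0.867 is a per-class or an averaged score, so this shows the metric itself rather than the study's evaluation script.

```python
def f1_score(gold, predicted, positive="ID"):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for illustration only (not study data).
gold = ["ID", "ID", "no ID", "no ID"]
predicted = ["ID", "no ID", "ID", "no ID"]
score = f1_score(gold, predicted)
```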

CONCLUSIONS

LLMs and in-context learning techniques combined with human-in-the-loop approaches can be highly beneficial for extraction and categorization of information from free-text clinical data. In this proof-of-concept study, we show that LLMs can be used to identify individuals with a severe mental illness who also have suspected ID, which is a biologically and clinically meaningful subgroup of patients.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d83e/12445779/183419f854fd/ai-v4-e72256-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d83e/12445779/95e6dc424967/ai-v4-e72256-g001.jpg
