Large Language Models for Healthcare Data Augmentation: An Example on Patient-Trial Matching.

Affiliations

Rice University, Houston, TX.

University of Texas Health Science Center, Houston, TX.

Publication information

AMIA Annu Symp Proc. 2024 Jan 11;2023:1324-1333. eCollection 2023.

Abstract

Matching patients with suitable clinical trials is essential for advancing medical research and providing optimal care. However, current approaches face challenges such as data standardization, ethical considerations, and a lack of interoperability between Electronic Health Records (EHRs) and clinical trial criteria. In this paper, we explore the potential of large language models (LLMs) to address these challenges by leveraging their advanced natural language generation capabilities to improve compatibility between EHRs and clinical trial descriptions. We propose an innovative privacy-aware data augmentation approach for LLM-based patient-trial matching (LLM-PTM), which harnesses the benefits of LLMs while ensuring the security and confidentiality of sensitive patient data. Our experiments demonstrate a 7.32% average improvement in performance using the proposed LLM-PTM method, and a 12.12% improvement in generalizability to new data. Additionally, we present case studies to further illustrate the effectiveness of our approach and provide a deeper understanding of its underlying principles.
