Lessons learned on information retrieval in electronic health records: a comparison of embedding models and pooling strategies.

Authors

Myers Skatje, Miller Timothy A, Gao Yanjun, Churpek Matthew M, Mayampurath Anoop, Dligach Dmitriy, Afshar Majid

Affiliations

Department of Medicine, University of Wisconsin-Madison, Madison, WI 53726, United States.

Computational Health Informatics Program, Boston Children's Hospital, Boston, MA 02215, United States.

Publication information

J Am Med Inform Assoc. 2025 Feb 1;32(2):357-364. doi: 10.1093/jamia/ocae308.

Abstract

OBJECTIVES

Applying large language models (LLMs) to the clinical domain is challenging because processing medical records is context-heavy. Retrieval-augmented generation (RAG) offers a solution by facilitating reasoning over large text sources. However, the retrieval system alone has many parameters to optimize. This paper presents an ablation study exploring how different embedding models and pooling methods affect information retrieval in the clinical domain.

MATERIALS AND METHODS

On 3 retrieval tasks spanning 2 electronic health record (EHR) data sources, we compared 7 models, including medical- and general-domain models, specialized encoder embedding models, and off-the-shelf decoder LLMs. We also examined the choice of embedding pooling strategy for each model, varying it independently for the query and for the text to be retrieved.
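To make the idea of choosing a pooling strategy separately for the query and for the candidate text concrete, here is a minimal sketch in plain Python. It is an illustration only, not the authors' implementation: the abstract does not list which pooling strategies were compared, so the four shown here (mean, max, first token, last token), the cosine scoring, and all function names are assumptions. Token vectors are toy values standing in for a model's per-token outputs.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]
TokenVectors = List[Vector]  # one vector per token; assumed non-empty


def mean_pool(tokens: TokenVectors) -> Vector:
    """Average each dimension over all tokens."""
    return [sum(col) / len(tokens) for col in zip(*tokens)]


def max_pool(tokens: TokenVectors) -> Vector:
    """Take the per-dimension maximum over all tokens."""
    return [max(col) for col in zip(*tokens)]


def first_token_pool(tokens: TokenVectors) -> Vector:
    """Use the first token's vector (e.g. a [CLS]-style token)."""
    return list(tokens[0])


def last_token_pool(tokens: TokenVectors) -> Vector:
    """Use the last token's vector, a common choice for decoder models."""
    return list(tokens[-1])


# Hypothetical registry of pooling strategies (names are illustrative).
POOLERS: Dict[str, Callable[[TokenVectors], Vector]] = {
    "mean": mean_pool,
    "max": max_pool,
    "first": first_token_pool,
    "last": last_token_pool,
}


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_documents(query: TokenVectors, docs: List[TokenVectors],
                   query_pool: str, doc_pool: str) -> List[int]:
    """Return document indices sorted by similarity, best first.

    The query and the documents may use different pooling strategies.
    """
    q_vec = POOLERS[query_pool](query)
    scored = [(cosine(q_vec, POOLERS[doc_pool](d)), i) for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda pair: -pair[0])]


# Toy data: a two-token query and two single-token documents.
toy_query = [[1.0, 0.0], [1.0, 0.0]]
toy_docs = [[[0.0, 1.0]], [[1.0, 0.1]]]
print(rank_documents(toy_query, toy_docs, query_pool="mean", doc_pool="max"))  # [1, 0]
```

An ablation in this spirit would loop over every (query_pool, doc_pool) pair for each model and compare the resulting rankings against relevance labels.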

RESULTS

We found that the choice of embedding model significantly impacts retrieval performance, with BGE, a comparatively small general-domain model, consistently outperforming all others, including medical-specific models. However, our findings also revealed substantial variability across datasets and query text phrasings. We also determined the best pooling methods for each of these models to guide future design of retrieval systems.

DISCUSSION

The choice of embedding model, pooling strategy, and query formulation can significantly impact retrieval performance, and the performance of these models on other public benchmarks does not necessarily transfer to new domains. The high variability in performance across different query phrasings suggests that the choice of query may need to be tuned and validated for each task, or even for each institution's EHR.
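One way to check how sensitive a retrieval setup is to query wording is to score several phrasings of the same information need and look at the spread. The sketch below is a hedged illustration, not the evaluation used in the paper: the abstract does not name its metric, so reciprocal rank is an assumption, as are the function names and the toy note identifiers.

```python
from statistics import mean, pstdev
from typing import Dict, List, Set


def reciprocal_rank(ranking: List[str], relevant: Set[str]) -> float:
    """1 / position of the first relevant item, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranking, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def phrasing_spread(rankings_by_phrasing: Dict[str, List[str]],
                    relevant: Set[str]) -> Dict[str, float]:
    """Summarize how much the score varies across query phrasings."""
    scores = [reciprocal_rank(r, relevant) for r in rankings_by_phrasing.values()]
    return {
        "mean": mean(scores),
        "std": pstdev(scores),  # population standard deviation
        "min": min(scores),
        "max": max(scores),
    }


# Toy example: three hypothetical phrasings of one query, each producing
# a different ranking of the same three notes.
toy_rankings = {
    "phrasing_a": ["note1", "note2", "note3"],
    "phrasing_b": ["note2", "note1", "note3"],
    "phrasing_c": ["note3", "note2", "note1"],
}
summary = phrasing_spread(toy_rankings, relevant={"note1"})
print(summary)
```

A large gap between the minimum and maximum here would be a signal that the query wording needs to be validated on local data before deployment.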

CONCLUSION

This study provides empirical evidence to guide the selection of models and pooling strategies for RAG frameworks in healthcare applications. Further studies of this kind are vital for guiding the empirically grounded development of retrieval frameworks for the clinical domain, including those used within RAG.
