Learning to match patients to clinical trials using large language models.

Affiliations

CSIRO, Data61, 26 Pembroke Rd, Marsfield, 2122, NSW, Australia.

TU Wien, Favoritenstrasse 9-11, Vienna, 1040, Austria.

Publication details

J Biomed Inform. 2024 Nov;159:104734. doi: 10.1016/j.jbi.2024.104734. Epub 2024 Oct 9.

Abstract

OBJECTIVE

This study investigates the use of Large Language Models (LLMs) for matching patients to clinical trials (CTs) within an information retrieval pipeline. Our objective is to enhance the process of patient-trial matching by leveraging the semantic processing capabilities of LLMs, thereby improving the effectiveness of patient recruitment for clinical trials.

METHODS

We employed a multi-stage retrieval pipeline integrating various methodologies, including BM25 and Transformer-based rankers, along with LLM-based methods. Our primary datasets were the TREC Clinical Trials 2021-23 track collections. We compared LLM-based approaches, focusing on methods that leverage LLMs in query formulation, filtering, relevance ranking, and re-ranking of CTs.
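As a rough illustration of the multi-stage idea described above, the following sketch retrieves candidate trials with BM25 and then re-orders them with a second-stage scorer. It is not the authors' implementation: the `BM25` class, `rerank`, `toy_scorer`, the parameter values `k1=1.2` and `b=0.75`, and the example trial texts are all assumptions made for illustration. In the study the second stage would be a Transformer-based or LLM-based ranker; here a simple token-overlap function stands in for it.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


class BM25:
    """Minimal in-memory BM25 index; k1 and b are assumed defaults."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.doc_tokens = [tokenize(d) for d in docs]
        self.doc_len = [len(t) for t in self.doc_tokens]
        self.avg_len = sum(self.doc_len) / len(docs)
        self.tf = [Counter(t) for t in self.doc_tokens]
        # Document frequency: number of documents containing each term.
        df = Counter(term for t in self.doc_tokens for term in set(t))
        n = len(docs)
        # The "+1 inside the log" variant keeps every IDF value positive.
        self.idf = {
            term: math.log(1 + (n - f + 0.5) / (f + 0.5)) for term, f in df.items()
        }

    def score(self, query, i):
        total = 0.0
        for term in tokenize(query):
            f = self.tf[i].get(term, 0)
            if f == 0:
                continue
            length_norm = 1 - self.b + self.b * self.doc_len[i] / self.avg_len
            total += self.idf[term] * f * (self.k1 + 1) / (f + self.k1 * length_norm)
        return total

    def top_k(self, query, k):
        scored = [(self.score(query, i), i) for i in range(len(self.doc_tokens))]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        # Drop documents that share no term with the query.
        return [i for s, i in scored[:k] if s > 0]


def rerank(query, candidate_ids, docs, scorer):
    """Second stage: re-order first-stage candidates with any scoring function."""
    return sorted(candidate_ids, key=lambda i: -scorer(query, docs[i]))


def toy_scorer(query, doc):
    """Hypothetical stand-in for a neural or LLM ranker: fraction of query terms matched."""
    q, d = set(tokenize(query)), set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


# Made-up trial descriptions and patient summary, for illustration only.
trials = [
    "inclusion: adults with type 2 diabetes on metformin",
    "inclusion: children with asthma using inhaled steroids",
    "inclusion: adults with hypertension and type 2 diabetes",
    "inclusion: healthy volunteers for vaccine study",
]
patient = "58 year old adult with type 2 diabetes and hypertension"

bm25 = BM25(trials)
candidates = bm25.top_k(patient, k=3)                        # cheap first stage
ranking = rerank(patient, candidates, trials, toy_scorer)    # costlier second stage
print(ranking)
```

The design point is that the expensive scorer only sees the short candidate list produced by BM25, which is where the effectiveness-efficiency trade-off arises: a larger `k` gives the re-ranker more chances to surface eligible trials but costs more second-stage calls.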

RESULTS

Our results indicate that LLM-based systems, particularly those involving re-ranking with a fine-tuned LLM, outperform traditional methods in terms of nDCG and Precision measures. The study demonstrates that fine-tuning LLMs enhances their ability to find eligible trials. Moreover, our LLM-based approach is competitive with state-of-the-art systems in the TREC challenges. The study shows the effectiveness of LLMs in CT matching, highlighting their potential in handling complex semantic analysis and improving patient-trial matching. However, the use of LLMs increases the computational cost and reduces efficiency. We provide a detailed analysis of effectiveness-efficiency trade-offs.
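For readers unfamiliar with the measures named above, here is a minimal sketch of nDCG@k and Precision@k over graded relevance judgments. Several details are assumptions rather than facts from the abstract: the linear-gain form of DCG with a log2 discount, the three-level grading (0 = not relevant, 1 = excluded, 2 = eligible), the choice to count only grade 2 as a hit for precision, and the placeholder trial identifiers.

```python
import math


def dcg_at_k(gains, k):
    """Discounted cumulative gain with linear gain and a log2(rank + 1) discount."""
    return sum(g / math.log2(pos + 2) for pos, g in enumerate(gains[:k]))


def ndcg_at_k(ranked_ids, qrels, k):
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ranking."""
    gains = [qrels.get(doc_id, 0) for doc_id in ranked_ids]
    ideal = sorted(qrels.values(), reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg > 0 else 0.0


def precision_at_k(ranked_ids, qrels, k, min_grade=2):
    """Fraction of the top k whose grade is at least min_grade (assumed: eligible only)."""
    hits = sum(1 for doc_id in ranked_ids[:k] if qrels.get(doc_id, 0) >= min_grade)
    return hits / k


# Hypothetical judgments for one patient topic (identifiers are placeholders).
qrels = {"TRIAL_A": 2, "TRIAL_B": 1, "TRIAL_C": 0, "TRIAL_D": 2}
system_ranking = ["TRIAL_A", "TRIAL_B", "TRIAL_D", "TRIAL_C"]

print(round(ndcg_at_k(system_ranking, qrels, k=4), 4))
print(precision_at_k(system_ranking, qrels, k=2))
```

Because nDCG rewards placing higher-graded trials earlier, a re-ranker that moves eligible trials above excluded ones raises the score even when the set of retrieved trials is unchanged.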

CONCLUSION

This research demonstrates the promising role of LLMs in enhancing the patient-to-clinical trial matching process, offering a significant advancement in the automation of patient recruitment. Future work should explore optimising the balance between computational cost and retrieval effectiveness in practical applications.
