Department of Gynecology and Center for Hereditary Breast and Ovarian Cancer, Technical University of Munich (TUM), School of Medicine and Health, Klinikum rechts der Isar, TUM University Hospital, Munich, Germany.
Center for Personalized Medicine (ZPM), Technical University of Munich (TUM), School of Medicine and Health, Klinikum rechts der Isar, TUM University Hospital, Munich, Germany.
JCO Precis Oncol. 2024 Oct;8:e2400478. doi: 10.1200/PO-24-00478. Epub 2024 Oct 30.
Rapidly expanding medical literature challenges oncologists seeking targeted cancer therapies. General-purpose large language models (LLMs) lack domain-specific knowledge, limiting their clinical utility. This study introduces the LLM system Medical Evidence Retrieval and Data Integration for Tailored Healthcare (MEREDITH), designed to support treatment recommendations in precision oncology. Built on LLM, MEREDITH uses and .
We evaluated MEREDITH on 10 publicly available fictional oncology cases with iterative feedback from a molecular tumor board (MTB) at a major German cancer center. Initially limited to -indexed literature (draft system), MEREDITH was enhanced to incorporate clinical studies on drug response within the specific tumor type, trial databases, drug approval status, and oncologic guidelines. The MTB provided a benchmark with manually curated treatment recommendations and assessed the clinical relevance of LLM-generated options (qualitative assessment). We measured semantic cosine similarity between LLM suggestions and clinician responses (quantitative assessment).
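The quantitative assessment above rests on cosine similarity between text embeddings. The following is a minimal sketch of that metric only, not the study's pipeline: the abstract does not name the embedding model, so the function assumes the LLM suggestion and the clinician response have already been converted to equal-length numeric vectors, and the name `cosine_similarity` and the toy vectors are hypothetical placeholders.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors.

    Assumption: u and v are embeddings of an LLM suggestion and a
    clinician response produced by the same (unspecified) embedding model.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Toy placeholder vectors, not data from the study.
llm_suggestion = [0.2, 0.7, 0.1]
clinician_response = [0.3, 0.6, 0.2]
score = cosine_similarity(llm_suggestion, clinician_response)
```

A score near 1 means the two texts point in nearly the same direction in embedding space, and a score near 0 means they are largely unrelated. Averaging such scores over all case and recommendation pairs is one plausible way to obtain a single system-level figure, though the abstract does not state how the scores were aggregated.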
MEREDITH identified a broader range of treatment options (median 4) compared with MTB experts (median 2). These options included therapies on the basis of preclinical data and combination treatments, expanding the treatment possibilities for consideration by the MTB. This broader approach was achieved by incorporating a curated medical data set that contextualized molecular targetability. Mirroring the approach MTB experts use to evaluate MTB cases improved the LLM's ability to generate relevant suggestions. This is supported by high concordance between LLM suggestions and expert recommendations (94.7% for the enhanced system) and a significant increase in semantic similarity from the draft to the enhanced system (from 0.71 to 0.76, P = .01).
Expert feedback and domain-specific data augment LLM performance. Future research should investigate responsible LLM integration into real-world clinical workflows.