Jun Hyeji, Tanaka Yutaro, Johri Shreya, Carvalho Filipe LF, Jordan Alexander C, Labaki Chris, Nagy Matthew, O'Meara Tess A, Pappa Theodora, Pimenta Erica Maria, Saad Eddy, Yang David D, Gillani Riaz, Tewari Alok K, Reardon Brendan, Van Allen Eliezer
Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02115, USA.
Broad Institute of Harvard and MIT, Cambridge, MA 02142, USA.
medRxiv. 2025 Jul 24:2025.05.09.25327312. doi: 10.1101/2025.05.09.25327312.
The rapid expansion of molecularly informed therapies in oncology, coupled with evolving FDA regulatory approvals, poses a challenge for oncologists seeking to integrate precision cancer medicine into patient care. Large language models (LLMs) have demonstrated potential for clinical applications, but their reliance on general knowledge limits their ability to provide up-to-date and niche treatment recommendations. To address this challenge, we developed a retrieval-augmented generation (RAG) LLM workflow, RAG-LLM, augmented with the Molecular Oncology Almanac (MOAlmanac), a curated precision oncology knowledge resource. We compared this approach with alternative frameworks (i.e., LLM-only) in making biomarker-driven treatment recommendations, using both unstructured and structured data for augmentation, and evaluated performance across 234 therapy-biomarker relationships. Finally, we assessed the real-world applicability of the workflow by testing it on actual queries from practicing oncologists. The LLM-only approach achieved 62-75% accuracy in biomarker-driven treatment recommendations, whereas RAG-LLM achieved 79-91% accuracy with an unstructured database and 94-95% accuracy with a structured database. Beyond accuracy, structured context augmentation significantly increased precision (49% to 80%) and F1-score (57% to 84%) compared with unstructured data augmentation. On queries provided by practicing oncologists, RAG-LLM achieved 81-90% accuracy. These findings demonstrate that the RAG-LLM framework effectively delivers precise and reliable FDA-approved precision oncology therapy recommendations grounded in individualized clinical data, and they highlight the importance of integrating a well-curated, structured knowledge base in this process. Although our RAG-LLM approach significantly improved accuracy compared with standard LLMs, further efforts are needed to improve the reliability of responses for ambiguous or unsupported clinical scenarios.
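The precision and F1-score reported in the abstract can be illustrated with a minimal sketch. It assumes a set-based scoring scheme in which the therapies recommended for one query are compared against a reference set of approved therapies. The abstract does not describe the authors' actual scoring procedure, so the function name `precision_recall_f1` and the placeholder therapy names are hypothetical.

```python
def precision_recall_f1(recommended, reference):
    """Set-based precision, recall, and F1 for a single query.

    Assumption: a recommendation counts as correct only if it appears
    in the reference set. This is an illustrative scheme, not
    necessarily the one used in the study.
    """
    recommended, reference = set(recommended), set(reference)
    true_positives = len(recommended & reference)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(reference) if reference else 0.0

    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical placeholder names, not real therapies or study data.
recommended = ["therapy_A", "therapy_B", "therapy_X"]
reference = ["therapy_A", "therapy_B", "therapy_C"]

precision, recall, f1 = precision_recall_f1(recommended, reference)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Under this scheme, a response that lists extra therapies beyond the reference set lowers precision even when every reference therapy is recovered. That is one plausible way a tighter, structured context could raise precision and F1 more than it raises accuracy.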