Evidence-based Knowledge Synthesis and Hypothesis Validation: Navigating Biomedical Knowledge Bases via Explainable AI and Agentic Systems.

Authors

Pelletier Alexander R, Ramirez Joseph, Sankar Baradwaj Simha, Adam Irsyad, Yan Yu, Steinecke Dylan, Wang Wei, Watson Karol E, Ping Peipei

Affiliations

Department of Physiology, UCLA School of Medicine; Scalable Analytics Institute (ScAi) at Department of Computer Science, UCLA School of Engineering;

Department of Physiology, UCLA School of Medicine.

Publication information

J Vis Exp. 2025 Jun 13(220). doi: 10.3791/67525.

DOI:10.3791/67525

PMID:40587410

Abstract

The scale of biomedical knowledge, spanning scientific literature and curated knowledge bases, poses a significant challenge for investigators in processing, evaluating, and interpreting findings effectively. Large Language Models (LLMs) have emerged as powerful tools for navigating this complex knowledge landscape but may produce hallucinated responses. Retrieval-Augmented Generation (RAG) is essential for identifying relevant information to enhance accuracy and reliability. This protocol introduces RUGGED (Retrieval Under Graph-Guided Explainable disease Distinction), a comprehensive workflow designed to support knowledge integration, to mitigate bias, and to explore and validate new research directions. Biomedical information from publications and knowledge bases is synthesized and analyzed through text-mining association analysis and explainable graph prediction models to uncover potential drug-disease relationships. These findings, along with the source text corpus and knowledge bases, are incorporated into a framework that employs RAG-enhanced LLMs to enable users to explore hypotheses and investigate underlying mechanisms. A clinical use case demonstrates RUGGED's capability in evaluating and recommending therapeutics for Arrhythmogenic Cardiomyopathy (ACM) and Dilated Cardiomyopathy (DCM), analyzing prescribed drugs for molecular interactions and potential new applications. The platform reduces LLM hallucinations, highlights actionable insights, and streamlines the investigation of novel therapeutics.
