Pickard Joshua, Prakash Ram, Choi Marc Andrew, Oliven Natalie, Stansbury Cooper, Cwycyshyn Jillian, Galioto Nicholas, Gorodetsky Alex, Velasquez Alvaro, Rajapakse Indika
Gilbert S. Omenn Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, United States.
Department of Mathematics, University of Michigan, Ann Arbor, MI 48109, United States.
Bioinformatics. 2025 May 6;41(5). doi: 10.1093/bioinformatics/btaf159.
Integrating Large Language Models (LLMs) with research tools presents technical and reproducibility challenges for biomedical research. While commercial artificial intelligence (AI) systems are easy to adopt, they obscure data provenance, lack transparency, and can generate false information, making them unsuitable for many research problems. To address these challenges, we developed the Bioinformatics Retrieval Augmented Digital (BRAD) agent software system.
Here, we introduce BRAD, an agentic system that integrates LLMs with external tools and data to streamline research workflows. BRAD's modular agents retrieve information from literature, custom software, and online databases while maintaining transparent protocols to increase the reliability of AI-generated results. We apply BRAD to a biomarker discovery pipeline, automating both its execution and the generation of enrichment reports. This workflow contextualizes user data within the literature, enabling a level of interpretation and automation that surpasses conventional research tools. Beyond the workflow highlighted here, BRAD is a flexible system that has been deployed in other applications, including a chatbot, video RAG, and analysis of single-cell data.
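To make the idea of modular retrieval with a transparent protocol concrete, the following is a minimal sketch and not BRAD's actual implementation or API. The `Orchestrator` class, the two module functions, and the placeholder corpus entries are all hypothetical names and values introduced only for illustration. The sketch assumes that each module maps a query string to a list of text hits, and that transparency is approximated by logging which module returned which hits.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Orchestrator:
    """Hypothetical router: sends a query to each module and logs every call."""
    modules: Dict[str, Callable[[str], List[str]]]
    log: List[dict] = field(default_factory=list)

    def retrieve(self, query: str) -> List[str]:
        results: List[str] = []
        for name, module in self.modules.items():
            hits = module(query)
            # Record which module produced which hits so results stay traceable.
            self.log.append({"module": name, "query": query, "hits": hits})
            results.extend(hits)
        return results


def literature_module(query: str) -> List[str]:
    # Stand-in for a literature search over a toy in-memory corpus (placeholder text).
    corpus = ["GENE_A is linked to pathway X", "GENE_B is linked to pathway Y"]
    return [doc for doc in corpus if query.lower() in doc.lower()]


def database_module(query: str) -> List[str]:
    # Stand-in for an online database lookup (placeholder annotation).
    table = {"gene_a": ["GENE_A: example annotation"]}
    return table.get(query.lower(), [])


agent = Orchestrator({"literature": literature_module, "database": database_module})
context = agent.retrieve("GENE_A")

# The retrieved context would then be passed to an LLM prompt; the log
# shows where each piece of context came from.
for entry in agent.log:
    print(entry["module"], "->", entry["hits"])
```

In a design like this, the log is kept separate from the retrieved context, so the provenance of every hit can be inspected without changing what is passed on to the language model.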
The source code for BRAD is available at https://github.com/Jpickard1/BRAD. Instructions for pip installation, tutorials, and documentation are available on ReadTheDocs.