Ramos Mayk Caldas, Collison Christopher J, White Andrew D
FutureHouse Inc. San Francisco CA USA
Department of Chemical Engineering, University of Rochester Rochester NY USA
Chem Sci. 2024 Dec 9;16(6):2514-2572. doi: 10.1039/d4sc03921a. eCollection 2025 Feb 5.
Large language models (LLMs) have emerged as powerful tools in chemistry, significantly impacting molecule design, property prediction, and synthesis optimization. This review highlights LLM capabilities in these domains and their potential to accelerate scientific discovery through automation. We also review LLM-based autonomous agents: LLMs with a broader set of tools to interact with their surrounding environment. These agents perform diverse tasks such as paper scraping, interfacing with automated laboratories, and synthesis planning. As agents are an emerging topic, we extend our review of agents beyond chemistry and discuss their use across scientific domains. This review covers the recent history, current capabilities, and design of LLMs and autonomous agents, addressing specific challenges, opportunities, and future directions in chemistry. Key challenges include data quality and integration, model interpretability, and the need for standard benchmarks, while future directions point towards more sophisticated multi-modal agents and enhanced collaboration between agents and experimental methods. Because the field moves quickly, a repository has been built to keep track of the latest studies: https://github.com/ur-whitelab/LLMs-in-science.