Agents for sequential learning using multiple-fidelity data.

Affiliations

Energy and Materials Division, Toyota Research Institute, Los Altos, USA.

Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, USA.

Publication information

Sci Rep. 2022 Mar 18;12(1):4694. doi: 10.1038/s41598-022-08413-8.

DOI: 10.1038/s41598-022-08413-8

PMID: 35304496

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8933401/

Abstract

Sequential learning for materials discovery is a paradigm in which a computational agent solicits new data and uses it to update a model, in service of exploration (finding the largest number of materials that meet some criteria) or exploitation (finding materials with an ideal figure of merit). In real-world discovery campaigns, new data acquisition may be costly, and an optimal strategy may involve using and acquiring data with different levels of fidelity, such as first-principles calculations to supplement an experiment. In this work, we introduce agents that can operate on multiple data fidelities and benchmark their performance on an emulated discovery campaign to find materials with desired band gap values. The data come in two fidelities: density functional theory (DFT) calculations as low fidelity and experimental results as high fidelity. We demonstrate performance gains for agents that incorporate multi-fidelity data in two contexts: either using a large body of low-fidelity data as a prior knowledge base or acquiring low-fidelity data in tandem with experimental data. This advance provides a tool that enables materials scientists to test various acquisition and model hyperparameters to maximize the discovery rate of their own multi-fidelity sequential learning campaigns. It may also serve as a reference point for those interested in practical strategies for active or sequential learning campaigns when multiple data sources are available.
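
The first context described above (low-fidelity data used as a prior knowledge base) can be illustrated with a toy emulated campaign. The sketch below is not the authors' agents or software. It assumes synthetic candidates, a made-up linear relation between "DFT" and "experimental" band gaps, and hypothetical function names (`make_candidates`, `fit_line`, `run_campaign`). It compares an agent that ranks candidates using the low-fidelity value against one that acquires experiments in random order.

```python
import random


def make_candidates(n=200, seed=0):
    """Synthetic candidates (assumption): a hidden 'experimental' gap in eV
    and a cheap 'DFT' value that systematically underestimates it."""
    rng = random.Random(seed)
    cands = []
    for i in range(n):
        exp_gap = rng.uniform(0.0, 5.0)                  # high fidelity
        dft_gap = 0.6 * exp_gap + rng.gauss(0.0, 0.15)   # low fidelity
        cands.append({"id": i, "exp": exp_gap, "dft": dft_gap})
    return cands


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return 0.0, my
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def run_campaign(cands, use_dft, window=(1.6, 2.0), budget=40, batch=5, seed=0):
    """Acquire 'experiments' in batches; return how many acquired
    candidates have an experimental gap inside the target window."""
    rng = random.Random(seed)
    unseen = list(cands)
    rng.shuffle(unseen)          # the first batch is always random
    seen = []
    target = 0.5 * (window[0] + window[1])
    while len(seen) < budget and unseen:
        k = min(batch, budget - len(seen), len(unseen))
        if use_dft and seen:
            # Refit the DFT -> experiment map on everything acquired so far,
            # then rank remaining candidates by predicted distance to target.
            a, b = fit_line([c["dft"] for c in seen], [c["exp"] for c in seen])
            unseen.sort(key=lambda c: abs(a * c["dft"] + b - target))
        picks, unseen = unseen[:k], unseen[k:]
        seen.extend(picks)
    return sum(window[0] <= c["exp"] <= window[1] for c in seen)


if __name__ == "__main__":
    candidates = make_candidates()
    print("multi-fidelity agent hits:", run_campaign(candidates, use_dft=True))
    print("random agent hits:", run_campaign(candidates, use_dft=False))
```

The model is refit after every batch, so each round of experiments changes how the remaining candidates are ranked. A real campaign would replace the straight-line fit with a proper regression model and an acquisition rule that accounts for uncertainty.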

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/20da/8933401/4fc854950aae/41598_2022_8413_Fig1_HTML.jpg

Similar articles

Agents for sequential learning using multiple-fidelity data.

Sci Rep. 2022 Mar 18;12(1):4694. doi: 10.1038/s41598-022-08413-8.

Multifidelity Information Fusion with Machine Learning: A Case Study of Dopant Formation Energies in Hafnia.

ACS Appl Mater Interfaces. 2019 Jul 17;11(28):24906-24918. doi: 10.1021/acsami.9b02174. Epub 2019 Apr 16.

Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.

Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.

Learning properties of ordered and disordered materials from multi-fidelity data.

Nat Comput Sci. 2021 Jan;1(1):46-53. doi: 10.1038/s43588-020-00002-x. Epub 2021 Jan 14.

Multi-fidelity machine learning for predicting bandgaps of nonlinear optical crystals.

Phys Chem Chem Phys. 2024 Jun 6;26(22):16378-16387. doi: 10.1039/d4cp00590b.

Exploring the potential of transfer learning for metamodels of heterogeneous material deformation.

J Mech Behav Biomed Mater. 2021 May;117:104276. doi: 10.1016/j.jmbbm.2020.104276. Epub 2020 Dec 31.

Autonomous intelligent agents for accelerated materials discovery.

Chem Sci. 2020 Jul 30;11(32):8517-8532. doi: 10.1039/d0sc01101k.

Materials science optimization benchmark dataset for multi-objective, multi-fidelity optimization of hard-sphere packing simulations.

Data Brief. 2023 Aug 10;50:109487. doi: 10.1016/j.dib.2023.109487. eCollection 2023 Oct.

Machine learning for impurity charge-state transition levels in semiconductors from elemental properties using multi-fidelity datasets.

J Chem Phys. 2022 Mar 21;156(11):114110. doi: 10.1063/5.0083877.

MF-PCBA: Multifidelity High-Throughput Screening Benchmarks for Drug Discovery and Machine Learning.

J Chem Inf Model. 2023 May 8;63(9):2667-2678. doi: 10.1021/acs.jcim.2c01569. Epub 2023 Apr 14.

Cited by

PAH101: A GW+BSE Dataset of 101 Polycyclic Aromatic Hydrocarbon (PAH) Molecular Crystals.

Sci Data. 2025 Apr 23;12(1):679. doi: 10.1038/s41597-025-04959-0.

Machine Learning-Aided Inverse Design and Discovery of Novel Polymeric Materials for Membrane Separation.

Environ Sci Technol. 2025 Jan 21;59(2):993-1012. doi: 10.1021/acs.est.4c08298. Epub 2024 Dec 16.

Emerging Trends in Machine Learning: A Polymer Perspective.

ACS Polym Au. 2023 Jan 18;3(3):239-258. doi: 10.1021/acspolymersau.2c00053. eCollection 2023 Jun 14.
