Schrier Joshua, Norquist Alexander J, Buonassisi Tonio, Brgoch Jakoah
Department of Chemistry, Fordham University, The Bronx, New York 10458, United States.
Department of Chemistry, Haverford College, Haverford, Pennsylvania 19041, United States.
J Am Chem Soc. 2023 Oct 11;145(40):21699-21716. doi: 10.1021/jacs.3c04783. Epub 2023 Sep 27.
Exceptional molecules and materials with one or more extraordinary properties are both technologically valuable and fundamentally interesting, because they often involve new physical phenomena or new compositions that defy expectations. Historically, exceptionality has been achieved through serendipity, but recently, machine learning (ML) and automated experimentation have been widely proposed to accelerate target identification and synthesis planning. In this Perspective, we argue that the data-driven methods commonly used today are well-suited for optimization but not for the realization of new exceptional materials or molecules. Finding such outliers should be possible using ML, but only by shifting away from using traditional ML approaches that tweak the composition, crystal structure, or reaction pathway. We highlight case studies of high-Tc oxide superconductors and superhard materials to demonstrate the challenges of ML-guided discovery and discuss the limitations of automation for this task. We then provide six recommendations for the development of ML methods capable of exceptional materials discovery: (i) Avoid the tyranny of the middle and focus on extrema; (ii) When data are limited, qualitative predictions that provide direction are more valuable than interpolative accuracy; (iii) Sample what can be made and how to make it and defer optimization; (iv) Create room (and look) for the unexpected while pursuing your goal; (v) Try to fill-in-the-blanks of input and output space; (vi) Do not confuse human understanding with model interpretability. We conclude with a description of how these recommendations can be integrated into automated discovery workflows, which should enable the discovery of exceptional molecules and materials.