Gopakumar Abhijith M, Balachandran Prasanna V, Xue Dezhen, Gubernatis James E, Lookman Turab
Los Alamos National Laboratory, Theoretical Division, Los Alamos, 87545, USA.
State Key Laboratory for Mechanical Behavior of Materials, Xian Jiaotong University, Xian, 710049, China.
Sci Rep. 2018 Feb 27;8(1):3738. doi: 10.1038/s41598-018-21936-3.
Guiding experiments to find materials with targeted properties is a crucial aspect of materials discovery and design, and typically several properties, which often compete, are involved. In the case of two properties, new compounds are sought that improve on the existing data points lying on the Pareto front (PF) in as few experiments or calculations as possible. Here we address this problem by applying the concepts and methods of optimal learning and assessing their suitability and performance on three materials data sets: an experimental data set of over 100 shape memory alloys, a data set of 223 MAX phases obtained from density functional theory calculations, and a computational data set of 704 piezoelectric compounds. We show that the Maximin and Centroid design strategies, both based on value-of-information criteria, are more efficient at finding points on the PF from the data than random selection, pure exploitation of the surrogate model prediction, or pure exploration by maximum uncertainty from the learning model. Although the data sets varied in size and source, the Maximin algorithm showed superior performance across all of them, particularly when the accuracy of the machine learning model fits was not high, which suggests that the design is quite forgiving of relatively poor surrogate models.
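To make the two-property setting concrete, the following is a minimal sketch of a Pareto front and a maximin-style improvement score. It is not the paper's implementation: it assumes both properties are to be minimized, scores each candidate only at a surrogate's predicted mean (ignoring the predictive uncertainty that a value-of-information criterion would use), and all function names and data values are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (property 1, property 2), both minimized (assumption)


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in both properties and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: Sequence[Point]) -> List[Point]:
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def maximin_improvement(candidate: Point, front: Sequence[Point]) -> float:
    """Simplified maximin improvement of a predicted point over the current front.

    For each front point, take the smallest coordinate-wise gap to the candidate;
    the largest of these gaps is negative only when the candidate is not dominated
    by any front point. The score is that margin, or 0 if there is no improvement.
    """
    worst = max(min(c - f for c, f in zip(candidate, fp)) for fp in front)
    return max(0.0, -worst)


def select_next(predicted: Sequence[Point], front: Sequence[Point]) -> int:
    """Index of the candidate whose predicted mean has the largest improvement."""
    return max(range(len(predicted)),
               key=lambda i: maximin_improvement(predicted[i], front))


# Hypothetical measured data and surrogate predictions, for illustration only.
measured = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
front = pareto_front(measured)
predicted = [(3.0, 3.0), (1.5, 2.5), (1.0, 1.0)]
best_index = select_next(predicted, front)
```

In this toy data, the measured point (3.0, 3.0) is dominated by (2.0, 2.0) and so is excluded from the front, and the candidate predicted at (1.0, 1.0) receives the largest score and would be chosen for the next experiment or calculation.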