Department of Electrical and Computer Engineering, Northwestern University, Evanston, USA.
Computation Institute, University of Chicago, Chicago, USA.
Sci Rep. 2021 Feb 19;11(1):4244. doi: 10.1038/s41598-021-83193-1.
The application of machine learning (ML) techniques in materials science has attracted significant attention in recent years because of their ability to efficiently extract data-driven linkages between input materials representations and output properties. While traditional ML techniques have become ubiquitous, more advanced deep learning (DL) techniques have seen limited application, primarily because big materials datasets are relatively rare. Given the demonstrated potential of DL and the increasing availability of big materials datasets, it is attractive to use deeper neural networks to boost model performance; in practice, however, simply adding layers degrades performance because of the vanishing gradient problem. In this paper, we address the question of how to enable deeper learning when big materials data is available. We present a general deep learning framework based on Individual Residual learning (IRNet), composed of very deep neural networks that can take any vector-based materials representation as input to build accurate property prediction models. We find that the proposed IRNet models not only alleviate the vanishing gradient problem and enable deeper learning, but also achieve significantly (up to 47%) better model accuracy than plain deep neural networks and traditional ML techniques for a given input materials representation in the presence of big data.
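The vanishing gradient problem and the effect of residual learning mentioned in the abstract can be illustrated with a toy calculation. The sketch below is not the IRNet architecture: it assumes a chain of scalar tanh layers, and it assumes that "individual" residual learning means a shortcut connection around each single layer, which the abstract does not spell out. The function name `input_gradient`, the weight of 0.5, the input of 0.1, and the depth of 48 are all hypothetical choices made for illustration.

```python
import math


def input_gradient(depth, weight=0.5, x=0.1, residual=False):
    """Return d(output)/d(input) for a chain of scalar tanh layers.

    Plain layer:    h_next = tanh(weight * h)
    Residual layer: h_next = h + tanh(weight * h)  (shortcut around each layer)
    """
    h, grad = x, 1.0
    for _ in range(depth):
        a = math.tanh(weight * h)
        # Derivative of tanh(weight * h) with respect to h.
        local = weight * (1.0 - a * a)
        if residual:
            h = h + a
            # The identity shortcut contributes the constant 1 to the factor.
            grad *= 1.0 + local
        else:
            h = a
            grad *= local
    return grad


# Arbitrary depth chosen for illustration only.
plain = input_gradient(depth=48)
skip = input_gradient(depth=48, residual=True)
print(f"plain: {plain:.3e}  residual: {skip:.3e}")
```

In the plain chain every layer multiplies the gradient by a factor of at most 0.5, so the product shrinks geometrically with depth. With a shortcut around each layer, every factor is at least 1, so the gradient reaching the input cannot vanish in this toy setting. Real networks use weight matrices, normalization, and other activations, so this shows only the mechanism, not the reported accuracy gains.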