School of Materials Science and Engineering, Beihang University, Beijing 100191, China.
Center for Integrated Computational Materials Engineering, International Research Institute for Multidisciplinary Science, Beihang University, Beijing 100191, China.
J Phys Chem Lett. 2022 May 12;13(18):3965-3977. doi: 10.1021/acs.jpclett.2c00576. Epub 2022 Apr 28.
Machine learning (ML) is widely regarded as having enabled a paradigm shift in materials research, and in practice it has proven effective at accelerating the cost-efficient discovery of new materials and at automating materials laboratories. This Perspective reviews current research progress in materials data, the backbone of ML, focusing on high-throughput data generation, standardized data storage, and data representation. More importantly, it discusses the challenges in materials data that must be overcome to unlock the full potential of ML in materials research and development: the classic 5V (volume, velocity, variety, veracity, and value) issues, the 3M (multicomponent, multiscale, and multistage) challenges, the co-mining of experimental and computational data, and materials data for transferable/explainable ML or causal ML.