

Functional Output Regression for Machine Learning in Materials Science.

Affiliations

Department of Statistical Science, The Graduate University for Advanced Studies, Tachikawa 190-8562, Japan.

Production Management Headquarters, Process Technology Division, Daicel Corporation, Himeji 671-1283, Japan.

Publication information

J Chem Inf Model. 2022 Oct 24;62(20):4837-4851. doi: 10.1021/acs.jcim.2c00626. Epub 2022 Oct 10.

Abstract

In recent years, the use of machine learning in materials science has grown rapidly. Conventionally, a trained predictive model describes a scalar output variable, such as a thermodynamic, electronic, or mechanical property, as a function of input descriptors that vectorize the compositional or structural features of a given material, such as a molecule, a chemical composition, or a crystalline system. In machine learning on materials data, however, the output variable is often itself a function. For example, when predicting the optical absorption spectrum of a molecule, the output variable is a spectral function defined in the wavelength domain. Similarly, when predicting the microstructure of a polymer nanocomposite, the output variable is an electron microscope image, which can be represented as a two- or three-dimensional function in the image coordinate system. In this study, we consider two unified frameworks for handling such multidimensional or functional output regressions, which are applicable to a wide range of predictive analyses in materials science. The first approach employs generative adversarial networks, which are known to exhibit outstanding performance in various computer vision tasks such as image generation, style transfer, and video generation. We also present a second type of statistical modeling inspired by the methodology known as functional data analysis. This approach extends kernel regression to functional outputs, and its simple mathematical structure makes it effective even with small amounts of data. We demonstrate the proposed methods through several case studies in materials science.

