Sapsis Themistoklis P
Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA.
Proc Math Phys Eng Sci. 2020 Feb;476(2234):20190834. doi: 10.1098/rspa.2019.0834. Epub 2020 Feb 19.
For many important problems, the quantity of interest is an unknown function of the parameters, which form a random vector with known statistics. Since the dependence of the output on this random vector is unknown, the challenge is to identify the output's statistics using the minimum number of function evaluations. This problem can be viewed in the context of active learning or optimal experimental design. We employ Bayesian regression to represent the model uncertainty that arises from a finite and small number of input-output pairs. In this context, we evaluate existing methods for optimal sample selection, such as model error minimization and mutual information maximization. We show that, for the case of known output variance, the criteria commonly employed in the literature do not take into account the output values of the existing input-output pairs, while for the case of unknown output variance this dependence can be very weak. We introduce a criterion that takes into account the values of the output for the existing samples and adaptively selects inputs from regions of the parameter space that have an important contribution to the output. The new method allows for application to high-dimensional inputs, paving the way for optimal experimental design in high dimensions.
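The claim about known output variance can be illustrated with a minimal sketch. It assumes a Bayesian linear regression with a zero-mean Gaussian prior on the weights and a known noise variance; the abstract does not specify the regression model, so this setup, the function names (`posterior`, `predictive_variance`) and all numerical values are illustrative assumptions rather than the paper's method. Under these assumptions the posterior covariance, and therefore any selection criterion built only on predictive variance, depends on the inputs alone.

```python
import numpy as np

def posterior(X, y, noise_var=0.1, prior_var=1.0):
    """Posterior mean and covariance of the weights for Bayesian linear
    regression with prior w ~ N(0, prior_var * I) and known noise variance.
    (Hypothetical helper; the model choice is an assumption.)"""
    d = X.shape[1]
    precision = np.eye(d) / prior_var + X.T @ X / noise_var
    cov = np.linalg.inv(precision)      # depends on X only, never on y
    mean = cov @ X.T @ y / noise_var    # depends on both X and y
    return mean, cov

def predictive_variance(x_new, cov):
    """Variance of the model prediction at x_new (observation noise excluded)."""
    return float(x_new @ cov @ x_new)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))                          # five existing inputs
y_a = X @ np.array([1.0, -2.0]) + 0.1 * rng.normal(size=5)
y_b = 10.0 * y_a + 3.0                               # very different outputs, same inputs

mean_a, cov_a = posterior(X, y_a)
mean_b, cov_b = posterior(X, y_b)

x_candidate = np.array([0.5, 1.5])                   # a candidate next sample
var_a = predictive_variance(x_candidate, cov_a)
var_b = predictive_variance(x_candidate, cov_b)

same_cov = np.allclose(cov_a, cov_b)                 # covariance ignores the outputs
print(same_cov, np.isclose(var_a, var_b))
```

In this sketch the two output sets give different posterior means but identical predictive variance at every candidate input, so a variance-based criterion would rank candidate samples the same way regardless of the observed outputs.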