Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts 02139, United States.
Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge CB3 0WB, U.K.
J Chem Inf Model. 2020 Aug 24;60(8):3770-3780. doi: 10.1021/acs.jcim.0c00502. Epub 2020 Aug 4.
Uncertainty quantification (UQ) is an important component of molecular property prediction, particularly for drug discovery applications where model predictions direct experimental design and where unanticipated imprecision wastes valuable time and resources. The need for UQ is especially acute for neural models, which are becoming increasingly standard yet are challenging to interpret. While several approaches to UQ have been proposed in the literature, there is no clear consensus on the comparative performance of these models. In this paper, we study this question in the context of regression tasks. We systematically evaluate several methods on five regression data sets using multiple complementary performance metrics. Our experiments show that none of the methods we tested is unequivocally superior to all others, and none produces a particularly reliable ranking of errors across multiple data sets. While we believe that these results show that existing UQ methods are not sufficient for all common use cases and further research is needed, we conclude with a practical recommendation as to which existing techniques seem to perform well relative to others.
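One way to make "a reliable ranking of errors" concrete is to ask whether molecules with larger predicted uncertainty also tend to have larger absolute prediction error. The sketch below computes a Spearman rank correlation between the two. It is an illustration only: the abstract does not name the metrics used in the study, and the function names and toy numbers here are hypothetical.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def error_ranking_score(
    y_true: Sequence[float],
    y_pred: Sequence[float],
    uncertainty: Sequence[float],
) -> float:
    """Rank correlation between predicted uncertainty and absolute error.

    Values near 1 mean the uncertainty orders the errors well;
    values near 0 mean it carries little ranking information.
    """
    abs_error = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return spearman(uncertainty, abs_error)


# Toy values chosen for illustration; not data from the paper.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 2.5, 2.0, 4.0]
uncertainty = [0.2, 0.6, 0.9, 0.1]
score = error_ranking_score(y_true, y_pred, uncertainty)
```

A score like this measures only the ordering of errors, not whether the uncertainty magnitudes are calibrated, which is one reason a study such as this one would rely on several complementary metrics.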