Pascal Pernot, Andreas Savin
Institut de Chimie Physique, UMR8000, CNRS, Université Paris-Saclay, 91405 Orsay, France.
Laboratoire de Chimie Théorique, CNRS and UPMC Université Paris 06, Sorbonne Universités, 75252 Paris, France.
J Chem Phys. 2020 Apr 30;152(16):164109. doi: 10.1063/5.0006204.
In Paper I [P. Pernot and A. Savin, J. Chem. Phys. 152, 164108 (2020)], we introduced the systematic improvement probability as a tool to assess the level of improvement in absolute errors to be expected when switching between two computational chemistry methods. We also developed two indicators based on robust statistics to address the uncertainty of ranking in computational chemistry benchmarks: P_inv, the inversion probability between two values of a statistic, and P_r, the ranking probability matrix. In this second part, these indicators are applied to nine data sets extracted from the recent benchmarking literature. We also illustrate how the correlation between the error sets might contain useful information on the quality of the benchmark data set, notably when experimental data are used as reference.
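As a rough illustration of the two pairwise indicators named in the abstract, the sketch below estimates a systematic improvement probability as the fraction of systems whose absolute error decreases when switching methods, and an inversion probability by a paired bootstrap on the mean unsigned error. These are assumptions made for illustration only: the exact estimators are defined in Paper I, and the function names, the choice of statistic, the resampling scheme, and the error values are all hypothetical.

```python
import random


def systematic_improvement_probability(errors_a, errors_b):
    """Fraction of systems where method B has a strictly smaller
    absolute error than method A (illustrative estimator)."""
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("error sets must be non-empty and of equal length")
    improved = sum(abs(b) < abs(a) for a, b in zip(errors_a, errors_b))
    return improved / len(errors_a)


def mean_unsigned_error(errors):
    """Mean of the absolute errors."""
    return sum(abs(e) for e in errors) / len(errors)


def inversion_probability(errors_a, errors_b,
                          statistic=mean_unsigned_error,
                          n_boot=2000, seed=0):
    """Paired-bootstrap estimate of how often the sign of
    statistic(A) - statistic(B) flips relative to the observed sign.

    Assumption: systems are resampled jointly for both methods, so the
    correlation between the two error sets is preserved.
    """
    if len(errors_a) != len(errors_b) or not errors_a:
        raise ValueError("error sets must be non-empty and of equal length")
    rng = random.Random(seed)
    n = len(errors_a)
    observed = statistic(errors_a) - statistic(errors_b)
    inversions = 0
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        diff = (statistic([errors_a[i] for i in idx])
                - statistic([errors_b[i] for i in idx]))
        if diff * observed < 0:
            inversions += 1
    return inversions / n_boot


# Made-up signed errors for two hypothetical methods on four systems.
errors_a = [1.0, -2.0, 0.5, 3.0]
errors_b = [0.5, -1.0, 1.0, -2.0]

sip = systematic_improvement_probability(errors_a, errors_b)
p_inv = inversion_probability(errors_a, errors_b)
print(f"SIP = {sip:.2f}, P_inv = {p_inv:.3f}")
```

Resampling the systems jointly rather than independently is a deliberate choice here: strongly correlated error sets can give a stable ranking even when the two statistics are close, which is why the correlation between error sets matters for the comparison.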