Pitman Samuel J, Evans Alicia K, Ireland Robbie T, Lempriere Felix, McKemmish Laura K
School of Chemistry, University of New South Wales, Sydney, NSW 2052, Australia.
J Phys Chem A. 2023 Dec 7;127(48):10295-10306. doi: 10.1021/acs.jpca.3c05573. Epub 2023 Nov 20.
Basis sets are a crucial but often overlooked choice in setting up quantum chemistry calculations. The choice of basis set can be critical in determining both the accuracy and the computation time of a calculation. Clear recommendations based on thorough benchmarking are essential but are not currently readily available. This study investigates the relative quality of basis sets for general properties by benchmarking basis set performance for a diverse set of 139 reactions (from the diet-150-GMTKN55 data set). In our analysis, we find that the distributions of errors are often significantly non-Gaussian, meaning that the joint consideration of median errors, mean absolute errors, and outlier statistics is helpful in providing a holistic understanding of basis set performance. Our direct comparison of performance between most modern basis sets provides quantitative evidence for basis set recommendations that broadly align with the established understanding of basis set experts and are evident in the design of modern basis sets. For example, while zeta level is a good measure of quality, it is not the only determining factor for an accurate calculation: unpolarized double- and triple-ζ basis sets (like 6-31G and 6-311G) perform very poorly. Appropriate use of polarization functions (e.g., 6-31G*) is essential to obtain the accuracy offered by double- or triple-ζ basis sets. In our study, the best-performing double- and triple-ζ basis sets are 6-31++G** and pcseg-2, respectively. However, the performances of singly polarized double-ζ and doubly polarized triple-ζ basis sets are quite similar, with one key exception: the polarized 6-311G basis set family is poorly parametrized, so its performance is more like that of a double-ζ than a triple-ζ basis set. All versions of the 6-311G basis set family should be avoided entirely for valence chemistry calculations moving forward.
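The abstract's point about non-Gaussian error distributions can be illustrated with a minimal sketch that reports a median error, a mean absolute error, and an outlier fraction side by side. This is not the authors' analysis code: the function name `summarize_errors`, the fixed outlier threshold, and the error values are all assumptions made for illustration, and the numbers are not data from the study.

```python
from statistics import mean, median


def summarize_errors(errors, outlier_threshold):
    """Summarize signed errors with three complementary statistics.

    `outlier_threshold` is a hypothetical cutoff on the absolute error;
    the study may define outliers differently.
    """
    abs_errors = [abs(e) for e in errors]
    n_outliers = sum(1 for a in abs_errors if a > outlier_threshold)
    return {
        "median_error": median(errors),           # robust to a few large errors
        "mean_absolute_error": mean(abs_errors),  # pulled upward by large errors
        "outlier_fraction": n_outliers / len(errors),
    }


# Made-up signed errors for six reactions, in arbitrary energy units.
# These are NOT results from the paper.
illustrative_errors = [0.5, -0.5, 1.0, -1.0, 0.0, 12.0]

summary = summarize_errors(illustrative_errors, outlier_threshold=5.0)
print(summary)
```

In this made-up sample, one large error leaves the median near zero while raising the mean absolute error considerably, which is why a single statistic can misrepresent a skewed or heavy-tailed distribution.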