Computer Center of Shanxi University, Taiyuan 030006, P.R.C.
Neural Comput. 2014 Jan;26(1):208-35. doi: 10.1162/NECO_a_00532. Epub 2013 Oct 8.
In research on machine learning algorithms for classification tasks, comparing the performance of algorithms is extremely important, and the machine learning literature often does so with a statistical significance test on the generalization error. To account for the randomness of partitions in cross-validation, this letter proposes a new blocked 3×2 cross-validation for estimating generalization error. We then analyze the variance of the blocked 3×2 cross-validated estimator. We put forward a relatively conservative variance estimator that accounts for the correlation between any two two-fold cross-validations, a correlation that was neglected in the 5×2 cross-validated t- and F-tests. A corresponding test based on this variance estimator is presented for comparing the performance of algorithms. Simulation results show that the performance of our test is comparable to that of the 5×2 cross-validated tests, but with lower computational complexity.
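The abstract does not spell out how the three partitions are built. The sketch below assumes one balanced construction: the data are shuffled into four equal blocks, and each two-fold partition pairs the blocks in one of the three possible ways. The function names, the `seed` parameter, and the toy nearest-centroid learner are illustrative assumptions rather than part of the letter, and the variance estimator and test statistic are not reproduced here.

```python
import numpy as np


def blocked_3x2_splits(n, seed=0):
    """Return three (half_a, half_b) index pairs built from four random blocks.

    Assumed construction: blocks B0..B3, partitions
    (B0+B1 | B2+B3), (B0+B2 | B1+B3), (B0+B3 | B1+B2).
    """
    rng = np.random.default_rng(seed)
    blocks = np.array_split(rng.permutation(n), 4)  # four near-equal blocks
    pairings = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]
    splits = []
    for left, right in pairings:
        half_a = np.concatenate([blocks[i] for i in left])
        half_b = np.concatenate([blocks[i] for i in right])
        splits.append((half_a, half_b))
    return splits


def nearest_centroid_predict(X_train, y_train, X_test):
    """Toy classifier, a hypothetical stand-in for the algorithm under study."""
    labels = np.unique(y_train)
    centroids = np.stack([X_train[y_train == c].mean(axis=0) for c in labels])
    # Squared Euclidean distance from every test point to every centroid.
    dists = ((X_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return labels[dists.argmin(axis=1)]


def blocked_3x2_cv_error(X, y, fit_predict, seed=0):
    """Average the six hold-out error rates (3 partitions x 2 folds)."""
    errors = []
    for half_a, half_b in blocked_3x2_splits(len(y), seed):
        # Each half serves once as training set and once as test set.
        for train, test in ((half_a, half_b), (half_b, half_a)):
            pred = fit_predict(X[train], y[train], X[test])
            errors.append(float(np.mean(pred != y[test])))
    return float(np.mean(errors)), errors


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    # Two synthetic Gaussian classes (illustrative data only).
    X = np.vstack([rng.normal(0.0, 1.0, (100, 2)), rng.normal(2.0, 1.0, (100, 2))])
    y = np.repeat([0, 1], 100)
    mean_err, fold_errs = blocked_3x2_cv_error(X, y, nearest_centroid_predict)
    print("estimated generalization error:", mean_err)
    print("six fold errors:", fold_errs)
```

Under this assumed construction, training sets drawn from two different partitions always share exactly one block (a quarter of the data), whereas independently drawn random halves overlap by a varying amount. The sketch needs six model fits, against ten for a 5×2 scheme.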