Gontscharuk Veronika, Landwehr Sandra, Finner Helmut
Department of Statistics in Medicine, Faculty of Medicine, Heinrich-Heine-University Düsseldorf, Universitätsstr. 1, 40225 Düsseldorf, Germany; German Diabetes Center, Institute for Biometrics and Epidemiology, Auf'm Hennekamp 65, 40225 Düsseldorf, Germany.
Biom J. 2015 Jan;57(1):159-80. doi: 10.1002/bimj.201300255. Epub 2014 Jun 10.
The higher criticism (HC) statistic, which can be seen as a normalized version of the well-known Kolmogorov-Smirnov statistic, has a long history dating back to the mid-1970s. HC statistics were originally used in connection with goodness-of-fit (GOF) tests, but they have recently gained attention in the context of testing the global null hypothesis in high-dimensional data. The continuing interest in HC seems to be inspired by a series of attractive asymptotic properties of this statistic. For example, unlike Kolmogorov-Smirnov tests, GOF tests based on the HC statistic are known to be asymptotically sensitive in the moderate tails, which makes them well suited to detecting the presence of signals in sparse mixture models. However, some questions about the asymptotic behavior of the HC statistic are still open. We focus on two of them, namely, why a specific intermediate range is crucial for GOF tests based on the HC statistic and why the convergence of the HC distribution to its limiting distribution is extremely slow. Moreover, the discrepancy between the asymptotic and finite-sample behavior of the HC statistic prompts us to propose a new HC test that has better finite-sample properties than the original HC test while showing the same asymptotics. This test is motivated by the asymptotic behavior of the so-called local levels related to the original HC test. By means of numerical calculations and simulations, we show that the new HC test is typically more powerful than the original HC test in normal mixture models.
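To make the object of study concrete, the following is a minimal sketch of one common form of the HC statistic, computed from p-values under the global null hypothesis that they are uniform on (0, 1). It assumes the one-sided, p-value-based form, max over i of sqrt(n) (i/n - p_(i)) / sqrt(p_(i) (1 - p_(i))), which may differ from the exact definition used in the paper. The function name `higher_criticism` and the `alpha0` truncation parameter are illustrative choices, not taken from the source, and the sketch implements neither the new HC test nor the local levels mentioned in the abstract.

```python
import math


def higher_criticism(p_values, alpha0=1.0):
    """One-sided HC statistic (assumed form, see lead-in).

    p_values : iterable of p-values in [0, 1]
    alpha0   : hypothetical truncation parameter; only the smallest
               alpha0 * n order statistics enter the maximum
    """
    p = sorted(p_values)
    n = len(p)
    if n == 0:
        raise ValueError("need at least one p-value")

    k_max = max(1, int(alpha0 * n))
    best = -math.inf
    for i in range(1, k_max + 1):
        p_i = p[i - 1]
        if not 0.0 < p_i < 1.0:
            # The normalization sqrt(p(1-p)) vanishes at 0 and 1; skip those points.
            continue
        # Standardized gap between the empirical CDF value i/n and p_(i).
        z = math.sqrt(n) * (i / n - p_i) / math.sqrt(p_i * (1.0 - p_i))
        best = max(best, z)
    return best
```

A few unusually small p-values inflate the statistic, which is the behavior exploited when detecting sparse signals; for example, `higher_criticism([0.01, 0.5])` is larger than `higher_criticism([0.4, 0.5])`.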