Institute for Medical Informatics, Statistics and Epidemiology, University of Leipzig, Härtelstr. 16-18, D-04107 Leipzig, Germany.
Biostatistics. 2013 Jan;14(1):129-43. doi: 10.1093/biostatistics/kxs030. Epub 2012 Sep 6.
Signal identification in large-dimensional settings is a challenging problem in biostatistics. Recently, the method of higher criticism (HC) was shown to be an effective means for determining appropriate decision thresholds. Here, we study HC from a false discovery rate (FDR) perspective. We show that the HC threshold may be viewed as an approximation to a natural class boundary (CB) in two-class discriminant analysis, which in turn is expressible as an FDR threshold. We demonstrate that in a rare-weak setting, in the region of the phase space where signal identification is possible, the two thresholds are practically indistinguishable, and thus HC thresholding is identical to using a simple local FDR cutoff. The relationship between the HC and CB thresholds and their properties are investigated both analytically and by simulation, and the thresholds are further compared through application to four cancer gene expression data sets.
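To make the notion of an HC decision threshold concrete, the following is a minimal sketch of one common form of the HC statistic computed from p-values. It is an illustration under stated assumptions, not the implementation used in the paper: the function name `hc_threshold`, the search fraction `alpha0 = 0.10`, the standardization by `p(1 - p)`, and the simulated rare-weak data (2000 features, 40 signals with mean 3.5) are all choices made here for the example.

```python
import math
import random


def hc_threshold(pvalues, alpha0=0.10):
    """Return (hc_max, p_threshold, k) for a list of p-values.

    Assumed HC form: HC_i = sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    maximized over the smallest alpha0 * n ordered p-values.
    """
    n = len(pvalues)
    p_sorted = sorted(pvalues)
    k_max = max(1, int(alpha0 * n))  # search only the smallest p-values
    best_hc, best_k = -math.inf, 0
    for i in range(1, k_max + 1):
        p = p_sorted[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # the standardized score is undefined at 0 or 1
        hc = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        if hc > best_hc:
            best_hc, best_k = hc, i
    if best_k == 0:
        raise ValueError("no usable p-values strictly between 0 and 1")
    # Features with p-value <= p_threshold are the ones selected.
    return best_hc, p_sorted[best_k - 1], best_k


# Hypothetical rare-weak simulation: a few weak signals among many nulls.
random.seed(0)
n_features, n_signal, signal_mean = 2000, 40, 3.5
z = [random.gauss(signal_mean, 1.0) for _ in range(n_signal)]
z += [random.gauss(0.0, 1.0) for _ in range(n_features - n_signal)]

# Two-sided p-values under the standard normal null.
pvals = [math.erfc(abs(zi) / math.sqrt(2.0)) for zi in z]

hc_max, p_thr, k = hc_threshold(pvals)
print(f"HC maximum: {hc_max:.3f}")
print(f"p-value threshold: {p_thr:.3g} (selects {k} features)")
```

The returned `p_threshold` is the ordered p-value at which the standardized excess of small p-values over the uniform expectation is largest. Other variants of HC standardize by `i/n * (1 - i/n)` instead, and may give a somewhat different cutoff.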