Wang Lan, Peng Bo, Li Runze
Associate Professor, School of Statistics, University of Minnesota, Minneapolis, MN 55455.
Graduate student, School of Statistics, University of Minnesota, Minneapolis, MN 55455.
J Am Stat Assoc. 2015;110(512):1658-1669. doi: 10.1080/01621459.2014.988215. Epub 2016 Jan 15.
This work is concerned with testing the population mean vector of nonnormal high-dimensional multivariate data. Several tests for a high-dimensional mean vector, based on modifying the classical Hotelling test, have been proposed in the literature. Despite their usefulness, they tend to have unsatisfactory power for heavy-tailed multivariate data, which frequently arise in genomics and quantitative finance. This paper proposes a novel high-dimensional nonparametric test for the population mean vector for a general class of multivariate distributions. With the aid of new tools in modern probability theory, we prove that the limiting null distribution of the proposed test is normal under mild conditions when the dimension p is substantially larger than the sample size n. We further study the local power of the proposed test and compare its relative efficiency with a modified Hotelling test for high-dimensional data. An interesting finding is that the newly proposed test can have an even more substantial power gain with large p than the traditional nonparametric multivariate test does with finite fixed p. We study the finite sample performance of the proposed test via Monte Carlo simulations. We further illustrate its application by an empirical analysis of a genomics data set.
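The abstract does not state the form of the test statistic, so the following is only a hedged sketch of how a sign-based nonparametric test for H0: mean vector = 0 could be examined by Monte Carlo when p is much larger than n. It assumes a statistic built from spatial signs Z_i = X_i / ||X_i||, namely the sum of Z_i'Z_j over pairs i < j, standardized by the square root of the sum of squared pairwise inner products and compared with the upper tail of a standard normal. The function names `spatial_sign_test` and `heavy_tailed_sample`, the multivariate t generator, and all parameter values are illustrative assumptions, not taken from the paper.

```python
import math
import random


def spatial_sign_test(X):
    """One-sample test of H0: mean vector = 0 from spatial signs.

    X is a list of n observations, each a list of p floats.
    Returns (z, p_value) with a one-sided upper-tail normal p-value.
    Assumed illustrative statistic, not necessarily the paper's exact form.
    """
    n = len(X)
    # Spatial signs: project every observation onto the unit sphere,
    # so an observation's length (where heavy tails show up) is discarded.
    Z = []
    for x in X:
        norm = math.sqrt(sum(v * v for v in x))
        Z.append([v / norm for v in x])

    t_stat = 0.0  # sum over pairs i < j of Z_i' Z_j
    sq_sum = 0.0  # sum over pairs i < j of (Z_i' Z_j)^2, estimates Var(t_stat) under H0
    for i in range(n):
        for j in range(i):
            d = sum(a * b for a, b in zip(Z[i], Z[j]))
            t_stat += d
            sq_sum += d * d

    z = t_stat / math.sqrt(sq_sum)
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))  # P(N(0,1) > z)
    return z, p_value


def heavy_tailed_sample(n, p, shift, df=3, seed=0):
    """Draw n observations from a p-variate t distribution (identity scatter),
    with every coordinate of the location vector equal to `shift`."""
    rng = random.Random(seed)
    X = []
    for _ in range(n):
        chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
        scale = math.sqrt(df / chi2)  # one shared scale per observation gives t tails
        X.append([shift + scale * rng.gauss(0.0, 1.0) for _ in range(p)])
    return X


if __name__ == "__main__":
    # Small Monte Carlo size check under H0 with n = 20, p = 100.
    reps, alpha = 200, 0.05
    rejections = sum(
        spatial_sign_test(heavy_tailed_sample(20, 100, 0.0, seed=r))[1] < alpha
        for r in range(reps)
    )
    print("empirical size:", rejections / reps)
```

Because each observation is rescaled to unit length before any inner product is taken, multiplying a single observation by a large constant leaves the statistic unchanged, which is one intuition for why sign-based procedures can retain power under heavy tails.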