Jammalamadaka S Rao, Guerrier Stéphane, Mangalam Vasudevan
Department of Statistics & Applied Probability, University of California, Santa Barbara, USA.
Geneva School of Economics and Management, Faculty of Science, University of Geneva, Geneva, Switzerland.
Sankhya B (2008). 2021;83(Suppl 1):140-166. doi: 10.1007/s13571-020-00244-9. Epub 2021 Feb 13.
A nonparametric test, labelled the 'Rao spacing-frequencies test', is explored and developed for testing whether two circular samples come from the same population. Its exact distribution, and its small-sample performance relative to comparable tests such as the Wheeler-Watson test and the Dixon test, are discussed. Although the test statistic is shown, as one would expect, to be asymptotically normal, this large-sample distribution does not provide satisfactory approximations for small to moderate samples. Exact critical values for small samples are obtained using combinatorial techniques and are tabulated here, and the asymptotic critical regions are assessed against them. For moderate sample sizes in between, i.e. when the samples are large enough to make combinatorial techniques computationally prohibitive yet the asymptotic regions still do not provide a good approximation, we provide a simple Monte Carlo procedure that gives very accurate critical values. As is well known, the many usual rank-based tests are not applicable to circular data, since the values of such ranks depend on the arbitrary choice of origin and on the sense of rotation used (clockwise or anti-clockwise). Tests that are invariant under the group of rotations depend on the data only through the so-called 'spacing frequencies', that is, the numbers of observations from one sample that fall in the spacings (or gaps) formed by the other. The Wheeler-Watson, Dixon, and the proposed Rao tests are of this form. They are explicitly useful for circular data, and they have the added advantage of being valid and useful for comparing any two samples on the real line. Our study and simulations establish the 'Rao spacing-frequencies test' as a desirable, and indeed preferable, test in a wide variety of contexts for comparing two circular samples, and as a viable competitor even for data on the real line.
Computational help for implementing any of these tests is made available online through the "TwoCircles" R package, which is part of this paper.
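To make the notions of spacing frequencies and Monte Carlo critical values concrete, the following is a minimal Python sketch and not the "TwoCircles" implementation. The function names (`spacing_frequencies`, `sum_of_squares`, `mc_critical_value`) are hypothetical. The statistic shown is a Dixon-type sum of squared spacing frequencies, used only as a stand-in, since the exact form of the Rao statistic is not given in this abstract. The simulation assumes that, under the null hypothesis of a common continuous distribution, the statistic's distribution can be obtained by drawing both samples uniformly on the circle.

```python
import bisect
import math
import random

TWO_PI = 2.0 * math.pi


def spacing_frequencies(x, y):
    """Count how many y angles fall in each circular gap between
    consecutive sorted x angles (angles in radians)."""
    xs = sorted(a % TWO_PI for a in x)
    m = len(xs)
    counts = [0] * m
    for b in y:
        idx = bisect.bisect_right(xs, b % TWO_PI)
        # idx == 0 means b lies before the smallest x, i.e. in the
        # wrap-around gap that starts at the largest x.
        counts[(idx - 1) % m] += 1
    return counts


def sum_of_squares(counts):
    """Stand-in statistic (Dixon-type); large values suggest the
    y sample is clustered relative to the x sample."""
    return sum(c * c for c in counts)


def mc_critical_value(m, n, alpha=0.05, reps=2000, seed=0,
                      statistic=sum_of_squares):
    """Empirical (1 - alpha) quantile of the statistic, simulated
    with both samples drawn uniformly on the circle."""
    rng = random.Random(seed)
    sims = []
    for _ in range(reps):
        x = [rng.uniform(0.0, TWO_PI) for _ in range(m)]
        y = [rng.uniform(0.0, TWO_PI) for _ in range(n)]
        sims.append(statistic(spacing_frequencies(x, y)))
    sims.sort()
    k = min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1)
    return sims[k]


x = [0.1, 1.0, 3.0]
y = [0.5, 0.6, 2.0, 5.0]
print(spacing_frequencies(x, y))  # [2, 1, 1]
print(mc_critical_value(len(x), len(y), reps=500, seed=1))
```

Rotating both samples by the same angle only shifts the vector of spacing frequencies cyclically, and reversing the sense of rotation only reverses it, so any symmetric function of these counts is unaffected by the choice of origin or direction. Because the statistic is discrete, the empirical quantile returned here is only an approximate cut-off for a level-alpha test.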