Fong Youyi, Huang Ying, Lemos Maria P, McElrath M Juliana
Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, 1100 Fairview Ave N., Seattle, WA 98109, USA.
Biostatistics. 2018 Jul 1;19(3):281-294. doi: 10.1093/biostatistics/kxx039.
The two-sample location problem is one of the most frequently encountered problems in statistical practice. The two most commonly studied subtypes involve observations from two populations that are either independent or completely paired, but a third subtype often occurs in practice, in which some observations are paired and some are not. Partially paired two-sample problems, also known as paired two-sample problems with missing data, often arise in biomedical fields when invasive procedures make it difficult to collect data from an individual under both of the conditions being compared. Existing rank-based two-sample comparison procedures for partially paired data, however, do not make efficient use of all available data. To improve the power of testing procedures for this problem, we propose several new rank-based test statistics and study their asymptotic distributions and, when necessary, their exact variances. Through extensive numerical studies, we show that the best overall power comes from the proposed tests based on weighted linear combinations of the test statistics comparing paired data and the test statistics comparing independent data, with weights inversely proportional to their variances. We illustrate the proposed methods with a real data example from HIV prevention research.
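The idea of an inverse-variance weighted combination can be illustrated with a minimal Python sketch. It is not the exact set of statistics proposed in the paper: as a stand-in, it assumes a Wilcoxon signed-rank statistic for the paired observations and a Mann-Whitney statistic for the unpaired ones, each rescaled to the probability scale, centered at its null mean, and standardized with its no-ties null variance. The function names `midranks` and `combined_rank_test` and the example data are hypothetical.

```python
import math


def midranks(values):
    """Return ranks 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def combined_rank_test(pairs, x_only, y_only):
    """Sketch of an inverse-variance weighted combination of two rank tests.

    pairs  : list of (x, y) tuples from subjects observed under both conditions
    x_only : values from subjects observed under condition X only
    y_only : values from subjects observed under condition Y only
    Returns (z, two-sided p-value) from a normal approximation.
    Assumption: null variances are the classical no-ties formulas.
    """
    # Paired part: Wilcoxon signed-rank statistic, zero differences dropped.
    diffs = [y - x for x, y in pairs if y != x]
    n = len(diffs)
    m1, m2 = len(x_only), len(y_only)
    if n == 0 or m1 == 0 or m2 == 0:
        raise ValueError("this sketch needs paired and unpaired data")
    ranks = midranks([abs(d) for d in diffs])
    t_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    a = t_plus / (n * (n + 1) / 2) - 0.5        # centered at its null mean
    var_a = (2 * n + 1) / (6 * n * (n + 1))     # null variance of a

    # Unpaired part: Mann-Whitney estimate of P(X < Y), ties counted as 1/2.
    u = sum(1.0 if x < y else 0.5 if x == y else 0.0
            for x in x_only for y in y_only)
    b = u / (m1 * m2) - 0.5                     # centered at its null mean
    var_b = (m1 + m2 + 1) / (12 * m1 * m2)      # null variance of b

    # Weights inversely proportional to the variances, normalized to sum to 1.
    w_a = (1 / var_a) / (1 / var_a + 1 / var_b)
    w_b = 1 - w_a
    stat = w_a * a + w_b * b
    # The two parts come from different subjects, so they are independent.
    var_stat = w_a ** 2 * var_a + w_b ** 2 * var_b
    z = stat / math.sqrt(var_stat)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


# Made-up example data, for illustration only.
example_pairs = [(1.0, 2.1), (2.0, 2.9), (3.0, 4.5), (4.0, 4.2), (5.0, 6.3)]
example_x_only = [1.5, 2.5, 3.5]
example_y_only = [4.1, 5.2, 6.0, 7.3]
z, p_value = combined_rank_test(example_pairs, example_x_only, example_y_only)
print(z, p_value)
```

A positive `z` indicates that values under condition Y tend to be larger than under condition X. With small samples, the normal approximation used here may be rough, which is one setting where exact variances or exact null distributions would matter.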