Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Bethesda, Maryland, USA.
Clinical Monitoring Research Program Directorate, Frederick National Laboratory for Cancer Research, Frederick, Maryland, USA.
Stat Med. 2021 Feb 28;40(5):1147-1159. doi: 10.1002/sim.8829. Epub 2020 Dec 1.
For testing with paired data (eg, twins randomized between two treatments), a simple test is the sign test, where we test whether the sign of the within-pair difference in responses between the two treatments is more often positive (favoring one treatment) or negative (favoring the other). When the responses are binary, this reduces to a McNemar-type test, and the calculations are the same. Although it is easy to calculate an exact P-value by conditioning on the total number of discordant pairs, the accompanying confidence interval on a parameter of interest (proportion positive minus proportion negative) is not straightforward. Effect estimates and confidence intervals are important for interpretation because it is possible that the treatment helps a very small proportion of the population yet gives a highly significant effect. We construct a confidence interval that is compatible with an exact sign test, meaning the 100(1 − α)% interval excludes the null hypothesis of equality of proportions if and only if the associated exact sign test rejects at level α. We conjecture that the proposed confidence intervals guarantee nominal coverage, and we support that conjecture with extensive numerical calculations, but we have no mathematical proof to show guaranteed coverage. We have written and made available the function mcnemarExactDP in the exact2x2 R package and the function signTest in the asht R package to perform the methods described in this article.
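The exact P-value mentioned above, obtained by conditioning on the number of discordant pairs, can be sketched in a few lines of Python. This is an illustration only, not the authors' implementation: the function name `exact_sign_test` and its arguments are hypothetical, and it returns only the P-value and the point estimate (proportion positive minus proportion negative). It does not compute the compatible confidence interval proposed in the article; for that, see mcnemarExactDP and signTest in the R packages named above.

```python
from math import comb


def exact_sign_test(n_pos, n_neg, n_total=None):
    """Two-sided exact sign test, conditioning on the discordant pairs.

    n_pos   -- number of pairs with a positive difference
    n_neg   -- number of pairs with a negative difference
    n_total -- total number of pairs, including ties (hypothetical
               argument; defaults to n_pos + n_neg)

    Returns (p_value, estimate), where estimate is the proportion
    positive minus the proportion negative.
    """
    m = n_pos + n_neg  # discordant pairs
    if n_total is None:
        n_total = m
    if m == 0:
        # No discordant pairs: no evidence in either direction.
        return 1.0, 0.0

    # Under the null, n_pos ~ Binomial(m, 1/2) given m.
    k = min(n_pos, n_neg)
    tail = sum(comb(m, i) for i in range(k + 1)) / 2 ** m

    # Binomial(m, 1/2) is symmetric, so doubling the smaller tail
    # (capped at 1) gives the two-sided exact P-value.
    p_value = min(1.0, 2 * tail)
    estimate = (n_pos - n_neg) / n_total
    return p_value, estimate


if __name__ == "__main__":
    # 1000 pairs, of which 30 favor one treatment and 5 the other.
    p, est = exact_sign_test(30, 5, n_total=1000)
    print(f"P-value = {p:.2e}, estimate = {est:.3f}")
```

The example in the `__main__` block illustrates the abstract's point about interpretation: with 1000 pairs of which only 35 are discordant, the test rejects strongly even though the estimated difference in proportions is only 0.025, which is why an interval on that difference is needed alongside the P-value.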