Agresti A
Department of Statistics, University of Florida, Gainesville, Florida 32611-8545, USA.
Stat Methods Med Res. 2003 Jan;12(1):3-21. doi: 10.1191/0962280203sm311ra.
'Exact' methods for categorical data are exact in terms of using probability distributions that do not depend on unknown parameters. However, they are conservative inferentially. The actual error probabilities for tests and confidence intervals are bounded above by the nominal level. This article examines the conservatism for interval estimation and describes ways of reducing it. We illustrate for confidence intervals for several basic parameters, including the binomial parameter, the difference between two binomial parameters for independent samples, and the odds ratio and relative risk. Less conservative behavior results from devices such as (1) inverting tests using statistics that are 'less discrete', (2) inverting a single two-sided test rather than two separate one-sided tests each having size at most half the nominal level, (3) using unconditional rather than conditional methods (where appropriate) and (4) inverting tests using alternative p-values. The article concludes with recommendations for selecting an interval in three situations: when one needs to guarantee a lower bound on a coverage probability, when it is sufficient to have actual coverage probability near the nominal level, and when teaching in a classroom or consulting environment.
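The conservatism described in the abstract can be seen numerically for the binomial parameter. The sketch below is an illustration only, not code from the article: it assumes the Clopper-Pearson interval (the usual 'exact' interval obtained by inverting two one-sided tests) as the method under study, and the function names `clopper_pearson` and `coverage` are hypothetical. It computes the actual coverage probability at a given p by summing the binomial probabilities of all outcomes whose interval contains p.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); zero when k < 0."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def _solve(f, target, increasing, tol=1e-12):
    """Bisection on [0, 1] for a monotone function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Interval from inverting two one-sided tests, each of size at most alpha/2."""
    # Lower endpoint: P(X >= x; p) = alpha/2, which increases in p.
    lower = 0.0 if x == 0 else _solve(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2, increasing=True)
    # Upper endpoint: P(X <= x; p) = alpha/2, which decreases in p.
    upper = 1.0 if x == n else _solve(
        lambda p: binom_cdf(x, n, p), alpha / 2, increasing=False)
    return lower, upper


def coverage(n, p, alpha=0.05):
    """Actual coverage probability of the interval at a fixed true p."""
    intervals = [clopper_pearson(x, n, alpha) for x in range(n + 1)]
    return sum(binom_pmf(x, n, p)
               for x, (lo, hi) in enumerate(intervals) if lo <= p <= hi)


if __name__ == "__main__":
    n = 10
    for p in (0.1, 0.3, 0.5):
        print(f"n={n}, p={p}: coverage = {coverage(n, p):.4f}")
```

Under these assumptions, the coverage at each p stays at or above the nominal 95% level and is often well above it for small n, which is the sense in which the interval is conservative.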