Choi J W, McHugh R B
National Center for Health Statistics, Hyattsville, Maryland 20782.
Biometrics. 1989 Sep;45(3):979-96.
Large-scale household surveys often draw a complex probability sample of clusters, rather than of individuals, from a large population. The clusters in such samples typically contain a number of correlated members, whose responses are weighted to obtain population estimates. Such weighted data are commonly published by the National Center for Health Statistics and other U.S. federal agencies. Problems frequently arise when the usual chi-square statistics for goodness of fit or independence are applied to these data: researchers have found that the usual chi-square tests give spuriously inflated results for cluster samples, so new methods are needed to correct them. This paper proposes a strategy for goodness-of-fit and independence tests based on the correlated, weighted data that arise in cluster samples, and provides a factor that validly reduces the inflation of the usual chi-square statistics. The method is applied to chronic condition data collected from the St. Paul-Minneapolis, Minnesota, primary sampling unit (PSU) during the 1975 National Health Interview Survey (NHIS). This analysis, together with simulation studies presented elsewhere, provides evidence that the usual chi-square statistics from such data can be corrected for the effects of clustering and weighting by use of the proposed reduction factor.
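The abstract does not state the form of the proposed reduction factor, so the following is only a minimal sketch of the general idea of deflating a usual chi-square statistic. It assumes a Kish-type design effect, 1 + (m - 1) * rho, as a stand-in for the reduction factor; this is not the authors' formula. The function names, the category counts, the average cluster size `m`, and the intracluster correlation `rho` are all hypothetical and chosen for illustration.

```python
def pearson_chi_square(observed, expected_props):
    """Usual Pearson goodness-of-fit statistic for observed category counts."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p)
               for o, p in zip(observed, expected_props))


def design_effect(avg_cluster_size, intracluster_corr):
    """Assumed inflation factor 1 + (m - 1) * rho (a Kish-type design effect).

    This is a stand-in, not the reduction factor derived in the paper.
    """
    return 1.0 + (avg_cluster_size - 1.0) * intracluster_corr


def corrected_chi_square(observed, expected_props,
                         avg_cluster_size, intracluster_corr):
    """Divide the usual statistic by the assumed design effect."""
    x2 = pearson_chi_square(observed, expected_props)
    return x2 / design_effect(avg_cluster_size, intracluster_corr)


# Hypothetical example: 100 respondents in 3 categories, sampled in
# households of average size 3 with intracluster correlation 0.25.
observed = [60, 25, 15]
expected = [0.5, 0.3, 0.2]

x2 = pearson_chi_square(observed, expected)
x2_adj = corrected_chi_square(observed, expected,
                              avg_cluster_size=3, intracluster_corr=0.25)
print(f"usual chi-square:     {x2:.3f}")
print(f"corrected chi-square: {x2_adj:.3f}")
```

Under these assumptions the corrected statistic is always smaller than the usual one whenever cluster members are positively correlated, and the two coincide when every cluster has a single member or the correlation is zero.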