Graubard B I, Korn E L
Biostatistics Branch, National Cancer Institute, Bethesda, MD 20892, USA.
J Natl Cancer Inst. 1999 Jun 16;91(12):1005-16. doi: 10.1093/jnci/91.12.1005.
Large-scale health surveys conducted by government agencies record information on a large number of health-related variables. We review the use of these data for performing analyses that address cancer-related objectives. After describing the conduct of a large-scale health survey (the third National Health and Nutrition Examination Survey [NHANES III]), we discuss some of the issues involved in analyzing data collected in such a survey. In particular, the use of sample weights in the analysis and the importance of accounting for the complex survey design when estimating standard errors are discussed. Six applications are then presented that involve the following: 1) estimating demographic factors associated with snuff use, 2) estimating the association of type of health insurance with the probability of receiving a digital rectal examination, 3) estimating the association of body iron stores with the probability of later developing cancer, 4) estimating the changing rates of mammography screening in the United States between 1987 and 1992, 5) evaluating smoking and alcohol consumption as risk factors for digestive cancer by use of a population-based, case-control study, and 6) evaluating a randomized community-intervention experiment to encourage smoking cessation. These applications use data from the National Health Interview Survey, the NHANES I Epidemiologic Followup Study, the 1986 National Mortality Followback Survey, and the Community Intervention Trial for Smoking Cessation. The availability of public-use data files is discussed for surveys sponsored by the U.S. government that collect health-related information. We demonstrate that statistical methods and computer software are available for analyzing public-use data files of surveys to address different types of cancer-related objectives.