Universitätsklinikum Hamburg-Eppendorf, Institut für Medizinische Biometrie und Epidemiologie, Hamburg, Germany.
Charité - Universitätsmedizin Berlin, Institut für Biometrie und klinische Epidemiologie, Berlin, Germany.
Biom J. 2021 Mar;63(3):514-527. doi: 10.1002/bimj.201900190. Epub 2020 Nov 5.
National mortality statistics commonly provide disease-specific absolute and relative frequencies of death by sex and age, but not by exposure status. However, it is often of interest to know how many of the diseased individuals, that is, the cases, were exposed or not exposed to a specific risk factor. We present two methods to estimate the proportion and the number of exposed and nonexposed cases, both of which require an estimate of the exposure prevalence in the nondiseased population. Method I additionally requires an estimate of the relative effect of exposure: a relative risk function if the exposure has a continuous distribution, or a relative risk estimate for each category if the exposure is categorical. Method II additionally requires an estimate of the disease rate among the nonexposed. We give theoretical justifications, discuss practical limitations, provide an R script to calculate the probability of nonexposure among the diseased, and compare the two approaches. Both methods are then applied to estimate the number of never smokers among lung cancer deaths. The two methods rely on the availability of different data sources and might therefore be applicable in different research settings. Both methods yield unbiased estimates of the number of nonexposed cases, provided that their respective underlying assumptions are fulfilled.
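The following Python sketch illustrates the arithmetic that the two methods suggest for a categorical exposure. It is not the authors' R script. The function names, the rare-disease simplification (exposure prevalence among the nondiseased is treated as the population prevalence), and all numbers in the usage note are assumptions made purely for illustration.

```python
from typing import Sequence


def nonexposed_fraction_method1(prevalence: Sequence[float],
                                relative_risk: Sequence[float]) -> float:
    """Sketch of Method I: fraction of cases in the nonexposed category.

    prevalence[k]    : share of the population in exposure category k
    relative_risk[k] : relative risk of category k versus the nonexposed
    Index 0 is the nonexposed (reference) category, so relative_risk[0] == 1.
    Assumption: the disease is rare, so prevalence among the nondiseased
    is used as if it were the prevalence in the whole population.
    """
    if len(prevalence) != len(relative_risk):
        raise ValueError("prevalence and relative_risk must have equal length")
    if abs(sum(prevalence) - 1.0) > 1e-9:
        raise ValueError("prevalences must sum to 1")
    if relative_risk[0] != 1.0:
        raise ValueError("the reference category must have relative risk 1")
    # Each category contributes cases in proportion to prevalence * relative risk.
    weights = [p * rr for p, rr in zip(prevalence, relative_risk)]
    return weights[0] / sum(weights)


def nonexposed_cases_method2(rate_per_100k_nonexposed: float,
                             population: float,
                             prevalence_exposed: float) -> float:
    """Sketch of Method II: expected number of nonexposed cases.

    Multiplies the disease rate among the nonexposed by the number of
    nonexposed persons in the population.
    """
    nonexposed_persons = population * (1.0 - prevalence_exposed)
    return rate_per_100k_nonexposed * nonexposed_persons / 100_000
```

With hypothetical inputs, `nonexposed_fraction_method1([0.75, 0.25], [1.0, 9.0])` gives the share of cases that are nonexposed when a quarter of the population is exposed at a relative risk of 9, and multiplying that share by the total number of deaths gives the corresponding count. `nonexposed_cases_method2(10, 1_000_000, 0.25)` gives the count directly from a rate of 10 per 100,000 among the nonexposed.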