Nilsson Anton, Bonander Carl, Strömberg Ulf, Björk Jonas
Epidemiology, Population studies and Infrastructures (EPI@LUND), Tornblad Building, Lund University, Biskopsgatan 9, Hämtställe 21, SE-22362, Lund, Sweden.
Centre for Economic Demography, Lund University, Lund, Sweden.
BMC Public Health. 2020 Dec 17;20(1):1918. doi: 10.1186/s12889-020-10004-z.
In any study with voluntary participation, self-selection can lead to invalid conclusions. If the determinants of selection are observed, however, the parameters of interest can be recovered by reweighting the sample to match the population. This approach has seldom been applied in epidemiological research.
We reweighted the Malmö Diet and Cancer (MDC) study using population register data on background variables, including sociodemographics and hospital admissions, that were available for both participants and the background population. Following individuals from baseline in 1991-1996 until 2016 at the latest, we studied mortality (all-cause, cancer, and cardiovascular disease [CVD]), incidence (cancer and CVD), and associations between these outcomes and background variables. Results from the unweighted and reweighted participant samples were compared with those from the background population.
Mortality was substantially lower in participants than in the background population, and reweighting the sample did little to bring the estimates closer to those for the background population. For incidence and associations, estimates were generally similar between participants and the background population even without reweighting, making reweighting unnecessary.
Reweighting samples based on an extensive range of sociodemographic characteristics and previous hospitalizations does not necessarily yield results that are valid for the population as a whole. In the case of MDC, there appear to be important factors related to both mortality and selection into the study that are not observable in registry data, making it difficult to obtain accurate numbers on population mortality based on cohort participants. These issues seem less relevant for incidences and associations, however. Overall, our results suggest that representativeness must be judged on a case-by-case basis.
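To make the idea of reweighting concrete, the following is a minimal sketch of post-stratification, one simple form of reweighting a sample to match a population. It is not the procedure used in the study, which the abstract does not specify in detail. The stratum labels (`"low"`, `"high"`), the toy data, and the function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def poststratification_weights(population_strata, sample_strata):
    """Return one weight per sample member so that the weighted sample
    has the same distribution over strata as the population.

    Hypothetical helper: each weight is the stratum's population share
    divided by its sample share.
    """
    pop_counts = Counter(population_strata)
    samp_counts = Counter(sample_strata)
    n_pop = len(population_strata)
    n_samp = len(sample_strata)
    weights = []
    for stratum in sample_strata:
        pop_share = pop_counts[stratum] / n_pop
        samp_share = samp_counts[stratum] / n_samp
        weights.append(pop_share / samp_share)
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data (made up): the population is split evenly between two strata,
# but the "high" stratum is over-represented among participants.
population_strata = ["low"] * 50 + ["high"] * 50
sample_strata = ["low"] * 2 + ["high"] * 8

# 1 = died during follow-up, 0 = survived (made-up outcomes).
sample_died = [1, 0] + [1, 1, 0, 0, 0, 0, 0, 0]

weights = poststratification_weights(population_strata, sample_strata)
unweighted_mortality = sum(sample_died) / len(sample_died)
reweighted_mortality = weighted_mean(sample_died, weights)

print(unweighted_mortality, reweighted_mortality)
```

In this toy example, reweighting moves the mortality estimate toward the under-represented stratum. It can only correct for selection on the stratifying variable, though. If participants also differ from non-participants within each stratum on factors that are not observed, the reweighted estimate remains biased, which is the situation the conclusion above describes for mortality in MDC.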