RTI Health Solutions, Research Triangle Institute, Research Triangle Park, NC 27709, USA.
Ann Epidemiol. 2013 Feb;23(2):43-8. doi: 10.1016/j.annepidem.2012.11.005. Epub 2012 Dec 6.
Beryllium's classification as a carcinogen is based on limited human data that show inconsistent associations with lung cancer. Therefore, a thorough examination of those data is warranted. We reanalyzed data from the largest study of occupational beryllium exposure, conducted by the National Institute for Occupational Safety and Health (NIOSH).
The original analysis used stratification and standardization. We reviewed the strata in that analysis and reanalyzed the data using fewer strata. We also fit a Poisson regression model and analyzed simulated datasets in which lung cancer cases were generated randomly, without regard to exposure.
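As a rough illustration of what stratification and standardization involve, the sketch below computes a standardized rate ratio from stratum-specific rates weighted by a common set of standard weights. The `Stratum` class, the function name, and all numbers are hypothetical and are not taken from the NIOSH data. The choice to raise an error when a stratum has no person-time in one category is an assumption made here for clarity, not a description of how the original analysis handled such strata.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stratum:
    """One stratum of a cohort (hypothetical structure, for illustration only)."""
    weight: float          # standard-population weight for this stratum
    cases_exposed: int     # cases in the higher exposure category
    pt_exposed: float      # person-years in the higher exposure category
    cases_reference: int   # cases in the reference exposure category
    pt_reference: float    # person-years in the reference exposure category


def standardized_rate_ratio(strata: List[Stratum]) -> float:
    """Ratio of two directly standardized rates that share the same weights."""
    weighted_exposed = 0.0
    weighted_reference = 0.0
    for s in strata:
        # Assumption: a stratum with no person-time in either category has an
        # undefined rate, so we refuse to guess how it should be handled.
        if s.pt_exposed <= 0 or s.pt_reference <= 0:
            raise ValueError("stratum has no person-time in one exposure category")
        weighted_exposed += s.weight * s.cases_exposed / s.pt_exposed
        weighted_reference += s.weight * s.cases_reference / s.pt_reference
    if weighted_reference == 0:
        raise ValueError("reference rate is zero; the ratio is undefined")
    # The sum of weights is common to both rates and cancels in the ratio.
    return weighted_exposed / weighted_reference


# Made-up example with two strata.
example = [
    Stratum(weight=1000, cases_exposed=4, pt_exposed=2000.0,
            cases_reference=2, pt_reference=2000.0),
    Stratum(weight=3000, cases_exposed=9, pt_exposed=1000.0,
            cases_reference=6, pt_reference=2000.0),
]
srr = standardized_rate_ratio(example)
print(f"standardized rate ratio: {srr:.2f}")
```

With many strata and few cases, most stratum-specific rates in a calculation like this rest on zero or one case, which is the situation the reanalysis examines.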
The strongest association reported in the NIOSH study, a standardized rate ratio for death from lung cancer of 3.68 for the highest versus lowest category of time since first employment, is affected by sparse-data bias, which stems from stratifying 545 lung cancer cases and their associated person-time into 1792 categories. For time since first employment, the measure of beryllium exposure with the strongest reported association with lung cancer, every stratum contained a zero in at least one of the two exposure categories being contrasted. Reanalysis using fewer strata or regression models gave substantially smaller effect estimates. Simulations confirmed that the original stratified analysis was upwardly biased. Other metrics used in the NIOSH study showed weaker associations and were less affected by sparse-data bias.
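The degree of sparsity can be gauged with simple arithmetic on the two figures quoted above. The sketch below assumes, purely for illustration, that cases are spread evenly over the strata and follow a Poisson distribution. The real data are not evenly spread, so this is only an order-of-magnitude check.

```python
import math

N_CASES = 545     # lung cancer cases, from the abstract
N_STRATA = 1792   # stratification categories, from the abstract

# Assumption: cases are spread evenly across strata (a simplification).
mean_cases_per_stratum = N_CASES / N_STRATA

# Under a Poisson model with that mean, the chance a stratum holds no cases.
p_zero_cases = math.exp(-mean_cases_per_stratum)

print(f"mean cases per stratum: {mean_cases_per_stratum:.2f}")
print(f"P(no cases in a stratum): {p_zero_cases:.2f}")
```

Under these assumptions there is on average about 0.3 of a case per stratum, so most strata would be expected to contain no cases at all, even before each stratum is divided further by exposure category.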
The strongest association reported in the NIOSH study seems to be biased as a result of non-overlap of data across the numerous strata. Simulation results indicate that most of the effect reported in the NIOSH paper for time since first employment is attributable to sparse-data bias.
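One way to generate lung cancer cases "randomly without regard to exposure", as in the simulations described above, is to allocate a fixed number of cases to cells with probability proportional to person-time alone. The sketch below does this with hypothetical cells and person-time values. The function name, the cell labels, and the choice to hold the total number of cases fixed are assumptions, not details taken from the paper.

```python
import random
from typing import Dict, Tuple

Cell = Tuple[str, str]  # (stratum label, exposure category)


def simulate_null_cases(person_time: Dict[Cell, float],
                        n_cases: int,
                        rng: random.Random) -> Dict[Cell, int]:
    """Allocate n_cases to cells in proportion to person-time only.

    Exposure plays no role in the allocation, so any association found in
    the simulated data reflects chance or the analysis method.
    """
    cells = list(person_time)
    weights = [person_time[c] for c in cells]
    counts = {c: 0 for c in cells}
    for cell in rng.choices(cells, weights=weights, k=n_cases):
        counts[cell] += 1
    return counts


# Hypothetical person-years per (stratum, exposure category) cell.
person_time = {
    ("s1", "low"): 4000.0, ("s1", "high"): 1000.0,
    ("s2", "low"): 500.0,  ("s2", "high"): 2500.0,
}

rng = random.Random(0)  # fixed seed so the run is reproducible
simulated = simulate_null_cases(person_time, n_cases=50, rng=rng)
for cell, count in sorted(simulated.items()):
    print(cell, count)
```

Running an analysis many times on datasets produced this way, and checking whether its estimates center on a rate ratio of 1, indicates how much of a reported effect could arise from the method alone.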