Glass D C, Gray C N
Department of Epidemiology and Preventive Medicine, Monash University, Victoria, Australia.
Ann Occup Hyg. 2001 Jun;45(4):275-82.
A retrospective assessment of exposure to benzene was carried out for a nested case-control study of lympho-haematopoietic cancers, including leukaemia, in the Australian petroleum industry. Each job or task in the industry was assigned a Base Estimate (BE) of exposure derived from task-based personal exposure assessments carried out by the company occupational hygienists. The BEs corresponded to the estimated arithmetic mean exposure to benzene for each job or task and were used in a deterministic algorithm to estimate the exposure of subjects in the study. Nearly all of the data sets underlying the BEs were found to contain some values below the limit of detection (LOD) of the sampling and analytical methods, and some were very heavily censored: up to 95% of the data were below the LOD in some data sets. It was necessary, therefore, to use a method of calculating the arithmetic mean exposures that took the censored data into account. Three different methods were compared in order to select the most appropriate one for the particular data in the study. A common method is to replace the missing (censored) values with half the detection limit. This method has been recommended for data sets where much of the data are below the limit of detection or where the data are highly skewed, with a geometric standard deviation of 3 or more. Another method, which replaces the censored data with the limit of detection divided by the square root of 2, has been recommended when relatively few data are below the detection limit or where the data are not highly skewed. The third method examined is Cohen's method. This involves mathematical extrapolation of the left-hand tail of the distribution, based on the distribution of the uncensored data, and calculation of the maximum likelihood estimate of the arithmetic mean. When these three methods were applied to the data in this study, it was found that the first two simple methods gave similar results in most cases.
Cohen's method, on the other hand, gave results that were generally, but not always, higher than those of the simpler methods, and in some cases gave extremely high and even implausible estimates of the mean. It appears that if the data deviate substantially from a simple log-normal distribution, particularly if high outliers are present, then Cohen's method produces erratic and unreliable estimates. After examining these results, together with the distributions and proportions of censored data, it was decided that the half limit of detection method was the most suitable in this particular study.
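The three approaches compared above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`substituted_mean`, `censored_lognormal_fit`) and the example measurements are hypothetical, and a single LOD per data set is assumed. The maximum likelihood step is a direct numerical maximisation of a left-censored log-normal likelihood, used here as a stand-in for Cohen's procedure, with the arithmetic mean taken as exp(mu + sigma^2 / 2).

```python
import math
from statistics import NormalDist


def substituted_mean(detected, n_censored, lod, divisor):
    """Arithmetic mean with every censored value replaced by lod / divisor."""
    filled_total = sum(detected) + n_censored * (lod / divisor)
    return filled_total / (len(detected) + n_censored)


def censored_lognormal_fit(detected, n_censored, lod, tol=1e-8, max_steps=100_000):
    """Fit a log-normal to left-censored data by maximum likelihood.

    Returns (mu, sigma, arithmetic_mean), where mu and sigma describe the
    natural logs of the data. Assumes one LOD and at least two distinct
    detected values. Uses a simple compass search, chosen only to keep the
    sketch free of third-party dependencies.
    """
    logs = [math.log(x) for x in detected]
    log_lod = math.log(lod)
    std_normal = NormalDist()

    def log_likelihood(mu, log_sigma):
        sigma = math.exp(log_sigma)
        # Detected values contribute the normal density of their logs.
        ll = sum(-log_sigma - 0.5 * ((y - mu) / sigma) ** 2 for y in logs)
        # Each censored value contributes P(log X < log LOD).
        if n_censored:
            p_below = std_normal.cdf((log_lod - mu) / sigma)
            ll += n_censored * math.log(max(p_below, 1e-300))
        return ll

    # Start from the moments of the detected values only.
    mu = sum(logs) / len(logs)
    spread = math.sqrt(sum((y - mu) ** 2 for y in logs) / len(logs))
    log_sigma = math.log(max(spread, 1e-3))
    best = log_likelihood(mu, log_sigma)

    step = 1.0
    for _ in range(max_steps):
        if step < tol:
            break
        candidates = [(mu + step, log_sigma), (mu - step, log_sigma),
                      (mu, log_sigma + step), (mu, log_sigma - step)]
        top = max((log_likelihood(m, s), m, s) for m, s in candidates)
        if top[0] > best:
            best, mu, log_sigma = top   # move to the better neighbour
        else:
            step /= 2                   # no neighbour is better: refine the step
    sigma = math.exp(log_sigma)
    return mu, sigma, math.exp(mu + 0.5 * sigma ** 2)


# Made-up measurements (not from the study): 4 detected values, 6 below the LOD.
example_detected = [0.5, 0.8, 1.2, 3.0]
example_lod = 0.2
print("LOD/2      :", substituted_mean(example_detected, 6, example_lod, 2))
print("LOD/sqrt(2):", substituted_mean(example_detected, 6, example_lod, math.sqrt(2)))
print("MLE mean   :", censored_lognormal_fit(example_detected, 6, example_lod)[2])
```

Because the likelihood-based mean depends on exp(sigma^2 / 2), a few high values among the detected data can inflate the fitted sigma and hence the estimated mean, which is consistent with the instability reported above when the data depart from a simple log-normal distribution.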