Walker Katherine D, Catalano Paul, Hammitt James K, Evans John S
Department of Environmental Health, Harvard School of Public Health, Boston, Massachusetts 02115, USA.
J Expo Anal Environ Epidemiol. 2003 Jan;13(1):1-16. doi: 10.1038/sj.jea.7500253.
The recent movement of regulatory agencies toward probabilistic analyses of human health and environmental risks has focused greater attention on the quality of the estimates of variability and uncertainty that underlie them. Of particular concern is how uncertainty, a measure of what is not known, is characterized, as uncertainty can play an influential role in analyses of the need for regulatory controls or in estimates of the economic value of additional research. This paper reports the second phase of a study, conducted as an element of the National Human Exposure Assessment Survey (NHEXAS), to obtain and calibrate exposure assessment experts' judgments about uncertainty in residential ambient, residential indoor, and personal air benzene concentrations experienced by the nonsmoking, nonoccupationally exposed population in U.S. EPA's Region V. Subjective judgments (i.e., the median, interquartile range, and 90% confidence interval) about the means and 90th percentiles of each of the benzene distributions were elicited from the seven experts participating in the study. The calibration, or quality, of the experts' judgments was assessed by comparing them to the actual measurements from the NHEXAS Region V study using graphical techniques, a quadratic scoring rule, and surprise and interquartile indices. The results from both quantitative scoring methods suggested that, considered collectively, the experts' judgments were relatively well calibrated although, on balance, underconfident. The calibration of individual experts' judgments appeared variable, highlighting potential pitfalls in reliance on individual experts. In a surprising finding, the experts' judgments about the 90th percentiles of the benzene distributions were better calibrated than their predictions about the means; the experts tended to be overconfident in their ability to predict the means.
This paper is also one of the first calibration studies to demonstrate the importance of accounting for intraexpert correlation when assessing the statistical significance of the findings. When the judgments were assumed to be independent, analysis of the surprise and interquartile indices found evidence of poor calibration (P<0.05). However, when the intraexpert correlation in the study was taken into account, these findings were no longer statistically significant. The analysis further found that the experts' judgments scored better than estimates of Region V benzene concentrations simply drawn from earlier studies of ambient, indoor, and personal benzene levels in other U.S. cities. These results suggest the value of careful elicitation of expert judgments in characterizing exposures in probabilistic form. Additional calibration studies need to be undertaken to corroborate and extend these findings.
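As a rough illustration of the two indices named above, the sketch below uses their conventional definitions: the interquartile index is the fraction of realized values that fall inside the elicited interquartile range (50% for a well-calibrated assessor), and the surprise index is the fraction that fall outside the elicited 90% confidence interval (10% for a well-calibrated assessor). The abstract does not spell out the exact formulas used in the study, so these definitions, the `Judgment` type, the function names, and all numbers are assumptions made for illustration only; none of the values are study data.

```python
from typing import NamedTuple, Sequence


class Judgment(NamedTuple):
    """One elicited judgment paired with the value later measured.

    Hypothetical structure: the 5th/95th percentiles bound the 90%
    confidence interval, the 25th/75th bound the interquartile range.
    """
    p05: float
    p25: float
    p75: float
    p95: float
    observed: float


def interquartile_index(judgments: Sequence[Judgment]) -> float:
    """Fraction of observed values inside the elicited interquartile range."""
    inside = sum(j.p25 <= j.observed <= j.p75 for j in judgments)
    return inside / len(judgments)


def surprise_index(judgments: Sequence[Judgment]) -> float:
    """Fraction of observed values outside the elicited 90% interval."""
    outside = sum(not (j.p05 <= j.observed <= j.p95) for j in judgments)
    return outside / len(judgments)


# Made-up concentrations (arbitrary units), not taken from NHEXAS.
example = [
    Judgment(1.0, 2.0, 4.0, 6.0, observed=3.0),  # inside both intervals
    Judgment(1.0, 2.0, 4.0, 6.0, observed=5.0),  # outside IQR, inside 90% CI
    Judgment(1.0, 2.0, 4.0, 6.0, observed=7.0),  # outside both (a "surprise")
    Judgment(2.0, 3.0, 5.0, 8.0, observed=4.0),  # inside both intervals
]

iq_index = interquartile_index(example)
s_index = surprise_index(example)
print(f"interquartile index = {iq_index:.2f} (ideal 0.50)")
print(f"surprise index      = {s_index:.2f} (ideal 0.10)")
```

Under these definitions, an interquartile index well above 0.50 together with a surprise index below 0.10 would point toward underconfidence (intervals too wide), while the reverse pattern would point toward overconfidence. A significance test on either index that treats all judgments as independent draws ignores the correlation among judgments from the same expert, which is the issue the abstract highlights.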