Phillips Carl V, LaPole Luwanna M
Management and Policy Sciences, University of Texas School of Public Health and Center for Clinical Research and Evidence Based Medicine, University of Texas Medical School, Houston, Texas, USA.
BMC Med Res Methodol. 2003 Jun 12;3:9. doi: 10.1186/1471-2288-3-9.
All quantifications of mortality, morbidity, and other health measures involve numerous sources of error. The routine quantification of random sampling error makes it easy to forget that other sources of error can and should be quantified. When a quantification does not involve sampling, error is almost never quantified and results are often reported in ways that dramatically overstate their precision.
We argue that the precision implicit in typical reporting is problematic and sketch methods for quantifying the various sources of error, building up from simple examples that can be solved analytically to more complex cases. There are straightforward ways to partially quantify the uncertainty surrounding a parameter that is not characterized by random sampling, such as limiting reported significant figures. We present simple methods for doing such quantifications, and for incorporating them into calculations. More complicated methods become necessary when multiple sources of uncertainty must be combined. We demonstrate that Monte Carlo simulation, using available software, can estimate the uncertainty resulting from complicated calculations with many sources of uncertainty. We apply the method to the current estimate of the annual incidence of foodborne illness in the United States.
Quantifying uncertainty from systematic errors is practical. Reporting this uncertainty would more honestly represent study results, help show the probability that estimated values fall within some critical range, and facilitate better targeting of further research.
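As a rough illustration of the Monte Carlo approach described above, the following minimal Python sketch propagates uncertainty through a product of three uncertain inputs. Everything in it is an assumption made for illustration: the model (reported cases × underreporting multiplier × fraction foodborne), the distribution shapes, and every number are hypothetical and are not taken from the paper or from any actual foodborne illness estimate. The names `simulate_total` and `percentile` are likewise invented here.

```python
import random

def simulate_total(n_draws=20000, seed=0):
    """Draw each uncertain input from an assumed distribution and
    return the sorted simulated totals (all inputs are hypothetical)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Hypothetical count of reported cases, uncertain by +/- 10%.
        reported = rng.uniform(9_000, 11_000)
        # Hypothetical underreporting multiplier: triangular(low, high, mode).
        multiplier = rng.triangular(10, 40, 20)
        # Hypothetical fraction of cases attributable to food.
        fraction = rng.triangular(0.2, 0.6, 0.35)
        draws.append(reported * multiplier * fraction)
    return sorted(draws)

def percentile(sorted_draws, p):
    """Simple empirical percentile of an already sorted list (0 <= p <= 1)."""
    idx = min(len(sorted_draws) - 1, int(p * len(sorted_draws)))
    return sorted_draws[idx]

draws = simulate_total()
low, median, high = (percentile(draws, p) for p in (0.025, 0.5, 0.975))

# Point estimate from plugging in the single "best guess" for each input.
point_estimate = 10_000 * 20 * 0.35

print(f"point estimate: {point_estimate:,.0f}")
print(f"simulated median: {median:,.0f}")
print(f"95% simulation interval: {low:,.0f} to {high:,.0f}")
```

In a sketch like this, the width of the interval relative to the point estimate indicates how many significant figures the result can reasonably support. The simulated draws can also be used to estimate the probability that the true value falls within some critical range.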