Siegrist D, Pavlin J
Potomac Institute for Policy Studies, 901 N. Stuart Street, Suite 200, Arlington, Virginia 22203, USA.
MMWR Suppl. 2004 Sep 24;53:152-8.
Early detection of disease outbreaks by a medical biosurveillance system relies on two major components: 1) the contribution of early and reliable data sources and 2) the sensitivity, specificity, and timeliness of biosurveillance detection algorithms. This paper describes an effort to assess leading detection algorithms by arranging a common challenge problem and providing a common data set.
The objectives of this study were to determine whether automated detection algorithms can reliably and quickly identify the onset of natural disease outbreaks that are surrogates for possible terrorist pathogen releases, and do so at acceptable false-alert rates (e.g., once every 2-6 weeks).
Historic de-identified data were obtained from five metropolitan areas over 23 months; these data included International Classification of Diseases, Ninth Revision (ICD-9) codes related to respiratory and gastrointestinal illness syndromes. An outbreak detection group identified and labeled two natural disease outbreaks in these data and provided them to analysts for training of detection algorithms. All outbreaks in the remaining test data were identified but not revealed to the detection groups until after their analyses. The algorithms established a probability of outbreak for each day's counts. The probability of outbreak was assessed as an "actual" alert for different false-alert rates.
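The evaluation step described above (turning daily outbreak probabilities into alerts at a chosen false-alert rate, then measuring how soon an outbreak is flagged) can be illustrated with a minimal sketch. It is not the study's actual scoring procedure: the function names, the rule "alert when the probability exceeds a threshold," and the way the threshold is derived from non-outbreak days are all assumptions made for illustration.

```python
from typing import List, Optional, Set


def threshold_for_rate(probs: List[float],
                       outbreak_days: Set[int],
                       weeks_per_false_alert: float) -> float:
    """Pick a threshold so that alerts on non-outbreak days stay within
    a budget of one false alert per `weeks_per_false_alert` weeks.

    Hypothetical helper: an alert is assumed to fire when p > threshold.
    """
    # Probabilities on days with no labeled outbreak, highest first.
    background = sorted(
        (p for day, p in enumerate(probs) if day not in outbreak_days),
        reverse=True,
    )
    if not background:
        raise ValueError("no non-outbreak days to calibrate against")
    # Number of false alerts the budget allows over the background period.
    allowed = int(len(background) / (7 * weeks_per_false_alert))
    allowed = min(allowed, len(background) - 1)
    # At most `allowed` background values are strictly greater than this one.
    return background[allowed]


def detection_delay(probs: List[float], start: int, end: int,
                    threshold: float) -> Optional[int]:
    """Days from the labeled outbreak start to the first alert,
    or None if no alert fires between start and end (inclusive)."""
    for day in range(start, end + 1):
        if probs[day] > threshold:
            return day - start
    return None


if __name__ == "__main__":
    # Made-up daily probabilities: one background spike and one outbreak.
    probs = [0.1] * 28
    probs[5] = 0.6
    probs[20], probs[21], probs[22] = 0.5, 0.95, 0.8
    outbreak = {20, 21, 22}
    for weeks in (2, 6):
        t = threshold_for_rate(probs, outbreak, weeks)
        print(weeks, t, detection_delay(probs, 20, 22, t))
```

In this toy series, a stricter false-alert budget raises the threshold and can delay detection, which is the trade-off between timeliness and false-alert rate that the study set out to measure.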
The best algorithms detected all of the outbreaks at false-alert rates of one every 2-6 weeks. They often detected an outbreak on the same day that human investigators had identified as its true start.
Because minimal data exist for an actual biologic attack, determining how quickly an algorithm might detect such an attack is difficult. However, application of these algorithms in combination with other data-analysis methods to historic outbreak data indicates that biosurveillance techniques for analyzing syndrome counts can rapidly detect seasonal respiratory and gastrointestinal illness outbreaks. Further research is needed to assess the value of electronic data sources for predictive detection. In addition, simulations need to be developed and implemented that challenge current methods under different projected operational conditions, to better characterize the size and type of biologic attack those methods can detect.