Kleinman Ken, Abrams Allyson, Yih W Katherine, Platt Richard, Kulldorff Martin
Department of Ambulatory Care and Prevention, Harvard Medical School and Harvard Pilgrim Health Care, USA.
Stat Med. 2006 Mar 15;25(5):755-69. doi: 10.1002/sim.2402.
Since the anthrax attacks of October 2001 and the SARS outbreaks of recent years, there has been increasing interest in developing surveillance systems to aid in the early detection of such illnesses. Systems have been established that do this by monitoring primary health-care visits, pharmacy sales, absenteeism records, and other non-traditional sources of data. While many resources have been invested in establishing such systems, relatively little effort has as yet been expended in evaluating their performance. One way to evaluate a given surveillance system is to compare the signals it generates with known outbreaks identified in other systems. In public health practice, for example, public health departments investigate reports of illness and sometimes track hospital admissions. Comparison of new systems with extant systems cannot generate estimates of test characteristics such as sensitivity and specificity, since the actual number of positives and negatives cannot be known. However, the comparison can reveal whether a new or proposed system's signals match outbreaks detected by the existing system. This could help support or reject the new system as an alternative or complement to the extant system. We propose three methods to test the null hypothesis that the new system does not signal true outbreaks more often than would be expected by chance. The methods differ in the restrictiveness of the assumptions required. Each test may detect weaknesses in the new system, depending on the distribution of outbreaks, and each can be used to construct confidence limits on the agreement between the new system's signals and the outbreaks, given the distribution of the signals. The tests can be used to assess whether the new system works, in the sense that it detects the outbreaks better than chance would suggest, and can also determine whether the new system's signals are generated earlier than those of an extant system.
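The abstract does not spell out the three proposed tests, so the following is only an illustrative sketch of the general idea of testing agreement against chance, not a reconstruction of the authors' methods. It assumes a simple Monte Carlo approach: the new system's signal days are held fixed, outbreak days are redrawn uniformly at random over the study period, and the observed number of outbreaks preceded by a signal is compared with the simulated counts. The function names, the `window` parameter, and the uniform-placement null are all hypothetical choices made for illustration.

```python
import random


def count_hits(signal_days, outbreak_days, window=0):
    """Count outbreaks that have at least one signal on the outbreak day
    or within `window` days before it (hypothetical matching rule)."""
    signals = set(signal_days)
    return sum(
        any((day - lag) in signals for lag in range(window + 1))
        for day in outbreak_days
    )


def chance_agreement_p_value(signal_days, outbreak_days, n_days,
                             window=0, n_sim=2000, seed=0):
    """Monte Carlo p-value for the null hypothesis that signals coincide
    with outbreaks no more often than chance.

    Assumption (not from the source): under the null, outbreak days are
    distinct and uniformly distributed over days 0..n_days-1, while the
    signal days are treated as fixed.
    """
    rng = random.Random(seed)
    observed = count_hits(signal_days, outbreak_days, window)
    n_outbreaks = len(outbreak_days)
    as_extreme = 0
    for _ in range(n_sim):
        # Redraw outbreak timing at random, keeping the signals where they are.
        simulated = rng.sample(range(n_days), n_outbreaks)
        if count_hits(signal_days, simulated, window) >= observed:
            as_extreme += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (as_extreme + 1) / (n_sim + 1)
```

A call such as `chance_agreement_p_value(signal_days, outbreak_days, n_days=365, window=3)` would return the observed number of matched outbreaks and an approximate one-sided p-value. Setting `window` greater than zero counts a signal as a match only if it occurs on or before the outbreak date, which loosely reflects the question of whether the new system signals earlier than the extant one.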