Building test data from real outbreaks for evaluating detection algorithms.

Authors

Texier Gaetan, Jackson Michael L, Siwe Leonel, Meynard Jean-Baptiste, Deparis Xavier, Chaudet Herve

Affiliations

Pasteur Center in Cameroun, Yaoundé, Cameroun.

UMR 912 / SESSTIM - INSERM/IRD/Aix-Marseille University / Faculty of Medicine - 27, Bd Jean Moulin, Marseille, France.

Publication information

PLoS One. 2017 Sep 1;12(9):e0183992. doi: 10.1371/journal.pone.0183992. eCollection 2017.

DOI:10.1371/journal.pone.0183992

PMID:28863159

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5593515/

Abstract

Benchmarking surveillance systems requires realistic simulations of disease outbreaks. However, obtaining these data in sufficient quantity, with a realistic shape and covering a sufficient range of agents, size and duration, is known to be very difficult. The dataset of outbreak signals generated should reflect the likely distribution of authentic situations faced by the surveillance system, including very unlikely outbreak signals. We propose and evaluate a new approach based on the use of historical outbreak data to simulate tailored outbreak signals. The method relies on a homothetic transformation of the historical distribution followed by resampling processes (Binomial, Inverse Transform Sampling Method (ITSM), Metropolis-Hastings Random Walk, Metropolis-Hastings Independent, Gibbs Sampler, Hybrid Gibbs Sampler). We carried out an analysis to identify the most important input parameters for simulation quality and to evaluate performance for each of the resampling algorithms. Our analysis confirms the influence of the type of algorithm used and simulation parameters (i.e. days, number of cases, outbreak shape, overall scale factor) on the results. We show that, regardless of the outbreaks, algorithms and metrics chosen for the evaluation, simulation quality decreased with the increase in the number of days simulated and increased with the number of cases simulated. Simulating outbreaks with fewer cases than days of duration (i.e. overall scale factor less than 1) resulted in a substantial loss of information during the simulation. We found that Gibbs sampling with a shrinkage procedure provides a good balance between accuracy and data dependency. If dependency is of little importance, binomial and ITSM methods are accurate. Given the constraint of keeping the simulation within a range of plausible epidemiological curves faced by the surveillance system, our study confirms that our approach can be used to generate a large spectrum of outbreak signals.
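The abstract names the two stages of the method (a homothetic transformation of the historical curve, then resampling) without giving their details. The following Python sketch illustrates one plausible reading for the ITSM variant only, and is not the authors' implementation. It assumes the transformation can be approximated by linearly interpolating the historical daily counts onto the requested number of days, and that cases are then placed by inverse transform sampling from the normalised curve. The function names `rescale_curve` and `simulate_outbreak` and the example counts are hypothetical.

```python
import bisect
import random
from itertools import accumulate


def rescale_curve(historical, n_days):
    """Stretch or shrink a daily-count curve to n_days (linear interpolation,
    an assumed stand-in for the homothetic transformation), then normalise
    it into a probability distribution over simulated days."""
    if n_days == 1:
        return [1.0]
    m = len(historical)
    stretched = []
    for d in range(n_days):
        # Position of simulated day d on the historical time axis.
        pos = d * (m - 1) / (n_days - 1)
        lo = int(pos)
        hi = min(lo + 1, m - 1)
        frac = pos - lo
        stretched.append(historical[lo] * (1 - frac) + historical[hi] * frac)
    total = sum(stretched)
    if total == 0:
        raise ValueError("rescaled curve has no mass to sample from")
    return [v / total for v in stretched]


def simulate_outbreak(historical, n_days, n_cases, rng):
    """Place n_cases on n_days by inverse transform sampling."""
    probs = rescale_curve(historical, n_days)
    cdf = list(accumulate(probs))
    counts = [0] * n_days
    for _ in range(n_cases):
        u = rng.random()
        # First day whose cumulative probability exceeds u; the min()
        # guards against floating-point rounding at the top of the CDF.
        day = min(bisect.bisect_right(cdf, u), n_days - 1)
        counts[day] += 1
    return counts


# Made-up historical daily counts, for illustration only.
historical_curve = [1, 3, 8, 12, 7, 4, 2, 1]
simulated = simulate_outbreak(
    historical_curve, n_days=12, n_cases=60, rng=random.Random(0)
)
print(simulated)
```

Here 60 cases are spread over 12 days, so the ratio of cases to days is above 1. Following the abstract's description of the overall scale factor, asking for fewer cases than days would be expected to leave many days empty and lose most of the historical shape.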

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7a9e/5593515/eac05c647762/pone.0183992.g001.jpg
