Goncalves Andre, Cadena Jose, Hu Yeping, Schlessinger David, Greene John, O'Suilleabhain Liam, Clancy Heather, Vollmer Michael, Liu Vincent, Bates Tom, Ray Priyadip
Lawrence Livermore National Laboratory.
Kaiser Permanente Division of Research.
Res Sq. 2025 Jun 12:rs.3.rs-6606632. doi: 10.21203/rs.3.rs-6606632/v1.
Detecting infectious disease outbreaks promptly is crucial for effective public health responses, minimizing transmission, and enabling critical interventions. This study introduces a method that integrates machine learning (ML)-based diagnostic predictions with traditional epidemiological surveillance to enhance biosurveillance systems. Using 4.5 million patient records from 2010 to 2022, ML models were trained to predict, within 24-hour intervals, the likelihood of patients being diagnosed with infectious or unspecified gastrointestinal, respiratory, or neurological diseases. High-confidence predictions were combined with final diagnoses and analyzed using spatiotemporal outbreak detection techniques. Among diseases with five or more outbreaks between 2014 and 2022, 33.3% of outbreaks (41 of 123) were detected earlier, with lead times ranging from 1 to 24 days and an average of 1.33 false-positive outbreaks detected annually. This approach demonstrates the potential of integrating ML with conventional methods for faster outbreak detection, provided adequate disease-specific training data is available.
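The idea of combining high-confidence ML predictions with final diagnoses to gain lead time can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the record fields (`prediction_day`, `probability`, `diagnosis_day`), the `THRESHOLD` cutoff of 0.9, the synthetic data, and above all the alert rule, which is a simple "baseline mean plus k standard deviations" check standing in for the spatiotemporal outbreak detection techniques the study actually used.

```python
from statistics import mean, stdev

THRESHOLD = 0.9  # hypothetical cutoff for a "high-confidence" prediction


def signal_day(record, use_predictions):
    """Earliest day a patient counts as a case, or None if never counted."""
    days = []
    if record["diagnosis_day"] is not None:
        days.append(record["diagnosis_day"])
    if use_predictions and record["probability"] >= THRESHOLD:
        days.append(record["prediction_day"])
    return min(days) if days else None


def daily_counts(records, n_days, use_predictions):
    """Count each patient once, on the day of their earliest signal."""
    counts = [0] * n_days
    for record in records:
        day = signal_day(record, use_predictions)
        if day is not None and 0 <= day < n_days:
            counts[day] += 1
    return counts


def first_alert(counts, baseline_days, k=3.0):
    """First day after the baseline whose count exceeds mean + k * stdev."""
    baseline = counts[:baseline_days]
    limit = mean(baseline) + k * stdev(baseline)
    for day in range(baseline_days, len(counts)):
        if counts[day] > limit:
            return day
    return None


# Synthetic background: 1 or 2 cases per day, diagnosed the day they are seen.
records = []
for day in range(16):
    for _ in range(1 + day % 2):
        records.append(
            {"prediction_day": day, "probability": 0.5, "diagnosis_day": day}
        )

# Synthetic cluster: 6 patients flagged on day 12, final diagnosis on day 15.
for _ in range(6):
    records.append({"prediction_day": 12, "probability": 0.95, "diagnosis_day": 15})

alert_confirmed = first_alert(daily_counts(records, 16, False), baseline_days=10)
alert_combined = first_alert(daily_counts(records, 16, True), baseline_days=10)
lead_time = alert_confirmed - alert_combined

print(f"alert from final diagnoses only: day {alert_confirmed}")
print(f"alert with ML predictions added: day {alert_combined}")
print(f"lead time: {lead_time} days")
```

In this toy setup the lead time simply equals the delay between a high-confidence prediction and the final diagnosis. With real data, the gain would depend on prediction accuracy and on the detection method, and low-precision predictions could raise false alerts.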