Center for Animal Disease Modeling and Surveillance (CADMS), Department of Medicine & Epidemiology, School of Veterinary Medicine, University of California, Davis, USA.
Departamento de Patoloxía Animal, Facultade de Veterinaria de Lugo, Universidade de Santiago de Compostela, Lugo, Spain.
Vet Res. 2023 Sep 8;54(1):75. doi: 10.1186/s13567-023-01197-3.
Anomaly detection methods have great potential to assist the detection of diseases in animal production systems. We used sequence data of Porcine Reproductive and Respiratory Syndrome (PRRS) to define the emergence of new strains at the farm level. We evaluated the performance of 24 anomaly detection methods based on machine learning, regression, time series techniques and control charts for identifying outbreaks in time series of new strains, and compared the best methods across different time series: PCR positives, PCR requests and laboratory requests. We introduced synthetic outbreaks of different sizes and calculated the probability of detection of outbreaks (POD), sensitivity (Se), probability of detection of outbreaks in the first week of appearance (POD1w) and background alarm rate (BAR). Time series of new strains derived from sequence data outperformed the other types of data, but POD, Se and POD1w were high only when outbreaks were large. The methods based on Long Short-Term Memory (LSTM) and Bayesian approaches performed best. Using anomaly detection methods with sequence data may help to identify the emergence of cases in multiple farms, but more work is required to improve detection in time series with high variability. Our results suggest a promising application of sequence data for early detection of diseases at the production system level. This may provide a simple way to extract additional value from routine laboratory analysis. Next steps should include validation of this approach in different settings and with different diseases.
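To illustrate the kind of evaluation described above, the following is a minimal sketch, not the authors' implementation. It assumes a Shewhart-type control chart (one of the method families mentioned) applied to a hypothetical weekly count series, and it assumes working definitions of the four metrics that the abstract does not spell out: POD as the share of synthetic outbreaks with at least one alarm, Se as the share of outbreak weeks flagged, POD1w as the share of outbreaks flagged in their first week, and BAR as the share of outbreak-free weeks flagged. All function names, parameters and the simulated background are illustrative.

```python
import random
import statistics


def shewhart_alarms(series, baseline=26, k=3.0):
    """Flag week t when its count exceeds mean + k*sd of the previous `baseline` weeks."""
    alarms = [False] * len(series)
    for t in range(baseline, len(series)):
        window = series[t - baseline:t]
        mu = statistics.mean(window)
        sd = statistics.pstdev(window)
        # Assumed floor on sd so a flat baseline does not give a zero-width limit.
        threshold = mu + k * max(sd, 0.5)
        alarms[t] = series[t] > threshold
    return alarms


def inject_outbreak(series, start, extra_cases):
    """Return a copy of `series` with synthetic extra cases added from week `start`."""
    out = list(series)
    for offset, extra in enumerate(extra_cases):
        out[start + offset] += extra
    return out


def evaluate(background, starts, extra_cases, baseline=26, k=3.0):
    """Inject one synthetic outbreak per start week and summarise detection metrics."""
    duration = len(extra_cases)
    detected = detected_first_week = flagged_weeks = 0
    for start in starts:
        series = inject_outbreak(background, start, extra_cases)
        outbreak_alarms = shewhart_alarms(series, baseline, k)[start:start + duration]
        detected += any(outbreak_alarms)
        detected_first_week += outbreak_alarms[0]
        flagged_weeks += sum(outbreak_alarms)
    # Background alarm rate is measured on the series without any injected outbreak.
    clean = shewhart_alarms(background, baseline, k)[baseline:]
    n = len(starts)
    return {
        "POD": detected / n,
        "Se": flagged_weeks / (n * duration),
        "POD1w": detected_first_week / n,
        "BAR": sum(clean) / len(clean),
    }


# Hypothetical background: three years of weekly counts of new strains (0 to 2 per week).
rng = random.Random(0)
background = [rng.choice([0, 0, 0, 1, 1, 2]) for _ in range(156)]
starts = range(30, 150, 10)

small = evaluate(background, starts, extra_cases=[1, 2, 1])
large = evaluate(background, starts, extra_cases=[4, 8, 4])
print("small outbreak:", small)
print("large outbreak:", large)
```

Comparing the `small` and `large` results shows how detection metrics depend on outbreak size under these assumptions, while BAR stays the same because it is computed on the unmodified background series.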