Istituto Zooprofilattico Sperimentale del Piemonte, Liguria e Valle d'Aosta, Via Bologna 148, 10154, Turin, Italy.
Tecnológico de Monterrey, Av. Eugenio Garza Sada 2501 Sur, Tecnológico, 64849, Monterrey, N.L., México.
Vet Res. 2024 Jun 5;55(1):72. doi: 10.1186/s13567-024-01323-9.
Salmonellosis, one of the most common foodborne infections in Europe, is monitored by food safety surveillance programmes, which generate extensive databases. Using tree-based machine learning (ML) algorithms, we exploited data from food safety audits to predict spatiotemporal patterns of salmonellosis in northwestern Italy. Data on human cases confirmed in 2015-2018 (n = 1969) and food surveillance data collected in 2014-2018 were used to develop the ML models. We integrated the monthly municipal human incidence with 27 potential predictors, including the observed prevalence of Salmonella in food. We applied the tree regression, random forest and gradient boosting algorithms under different scenarios and evaluated their predictive performance in terms of the mean absolute percentage error (MAPE) and R². Using a similar dataset from 2019, we obtained spatiotemporal predictions and their sensitivities and specificities. Random forest and gradient boosting (R² = 0.55, MAPE = 7.5%) outperformed the tree regression algorithm (R² = 0.42, MAPE = 8.8%). Salmonella prevalence in food; spatial features; and monitoring efforts in ready-to-eat milk, fruits and vegetables, and pig meat products contributed the most to the models' predictive performance, reducing the variance by 90.5%. Conversely, the number of positive samples obtained for specific food matrices had minimal influence on the predictions (2.9%). Spatiotemporal predictions for 2019 showed a sensitivity of 46.5% (the low value reflecting some infection hotspots that were not predicted) and a specificity of 78.5%. This study demonstrates the added value of integrating data from human and veterinary health services to develop predictive models of human salmonellosis occurrence, providing early warnings useful for mitigating the impact of foodborne disease on public health.
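For readers unfamiliar with the evaluation metrics named in the abstract, the following is a minimal sketch of how MAPE, R², sensitivity and specificity are conventionally computed. It is not the authors' code: the function names are hypothetical, the standard textbook definitions are assumed, and skipping observations with zero incidence in the MAPE is an assumption made here only to avoid division by zero (the abstract does not say how such cases were handled).

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumption: observations whose actual value is zero are skipped,
    because the percentage error is undefined for them.
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def sensitivity_specificity(actual_positive, predicted_positive):
    """Return (sensitivity, specificity) from paired boolean labels.

    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    pairs = list(zip(actual_positive, predicted_positive))
    tp = sum(1 for a, p in pairs if a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    fp = sum(1 for a, p in pairs if not a and p)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the study.
    observed = [2.0, 4.0, 5.0, 10.0]
    modelled = [1.0, 4.0, 6.0, 8.0]
    print(mape(observed, modelled), r_squared(observed, modelled))

    # Hypothetical "case occurred" flags per municipality-month.
    truth = [True, True, False, False, False]
    guess = [True, False, False, True, False]
    print(sensitivity_specificity(truth, guess))
```

Under these definitions, a sensitivity below 50% alongside a specificity near 80% means the model misses more than half of the municipality-months in which cases actually occurred while correctly ruling out most of those without cases.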