Good times bad times: Automated forecasting of seasonal cryptosporidiosis in Ontario using machine learning.

Author information

Berke Olaf, Trotz-Williams Lise, de Montigny Simon

Affiliations

Department of Population Medicine, Ontario Veterinary College, University of Guelph, Guelph, ON.

Wellington-Dufferin Guelph Public Health, Guelph, ON.

Publication information

Can Commun Dis Rep. 2020 Jun 4;46(6):192-197. doi: 10.14745/ccdr.v46i06a07.

Abstract

BACKGROUND

The rise of big data and related predictive modelling based on machine learning algorithms over the last two decades has provided new opportunities for disease surveillance and public health preparedness. Big data come with the promise of faster generation of, and access to, more precise information, potentially improving predictive precision in public health ("precision public health"). As an example, we considered forecasting the future course of monthly cryptosporidiosis incidence in Ontario.

METHODS

The traditional statistical approach to forecasting is the seasonal autoregressive integrated moving-average (SARIMA) model. We applied SARIMA and an artificial neural network (ANN) approach, specifically a feed-forward neural network, to predict monthly cryptosporidiosis incidence in Ontario in 2017 using 2005-2016 data as a training set. Both forecasting approaches are automated to make them relevant in a disease surveillance context. We compared the resulting forecasts using the root mean squared error (RMSE) and mean absolute error (MAE) as measures of predictive accuracy.
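
As an illustration of the two accuracy measures named above, the following is a minimal Python sketch of how RMSE and MAE could be computed for a 12-month hold-out year. The function names (`rmse`, `mae`) and all numbers are hypothetical and chosen only for illustration; they are not the study's data, forecasts, or code.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error; squaring weights large misses more heavily."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, forecast):
    """Mean absolute error; every miss contributes in proportion to its size."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly case counts for one hold-out year (NOT the study's data).
observed = [10, 8, 9, 12, 15, 20, 30, 42, 35, 20, 12, 9]

# Hypothetical forecasts from two competing models (NOT the study's forecasts).
forecasts = {
    "model_a": [9, 9, 10, 11, 16, 22, 28, 40, 33, 22, 13, 10],
    "model_b": [12, 10, 8, 14, 13, 24, 26, 37, 38, 17, 14, 7],
}

for name, predicted in forecasts.items():
    print(f"{name}: RMSE={rmse(observed, predicted):.2f} "
          f"MAE={mae(observed, predicted):.2f}")
```

Lower values indicate better agreement with the observed series on both measures. Because RMSE squares each error before averaging, it is never smaller than MAE for the same forecast, and a large gap between the two suggests a few months with unusually large misses.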

RESULTS

Cryptosporidiosis is a seasonal disease, which peaks in Ontario in late summer. In this study, the SARIMA model and ANN forecasting approaches captured the seasonal pattern of cryptosporidiosis well. Contrary to similar studies reported in the literature, the ANN forecasts of cryptosporidiosis were slightly less accurate than the SARIMA model forecasts.

CONCLUSION

The ANN and SARIMA approaches are suitable for automated forecasting of public health time series data from surveillance systems. Future studies should employ additional algorithms (e.g. random forests) and assess accuracy by using other diseases as case studies and by conducting rigorous simulation studies. Differences between the forecasts from the machine learning algorithm (the ANN) and the statistical learning model (the SARIMA) should be considered with respect to the philosophical differences between the two approaches.
