Department of Epidemiology, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran.
Department of Biostatistics, School of Public Health, Hamadan University of Medical Sciences, Hamadan, Iran.
PLoS One. 2020 May 12;15(5):e0232910. doi: 10.1371/journal.pone.0232910. eCollection 2020.
Identifying statistical models that forecast infectious disease outbreaks accurately and detect them promptly is very important for the healthcare system. This study was therefore conducted to assess and compare the performance of four machine-learning methods in modeling and forecasting brucellosis time series data based on climatic parameters.
In this cohort study, human brucellosis cases and climatic parameters were analyzed on a monthly basis for Qazvin province, located in northwestern Iran, over a period of 9 years (2010-2018). The data were divided into two subsets: training (80%) and testing (20%). Artificial neural network methods (radial basis function and multilayer perceptron), support vector machine, and random forest were fitted to each set. The performance of the models was assessed using the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Root Error (MARE), and R2 criteria.
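The split and three of the evaluation criteria can be sketched as follows. This is a minimal illustration, not the authors' code: the chronological split, the function names, and the placeholder series are all assumptions, since the abstract states only the 80%/20% proportions. MARE is omitted because the abstract does not give its formula.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into leading and trailing parts.

    Assumption: the split is chronological; the abstract does not say.
    """
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def rmse(observed, predicted):
    """Root Mean Square Error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(observed, predicted):
    """Mean Absolute Error."""
    absolute = [abs(o - p) for o, p in zip(observed, predicted)]
    return sum(absolute) / len(absolute)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Placeholder values only: 9 years of monthly observations is 108 points.
monthly_series = list(range(108))
train, test = chronological_split(monthly_series)
print(len(train), len(test))
```

Lower RMSE and MAE and a higher R2 on the testing subset indicate a better forecast, which is the basis on which the four models are compared.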
The incidence rate of brucellosis in Qazvin province was 27.43 per 100,000 during 2010-2019. The RMSE (0.22), MAE (0.175), and MARE (0.007) values were smaller for the multilayer perceptron neural network than for the other three models, and its R2 value (0.99) was higher. The multilayer perceptron neural network therefore performed best in forecasting the studied data. Average wind speed and mean temperature were the most influential climatic parameters for the incidence of this disease.
The multilayer perceptron neural network can be used as an effective method for detecting the temporal trend of brucellosis. Nevertheless, further studies applying and comparing these methods are needed to identify the most appropriate forecasting method for this disease.