Department of Epidemiology, Center for Global Health, School of Public Health, Nanjing Medical University, 101 Longmian Ave., Nanjing, 211166, China.
Department of Tuberculosis, The Third Hospital of Zhenjiang City, Zhenjiang, 212005, China.
Infect Dis Poverty. 2020 Nov 5;9(1):151. doi: 10.1186/s40249-020-00771-7.
Many studies have compared the performance of time series models in predicting pulmonary tuberculosis (PTB), but few have considered the role of meteorological factors in their prediction models. This study aims to explore whether incorporating meteorological factors can improve the performance of time series models in predicting PTB.
We collected the monthly reported numbers of PTB cases and records of six meteorological factors in three cities of China from 2005 to 2018. Using these data, we constructed three time series models: an autoregressive integrated moving average (ARIMA) model, an ARIMA with exogenous variables (ARIMAX) model, and a recurrent neural network (RNN) model. The ARIMAX and RNN models incorporated meteorological factors, while the ARIMA model did not. The mean absolute percentage error (MAPE) and root mean square error (RMSE) were used to evaluate the performance of the models in predicting PTB cases in 2018.
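For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how MAPE and RMSE are conventionally computed. The function names and the sample counts are illustrative assumptions, not the authors' code or data.

```python
import math

def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no observed value is zero (the ratio is undefined otherwise).
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

def rmse(actual, predicted):
    """Root mean square error, in the same units as the case counts."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up monthly counts for illustration only, not data from the study.
observed = [100, 200, 400, 500]
forecast = [110, 180, 400, 450]

print(f"MAPE = {mape(observed, forecast):.2f}%")
print(f"RMSE = {rmse(observed, forecast):.3f}")
```

MAPE is scale-free, which makes it comparable across cities with different case volumes, whereas RMSE is expressed in cases per month and penalizes large misses more heavily.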
Both the cross-correlation analysis and Spearman rank correlation test showed that PTB cases reported in the study areas were related to meteorological factors. The predictive performance of both the ARIMA and RNN models was improved after incorporating meteorological factors. The MAPEs of the ARIMA, ARIMAX, and RNN models were 12.54%, 11.96%, and 12.36% in Xuzhou, 15.57%, 11.16%, and 14.09% in Nantong, and 9.70%, 9.66%, and 12.50% in Wuxi, respectively. The RMSEs of the three models were 36.194, 33.956, and 34.785 in Xuzhou, 34.073, 25.884, and 31.828 in Nantong, and 19.545, 19.026, and 26.019 in Wuxi, respectively.
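As a rough illustration of how a lagged association between monthly case counts and a meteorological series might be examined, the sketch below computes a Spearman rank correlation at a chosen lag. It is not the authors' procedure: the helper names, the choice of lag, and the toy series are all assumptions made for the example.

```python
import math

def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; assumes neither series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def lagged_spearman(cases, weather, lag):
    """Correlate cases in month t with weather in month t - lag (lag >= 0)."""
    if lag == 0:
        return spearman(cases, weather)
    return spearman(cases[lag:], weather[:-lag])

# Toy series for illustration only: cases follow the weather value two months earlier.
weather = [5, 9, 14, 20, 26, 30, 28, 22]
cases = [60, 70, 65, 77, 92, 110, 128, 140]

for lag in range(4):
    print(lag, round(lagged_spearman(cases, weather, lag), 3))
```

Scanning several lags in this way shows which delay gives the strongest monotonic association, which is one common way to decide which lagged covariates to offer to a model such as ARIMAX.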
Our study revealed a possible link between PTB and meteorological factors. Taking meteorological factors into consideration increased the accuracy of time series models in predicting PTB, and the ARIMAX model was superior to the ARIMA and RNN models in the study settings.