Department of Preventive Medicine, Shantou University Medical College, No. 22 Xinling Road, Shantou, Guangdong, 515041, People's Republic of China.
Sci Rep. 2017 Apr 19;7:46469. doi: 10.1038/srep46469.
Seasonal influenza epidemics cause serious public health problems in China. Search query-based surveillance was recently proposed to complement traditional approaches to monitoring influenza epidemics. However, developing robust techniques for selecting search queries and improving the predictability of influenza epidemics remains a challenge. This study aimed to develop a novel ensemble framework to improve penalized regression models for detecting influenza epidemics using Baidu search engine query data from China. The ensemble framework combined bootstrap aggregating (bagging) with a rank aggregation method to optimize penalized regression models. Different algorithms, including lasso, ridge, elastic net, and the algorithms in the proposed ensemble framework, were compared using Baidu search engine queries. Most of the selected search terms captured the peaks and troughs of the time series curves of influenza cases. The proposed ensemble framework improved the predictability of the conventional penalized regression models. The elastic net regression model outperformed the compared models, achieving the lowest prediction errors. We established a Baidu search engine query-based surveillance model for monitoring influenza epidemics, and the proposed model provides a useful tool to support the public health response to influenza and other infectious diseases.
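The general idea of combining bagging with rank aggregation for query selection can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the study's implementation: the abstract does not say which rank aggregation method was used, so a simple Borda count stands in for it; the elastic net is a plain coordinate-descent solver; the data are synthetic rather than Baidu query series; and all function names and parameter values (`elastic_net`, `bagged_rank_aggregation`, `lam`, `alpha`, `n_boot`, `top_k`) are hypothetical.

```python
import random


def center(X, y):
    """Subtract column means from X and the mean from y (no intercept needed)."""
    n, p = len(X), len(X[0])
    mx = [sum(row[j] for row in X) / n for j in range(p)]
    my = sum(y) / n
    return [[row[j] - mx[j] for j in range(p)] for row in X], [v - my for v in y]


def soft_threshold(z, g):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def elastic_net(X, y, lam=0.1, alpha=0.5, n_iter=50):
    """Coordinate descent for
    (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    Assumes X and y are already centered."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (feature j removed).
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta


def bagged_rank_aggregation(X, y, n_boot=30, top_k=2, seed=0, lam=0.1, alpha=0.5):
    """Fit the elastic net on bootstrap resamples, rank features by |coefficient|
    in each fit, and aggregate the ranks with a Borda count (an assumption)."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    borda = [0.0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        Xb, yb = center([X[i] for i in idx], [y[i] for i in idx])
        beta = elastic_net(Xb, yb, lam=lam, alpha=alpha)
        # Ties (e.g. several zero coefficients) are broken by feature index.
        order = sorted(range(p), key=lambda j: -abs(beta[j]))
        for rank, j in enumerate(order):
            borda[j] += p - rank  # best rank earns the most points
    ranked = sorted(range(p), key=lambda j: -borda[j])
    return ranked[:top_k]


def make_demo_data(n=120, p=6, seed=1):
    """Synthetic stand-in for query data: only columns 0 and 1 drive y."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] + 2.0 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_demo_data()
    print(bagged_rank_aggregation(X, y))
```

On this synthetic data the two informative columns (0 and 1) should come out on top. With real query series, the features would typically be standardized first and the penalty parameters chosen by cross-validation, neither of which is shown here.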