Forecasting influenza epidemics by integrating internet search queries and traditional surveillance data with the support vector machine regression model in Liaoning, from 2011 to 2015.

Author information

Liang Feng, Guan Peng, Wu Wei, Huang Desheng

Affiliations

Department of Epidemiology, School of Public Health, China Medical University, Shenyang, Liaoning, China.

Department of Mathematics, School of Fundamental Sciences, China Medical University, Shenyang, Liaoning, China.

Publication information

PeerJ. 2018 Jun 25;6:e5134. doi: 10.7717/peerj.5134. eCollection 2018.

Abstract

BACKGROUND

Influenza epidemics pose significant social and economic challenges in China. Internet search query data have been identified as a valuable source for the detection of emerging influenza epidemics. However, the selection of the search queries and the adoption of prediction methods are crucial challenges when it comes to improving predictions. The purpose of this study was to explore the application of the Support Vector Machine (SVM) regression model in merging search engine query data and traditional influenza data.

METHODS

Monthly officially reported influenza case counts for Liaoning province, China, from January 2011 to December 2015 were acquired from the China National Scientific Data Center for Public Health. Search queries potentially related to influenza over the corresponding period were identified from Baidu Index, a publicly available search engine database. An SVM regression model was built for prediction, and its three parameters (C, γ, ε) were selected by leave-one-out cross-validation (LOOCV) during model construction. Model performance was evaluated using Root Mean Square Error (RMSE), Root Mean Square Percentage Error (RMSPE) and Mean Absolute Percentage Error (MAPE).
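The three evaluation metrics can be sketched as follows. This is a minimal illustration using the standard textbook definitions, which the abstract does not spell out; the function names and the sample case counts are made up for the example and are not taken from the study.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def rmspe(actual, predicted):
    """Root Mean Square Percentage Error: RMSE of the relative errors.

    Assumes every actual value is non-zero.
    """
    n = len(actual)
    return math.sqrt(sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean Absolute Percentage Error: mean of the absolute relative errors.

    Assumes every actual value is non-zero.
    """
    n = len(actual)
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical monthly case counts and predictions (illustrative only).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

print(f"RMSE  = {rmse(actual, predicted):.4f}")
print(f"RMSPE = {rmspe(actual, predicted):.4f}")
print(f"MAPE  = {mape(actual, predicted):.4f}")
```

RMSE is expressed in the units of the case counts, while RMSPE and MAPE are relative measures and are therefore easier to compare across months with very different case numbers.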

RESULTS

In total, 17 search queries related to influenza were generated through the initial query selection approach and were adopted to construct the SVM regression model: nine queries in the same month, three at a lag of one month, one at a lag of two months and four at a lag of three months. Based on the ensemble data integrating the influenza surveillance data and Baidu search query data, the SVM model performed well with the parameters (C = 2, γ = 0.005, ε = 0.0001).
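One way to read the lag structure is that a query "at a lag of k months" contributes its search volume from month t − k as a predictor of cases in month t. The sketch below builds such a feature matrix under that assumption; the helper `lagged_features`, the query names, and the numbers are hypothetical and are not the study's actual queries or data.

```python
def lagged_features(series_by_query, lags_by_query):
    """Build one feature row per target month from lagged query series.

    series_by_query: dict mapping query name -> list of monthly search volumes
    lags_by_query:   dict mapping query name -> list of lags (in months) to use
    Returns (rows, first_month), where rows[i] holds the features used to
    predict month first_month + i. The earliest months are dropped because
    their lagged values fall before the start of the series.
    """
    max_lag = max(lag for lags in lags_by_query.values() for lag in lags)
    n_months = len(next(iter(series_by_query.values())))
    rows = []
    for t in range(max_lag, n_months):
        row = []
        for query in sorted(lags_by_query):  # fixed column order
            for lag in lags_by_query[query]:
                row.append(series_by_query[query][t - lag])
        rows.append(row)
    return rows, max_lag


# Hypothetical search volumes for two queries over five months.
series = {
    "query_a": [10, 20, 30, 40, 50],
    "query_b": [1, 2, 3, 4, 5],
}
# query_a is used in the same month; query_b at lags of one and three months.
lags = {"query_a": [0], "query_b": [1, 3]}

rows, first_month = lagged_features(series, lags)
print(first_month)  # index of the first month that has all lagged values
print(rows)
```

With a maximum lag of three months, the first three months of a series cannot be used as prediction targets, which slightly shortens the usable part of a 60-month dataset.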

CONCLUSIONS

The results demonstrated the feasibility of using internet search engine query data as a complementary data source for influenza surveillance and the efficiency of the SVM regression model in tracking influenza epidemics in Liaoning.
