Early Warning and Prediction of Scarlet Fever in China Using the Baidu Search Index and Autoregressive Integrated Moving Average With Explanatory Variable (ARIMAX) Model: Time Series Analysis.

Affiliations

School of Public Health, Guangxi Medical University, Nanning, China.

Guangxi Key Laboratory of AIDS Prevention and Treatment, Guangxi Medical University, Nanning, China.

Publication information

J Med Internet Res. 2023 Oct 30;25:e49400. doi: 10.2196/49400.

DOI:10.2196/49400
PMID:37902815
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10644180/
Abstract

BACKGROUND

Internet-derived data and the autoregressive integrated moving average (ARIMA) and ARIMA with explanatory variable (ARIMAX) models are extensively used for infectious disease surveillance. However, the effectiveness of the Baidu search index (BSI) in predicting the incidence of scarlet fever remains uncertain.

OBJECTIVE

Our objective was to investigate whether a low-cost BSI monitoring system could potentially function as a valuable complement to traditional scarlet fever surveillance in China.

METHODS

ARIMA and ARIMAX models were developed to predict the incidence of scarlet fever in China using data from the National Health Commission of the People's Republic of China between January 2011 and August 2022. The procedures included establishing a keyword database, keyword selection and filtering through Spearman rank correlation and cross-correlation analyses, construction of the scarlet fever comprehensive search index (CSI), modeling with the training sets, predicting with the testing sets, and comparing the prediction performances.
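The keyword screening and index construction steps can be illustrated with a minimal sketch. It assumes monthly series of equal length and uses only the Python standard library. The keyword names, the toy numbers, the 0.6 threshold, the maximum lag of 2 months, and the correlation-proportional weighting are all illustrative assumptions; the abstract does not state how the authors weighted keywords or which cutoffs they used.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation (undefined if either series is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def lagged_spearman(search, cases, lag):
    """Correlate search volume in month t with cases in month t + lag."""
    if lag == 0:
        return spearman(search, cases)
    return spearman(search[:-lag], cases[lag:])

def select_keywords(keyword_series, cases, threshold=0.6, max_lag=2):
    """Keep keywords whose best lagged correlation reaches the threshold."""
    selected = {}
    for name, series in keyword_series.items():
        best_lag, best_r = max(
            ((lag, lagged_spearman(series, cases, lag))
             for lag in range(max_lag + 1)),
            key=lambda pair: pair[1],
        )
        if best_r >= threshold:
            selected[name] = (best_lag, best_r)
    return selected

def composite_index(keyword_series, selected):
    """Weighted sum of selected keywords (assumed weights: r / sum of r)."""
    total = sum(r for _, r in selected.values())
    n = len(next(iter(keyword_series.values())))
    return [
        sum(keyword_series[name][t] * r / total
            for name, (_, r) in selected.items())
        for t in range(n)
    ]

# Hypothetical toy data, not taken from the study.
cases = [10, 20, 35, 30, 50, 65, 60, 80]
keyword_series = {
    "kw_a": [1, 2, 4, 3, 5, 7, 6, 8],   # tracks the case counts closely
    "kw_b": [5, 5, 1, 9, 2, 8, 3, 4],   # largely unrelated to the cases
}
selected = select_keywords(keyword_series, cases)
csi = composite_index(keyword_series, selected)
print(selected)
print(csi)
```

For simplicity, the composite index here combines the selected series without shifting them by their best lag; a fuller treatment would align each keyword to its lag before summing.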

RESULTS

The average monthly incidence of scarlet fever was 4462.17 (SD 3011.75) cases, and annual incidence exhibited an upward trend until 2019. The keyword database contained 52 keywords, but only 6 highly relevant ones were selected for modeling. A high Spearman rank correlation was observed between the scarlet fever reported cases and the scarlet fever CSI (r=0.881). We developed an ARIMA(4,0,0)(0,1,2) model, along with 2 models that incorporated the BSI: ARIMA(4,0,0)(0,1,2) + CSI (Lag=0) and ARIMAX(1,0,2)(2,0,0). The 3 models had a good fit and passed the residuals Ljung-Box test. The ARIMA(4,0,0)(0,1,2), ARIMA(4,0,0)(0,1,2) + CSI (Lag=0), and ARIMAX(1,0,2)(2,0,0) models demonstrated favorable predictive capabilities, with mean absolute errors of 1692.16 (95% CI 584.88-2799.44), 1067.89 (95% CI 402.02-1733.76), and 639.75 (95% CI 188.12-1091.38), respectively; root mean squared errors of 2036.92 (95% CI 929.64-3144.20), 1224.92 (95% CI 559.04-1890.79), and 830.80 (95% CI 379.17-1282.43), respectively; and mean absolute percentage errors of 4.33% (95% CI 0.54%-8.13%), 3.36% (95% CI -0.24% to 6.96%), and 2.16% (95% CI -0.69% to 5.00%), respectively. The 2 models that incorporated the BSI outperformed the ARIMA model, yielding smaller values on all 3 error measures.
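The three error measures used to compare the models follow standard definitions, sketched below for reference. The function names and the toy values are hypothetical and are not drawn from the study's data.

```python
from math import sqrt

def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared difference."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical monthly case counts and forecasts.
actual = [100, 200, 400]
predicted = [110, 180, 400]

print(mae(actual, predicted))
print(rmse(actual, predicted))
print(mape(actual, predicted))
```

Smaller values indicate better forecasts on all three measures. RMSE penalizes large individual errors more heavily than MAE, while MAPE expresses error relative to the observed counts.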

CONCLUSIONS

This study demonstrated that the BSI can be used for the early warning and prediction of scarlet fever, serving as a valuable supplement to traditional surveillance systems.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5456/10644180/4b12c435590c/jmir_v25i1e49400_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5456/10644180/a86e5355fd18/jmir_v25i1e49400_fig3.jpg


