Guangxi Key Laboratory of AIDS Prevention and Treatment & Guangxi Universities Key Laboratory of Prevention and Control of Highly Prevalent Disease, School of Public Health, Guangxi Medical University, Nanning, Guangxi, China.
State Key Laboratory for Infectious Disease Prevention and Control, National Center for AIDS/STD Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China.
Sci Rep. 2019 Jan 23;9(1):320. doi: 10.1038/s41598-018-35685-w.
China's reported cases of Human Immunodeficiency Virus (HIV) infection and AIDS increased from over 50,000 in 2011 to more than 130,000 in 2017, while AIDS-related search indices on Baidu rose from 2.1 million to 3.7 million over the same period. In China, people seek AIDS-related knowledge from Baidu, one of the world's largest search engines. We studied the relationship between national HIV surveillance data and the Baidu index (BDI) and used it to monitor the AIDS epidemic and inform targeted interventions. After screening keywords and composing the index, we used seasonal autoregressive integrated moving average (ARIMA) modeling. The most strongly correlated search engine query data were identified using an ARIMA model with external variables (ARIMAX) for epidemic prediction. A significant correlation between monthly reported HIV/AIDS cases and the Baidu Composite Index (r = 0.845, P < 0.001) was observed in the time series plot. Compared with the ARIMA model based on AIDS surveillance data, the ARIMAX model with the Baidu Composite Index had the lowest Akaike information criterion (AIC, 839.42) and the most accurate prediction (mean absolute percentage error, MAPE, of 6.11%). We showed that BDI and reported HIV/AIDS cases are closely correlated and follow the same trends during both increasing and decreasing phases of the AIDS epidemic. Therefore, Baidu search query data may be a useful indicator for reliably monitoring and predicting the HIV/AIDS epidemic in China.
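The three quantities reported in the abstract (Pearson's r, AIC, and MAPE) have standard textbook definitions, sketched below in plain Python. This is an illustrative sketch only, not the authors' code: the function names and the small series in the usage lines are hypothetical, and the numbers are made up rather than taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower is better."""
    return 2 * n_params - 2 * log_likelihood


# Hypothetical monthly values, invented for illustration only.
cases = [5200, 6100, 5800, 7000, 7400, 6900]
search_index = [210, 250, 240, 300, 310, 280]
forecast = [5000, 6300, 6000, 6800, 7600, 7100]

print("r    =", round(pearson_r(cases, search_index), 3))
print("MAPE =", round(mape(cases, forecast), 2), "%")
```

When comparing candidate models, the one with the lower AIC and lower MAPE is preferred, which is the criterion the abstract applies to the ARIMA and ARIMAX models.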