Mavragani Amaryllis, Sampri Alexia, Sypsa Karla, Tsagarakis Konstantinos P
Department of Computing Science and Mathematics, Faculty of Natural Sciences, University of Stirling, Stirling, United Kingdom.
Department of Pharmacy and Forensic Science, King's College London, University of London, London, United Kingdom.
JMIR Public Health Surveill. 2018 Mar 12;4(1):e24. doi: 10.2196/publichealth.8726.
With internet penetration and use constantly expanding, the vast amount of online information can be employed to better assess issues in the US health care system. Google Trends, a popular tool in big data analytics, has been widely used to examine interest in various medical and health-related topics and has shown great potential for forecasting, prediction, and nowcasting. As empirical relationships between online queries and human behavior have been shown to exist, there is a new opportunity to explore online behavior toward asthma, a common respiratory disease.
This study aimed to forecast online behavior toward asthma and to examine the correlations between queries and reported cases, in order to explore the possibility of nowcasting asthma prevalence in the United States using online search traffic data.
Holt-Winters exponential smoothing was applied to Google Trends time series from 2004 to 2015 for the term "asthma" to estimate forecasts of online queries at state and national levels from 2016 to 2020; the forecasts were validated against available Google query data from January 2016 to June 2017. Correlations among yearly Google queries and between Google queries and reported asthma cases were examined.
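As an illustration of the forecasting step, the following is a minimal sketch of additive Holt-Winters exponential smoothing with a holdout check. It is not the authors' implementation: the additive form, the simple initialization, the smoothing parameters, the use of mean absolute error as the validation measure, and the function names `holt_winters_additive` and `mean_absolute_error` are all assumptions made here, and the input series is synthetic rather than Google Trends data.

```python
from typing import List, Sequence


def holt_winters_additive(series: Sequence[float], season_len: int,
                          alpha: float, beta: float, gamma: float,
                          horizon: int) -> List[float]:
    """Additive Holt-Winters smoothing; returns `horizon` forecasts past the series.

    Assumed (hypothetical) setup: simple initial values taken from the
    first two seasons, and fixed smoothing parameters chosen by the caller.
    """
    m = season_len
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Initial level: mean of the first season.
    level = sum(series[:m]) / m
    # Initial trend: average per-step change between the first two seasons.
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    # Initial seasonal offsets: deviation of the first season from its mean.
    seasonal = [series[i] - level for i in range(m)]

    for t, y in enumerate(series):
        s = seasonal[t % m]
        prev_level = level
        level = alpha * (y - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % m] = gamma * (y - level) + (1 - gamma) * s

    n = len(series)
    # The h-step-ahead forecast reuses the latest offset for that seasonal position.
    return [level + h * trend + seasonal[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


def mean_absolute_error(actual: Sequence[float],
                        predicted: Sequence[float]) -> float:
    """Average absolute gap between held-out values and forecasts."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


if __name__ == "__main__":
    # Synthetic monthly series (NOT Google Trends data): a fixed seasonal
    # pattern with a slow downward drift, 12 years of history + 18 months holdout.
    pattern = [60, 58, 62, 70, 75, 55, 45, 44, 65, 72, 68, 61]
    full = [v - 0.1 * t for t, v in enumerate(pattern * 14)]
    history, holdout = full[:144], full[144:162]

    # Smoothing parameters are arbitrary illustrative values.
    forecasts = holt_winters_additive(history, 12, 0.3, 0.1, 0.2, horizon=18)
    print("holdout MAE:", mean_absolute_error(holdout, forecasts))
```

In practice the smoothing parameters would be fitted to the data rather than fixed by hand, and a multiplicative seasonal form may suit some series better; the abstract does not state which variant was used.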
Our analysis shows that search queries exhibit seasonality within each year and that the relationships between the queries of every pair of years are statistically significant (P<.05). The estimated forecasting models for Google queries over a 5-year period (2016 through 2020) are robust and were validated against available data from January 2016 to June 2017. Significant correlations were found between online queries and (1) National Health Interview Survey lifetime asthma (r=-.82, P=.001) and current asthma (r=-.77, P=.004) rates from 2004 to 2015 and (2) Behavioral Risk Factor Surveillance System lifetime asthma (r=-.78, P=.003) and current asthma (r=-.79, P=.002) rates from 2004 to 2014. The correlations are negative, but lag analysis to identify the period of response cannot be employed until short-interval data on asthma prevalence are made available.
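The correlation step can be sketched as follows: monthly query values are averaged to one value per year and compared with yearly prevalence rates using Pearson's r. This is a sketch under stated assumptions, not the study's code: the helper names `yearly_means` and `pearson_r`, the choice of the yearly mean as the aggregate, and the toy numbers are all hypothetical and are not the study's data.

```python
import math
from typing import List, Sequence


def yearly_means(monthly: Sequence[float], months_per_year: int = 12) -> List[float]:
    """Collapse a monthly series into one mean value per complete year."""
    if len(monthly) % months_per_year != 0:
        raise ValueError("series must cover whole years")
    return [sum(monthly[i:i + months_per_year]) / months_per_year
            for i in range(0, len(monthly), months_per_year)]


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy values for illustration only (NOT the study's data): three years
    # of monthly query interest and three yearly prevalence rates.
    monthly_queries = [80] * 12 + [70] * 12 + [65] * 12
    prevalence = [11.0, 12.5, 13.0]
    print("r =", pearson_r(yearly_means(monthly_queries), prevalence))
```

A P value for r additionally requires a significance test; `scipy.stats.pearsonr` returns both the coefficient and a P value if SciPy is available.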
Online behavior toward asthma can be accurately predicted, and significant correlations between online queries and reported cases exist. This method of forecasting Google queries can be used by health care officials to nowcast asthma prevalence at the city, state, or national level, subject to the future availability of daily, weekly, or monthly data on reported cases. It could therefore be used to improve monitoring and assessment of the needs of the current population of patients with asthma.