Xu Chenjie, Yang Hongxi, Sun Li, Cao Xinxi, Hou Yabing, Cai Qiliang, Jia Peng, Wang Yaogang
School of Public Health, Tianjin Medical University, Tianjin, China.
School of Public Health, Yale University, New Haven, CT, United States.
J Med Internet Res. 2020 Mar 12;22(3):e16184. doi: 10.2196/16184.
Internet search data on health-related terms can reflect people's concerns about their health status in near real time, and hence serve as a supplementary metric of disease characteristics. However, studies using internet search data to monitor and predict chronic diseases at a geographically finer state-level scale are sparse.
The aim of this study was to explore the associations of internet search volumes for lung cancer with published cancer incidence and mortality data in the United States.
We used Google relative search volumes, which represent the search frequency of specific search terms in Google. We performed cross-sectional analyses of the original relative search volumes and disease metrics at both the national and state levels. To remove the effects of irregular changes in search frequency and obtain the long-term trends of search volumes for lung cancer, we created a smoothed time series of relative search volumes at both levels. We also analyzed the decomposed Google relative search volume data against the disease metrics at the national and state levels.
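The smoothing step can be illustrated with a minimal sketch. The abstract does not name the smoothing or decomposition method, so the centered moving average below is an assumption chosen for illustration, not the authors' procedure. The function name `centered_moving_average`, the 13-month window, and the monthly values are all hypothetical, and the values are synthetic.

```python
from statistics import mean


def centered_moving_average(values, window=13):
    """Smooth a series with a centered moving average.

    The window is shortened near both ends of the series, so the
    output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(mean(values[start:stop]))
    return smoothed


# Synthetic monthly relative search volumes (NOT real Google data):
# a slow upward trend plus a bump in one month of each year.
monthly_rsv = [50 + 0.5 * m + (10 if m % 12 == 10 else 0) for m in range(36)]

# Assumed 13-month window, so each interior point averages about one year.
trend = centered_moving_average(monthly_rsv, window=13)
```

In this sketch, averaging over roughly one year damps the yearly bump, which leaves a series closer to the long-term trend that the methods describe.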
The monthly trends of lung cancer-related internet hits were consistent with the trends of reported lung cancer rates at the national level. Ohio had the highest search frequency for lung cancer-related terms. At the state level, the relative search volume was significantly correlated with lung cancer incidence rates in 42 states, with correlation coefficients ranging from 0.58 in Virginia to 0.94 in Oregon. Relative search volume was also significantly correlated with mortality in 47 states, with correlation coefficients ranging from 0.58 in Oklahoma to 0.94 in North Carolina. Both the incidence and mortality rates of lung cancer were correlated with decomposed relative search volumes in all states except Vermont.
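The state-level coefficients above come from pairing a search-volume series with a disease-rate series. The abstract does not say which correlation coefficient was used, so the sketch below assumes a Pearson coefficient. The helper `pearson_r` and the two short series are hypothetical, synthetic values for an unnamed state, not study data.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / (spread_x * spread_y)


# Synthetic yearly values for one hypothetical state (NOT study data).
search_volume = [62.0, 60.5, 58.0, 57.5, 55.0, 53.5]
incidence_rate = [70.1, 69.0, 67.2, 66.8, 64.9, 63.0]

r = pearson_r(search_volume, incidence_rate)
```

Repeating this calculation for each state, once against incidence and once against mortality, would give one coefficient per state and outcome. Testing each coefficient for significance is a separate step not shown here.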
Internet search behaviors could reflect public awareness of lung cancer. Research on internet search behaviors could be a novel and timely approach to monitoring and estimating the prevalence, incidence, and mortality rates of a broader range of cancers and of other health issues.