Sadilek Adam, Hswen Yulin, Bavadekar Shailesh, Shekel Tomer, Brownstein John S, Gabrilovich Evgeniy
1 Google, Mountain View, CA, USA.
2 Department of Social and Behavioral Sciences, Harvard T.H. Chan School of Public Health, Boston, MA, USA.
NPJ Digit Med. 2020 Feb 4;3:16. doi: 10.1038/s41746-020-0222-x. eCollection 2020.
Lyme disease is the most common tick-borne disease in the Northern Hemisphere. Existing estimates of Lyme disease spread are delayed by a year or more. We introduce Lymelight, a new method for monitoring the incidence of Lyme disease in real time. We use a machine-learned classifier of web search sessions to estimate the number of individuals who search for possible Lyme disease symptoms in a given geographical area over two years, 2014 and 2015. We evaluate Lymelight against the official case count data from the CDC and find a 92% correlation (p < 0.001) at the county level. Importantly, web search data allow us not only to assess the incidence of the disease, but also to examine the appropriateness of the treatments that users subsequently search for. The public health implications of our work include monitoring the spread of vector-borne diseases in a timely and scalable manner and complementing existing approaches with real-time detection, which can enable more timely interventions. Our analysis of treatment searches may also help reduce misdiagnosis of the disease.
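As a rough illustration of the county-level evaluation described above, the sketch below compares search-based estimates with official case counts using a correlation coefficient. It assumes a Pearson coefficient, which the abstract does not specify, and the county names and counts are hypothetical placeholders, not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT data from the paper:
# search-based estimates and official case counts keyed by county.
estimated = {"county_a": 12, "county_b": 40, "county_c": 7, "county_d": 95}
official = {"county_a": 10, "county_b": 35, "county_c": 9, "county_d": 88}

# Compare only counties present in both sources, in a fixed order.
counties = sorted(estimated.keys() & official.keys())
r = pearson_r(
    [estimated[c] for c in counties],
    [official[c] for c in counties],
)
print(f"county-level correlation: {r:.3f}")
```

A significance level such as the reported p < 0.001 would need an additional test, which this sketch omits.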