Estimating influenza incidence using search query deceptiveness and generalized ridge regression.

Author information

Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America.

University of Colorado Boulder, Boulder, Colorado, United States of America.

Publication information

PLoS Comput Biol. 2019 Oct 1;15(10):e1007165. doi: 10.1371/journal.pcbi.1007165. eCollection 2019 Oct.

DOI:10.1371/journal.pcbi.1007165

PMID:31574086

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC6771994/

Abstract

Seasonal influenza is a sometimes surprisingly impactful disease, causing thousands of deaths per year along with much additional morbidity. Timely knowledge of the outbreak state is valuable for managing an effective response. The current state of the art is to gather this knowledge using in-person patient contact. While accurate, this is time-consuming and expensive. This has motivated inquiry into new approaches using internet activity traces, based on the theory that lay observations of health status lead to informative features in internet data. These approaches risk being deceived by activity traces having a coincidental, rather than informative, relationship to disease incidence; to our knowledge, this risk has not yet been quantitatively explored. We evaluated both simulated and real activity traces of varying deceptiveness for influenza incidence estimation using linear regression. We found that deceptiveness knowledge does reduce error in such estimates, that it may help automatically-selected features perform as well or better than features that require human curation, and that a semantic distance measure derived from the Wikipedia article category tree serves as a useful proxy for deceptiveness. This suggests that disease incidence estimation models should incorporate not only data about how internet features map to incidence but also additional data to estimate feature deceptiveness. By doing so, we may gain one more step along the path to accurate, reliable disease incidence estimation using internet data. This capability would improve public health by decreasing the cost and increasing the timeliness of such estimates.
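The title names generalized ridge regression, and the abstract describes linear regression informed by deceptiveness knowledge. The following is a minimal sketch of that general idea, not the authors' implementation: each feature gets its own ridge penalty, and the penalty grows with the feature's deceptiveness score. The simulated data, the function name `generalized_ridge`, the deceptiveness scores, the scale `lam`, and the rule "penalty = lam × deceptiveness" are all assumptions made for illustration. The toy model also omits an intercept.

```python
import numpy as np


def generalized_ridge(X, y, penalties):
    """Closed-form solution of min_b ||y - X b||^2 + sum_j penalties[j] * b[j]^2.

    With all penalties equal this is ordinary ridge regression;
    with all penalties zero it reduces to ordinary least squares.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    D = np.diag(np.asarray(penalties, dtype=float))
    return np.linalg.solve(X.T @ X + D, X.T @ y)


# Toy data (hypothetical): a seasonal "incidence" curve and two activity traces.
rng = np.random.default_rng(0)
weeks = np.arange(200)
y = 1.0 + np.sin(2 * np.pi * weeks / 52)          # simulated incidence
informative = y + 0.1 * rng.normal(size=weeks.size)  # tracks incidence
deceptive = rng.normal(size=weeks.size)              # unrelated to incidence
X = np.column_stack([informative, deceptive])

# Assumed deceptiveness scores in [0, 1], one per feature (illustrative only).
deceptiveness = np.array([0.0, 1.0])
lam = 1000.0  # assumed overall penalty scale

beta_ols = generalized_ridge(X, y, np.zeros(2))
beta_gen = generalized_ridge(X, y, lam * deceptiveness)

print("no penalty:          ", beta_ols)
print("deceptiveness-scaled:", beta_gen)
```

In this sketch the coefficient on the unrelated trace is shrunk toward zero while the informative trace is left unpenalized, which is the intuition behind using deceptiveness to weight features.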
