Department of Computer Science, George Washington University, Washington, DC, USA.
Disaster Med Public Health Prep. 2024 May 10;18:e103. doi: 10.1017/dmp.2024.109.
Not all scientific publications are equally useful to policy-makers tasked with mitigating the spread and impact of diseases, especially at the start of novel epidemics and pandemics. Actionable, evidence-based information is urgently needed at such times, but the preprints and peer-reviewed articles published then are often at odds with that need. For example, a lack of novel results and a focus on opinions rather than evidence were common in coronavirus disease (COVID-19) publications at the start of the pandemic in 2019. In this work, we seek to automatically judge the utility of these scientific articles, from a public health policy-making perspective, using only their titles.
Deep learning natural language processing (NLP) models were trained on the titles of scientific COVID-19 publications from the CORD-19 dataset and evaluated against expert-curated COVID-19 evidence to measure how feasible it is, in practice, to screen such publications automatically.
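The title-only screening idea can be illustrated with a minimal sketch. It is not the authors' method: a bag-of-words logistic regression stands in for the deep learning models, and the titles, labels, and function names (`tokenize`, `train`, `score`, `triage`) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import defaultdict

# Hypothetical toy data: 1 = likely useful for policy-making, 0 = not.
TITLES = [
    "Randomized trial of antiviral treatment in hospitalized patients",
    "Cohort study of transmission in households",
    "Estimated transmission rate from contact tracing data",
    "Opinion: reflections on the pandemic",
    "Letter to the editor on the pandemic response",
    "Commentary: a call to action",
]
LABELS = [1, 1, 1, 0, 0, 0]


def tokenize(title):
    """Lowercase a title and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def sigmoid(z):
    """Logistic function, with z clamped to avoid overflow in exp."""
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train(titles, labels, epochs=200, lr=0.1):
    """Fit one weight per token plus a bias by stochastic gradient descent."""
    weights = defaultdict(float)
    bias = 0.0
    for _ in range(epochs):
        for title, y in zip(titles, labels):
            tokens = tokenize(title)
            p = sigmoid(bias + sum(weights[t] for t in tokens))
            grad = p - y  # gradient of the log loss with respect to z
            bias -= lr * grad
            for t in tokens:
                weights[t] -= lr * grad
    return dict(weights), bias


def score(title, weights, bias):
    """Return the estimated probability that a title is useful."""
    return sigmoid(bias + sum(weights.get(t, 0.0) for t in tokenize(title)))


def triage(titles, weights, bias):
    """Rank titles from most to least likely useful as (score, title) pairs."""
    return sorted(((score(t, weights, bias), t) for t in titles), reverse=True)


weights, bias = train(TITLES, LABELS)
for s, title in triage(TITLES, weights, bias):
    print(f"{s:.2f}  {title}")
```

In a screening workflow, an expert would review titles from the top of the ranked list first. A real system would need far more labeled titles and evaluation against held-out, expert-curated evidence, as the abstract describes.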
This work demonstrates that deep learning NLP models can judge the utility of COVID-19 scientific articles, from a public health policy-making perspective, based on their titles alone.
NLP models can be successfully trained on scientific articles and used by public health experts to triage and filter the hundreds of new publications that appear daily on novel diseases such as COVID-19 at the start of pandemics.