Understanding the temporal evolution of COVID-19 research through machine learning and natural language processing.

Authors

Ebadi Ashkan, Xi Pengcheng, Tremblay Stéphane, Spencer Bruce, Pall Raman, Wong Alexander

Affiliations

National Research Council Canada, Montréal, QC H3T 1J4 Canada.

Concordia Institute for Information Systems Engineering, Concordia University, Montréal, QC H3G 2W1 Canada.

Publication information

Scientometrics. 2021;126(1):725-739. doi: 10.1007/s11192-020-03744-7. Epub 2020 Nov 19.

Abstract

The outbreak of the novel coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has been continuously affecting human lives and communities around the world in many ways, from cities under lockdown to new social experiences. Although in most cases COVID-19 results in mild illness, it has drawn global attention due to the extremely contagious nature of SARS-CoV-2. Governments and healthcare professionals, along with people and society as a whole, have taken many measures to break the chain of transmission and flatten the epidemic curve. In this study, we used multiple data sources, i.e., PubMed and ArXiv, and built several machine learning models to characterize the landscape of current COVID-19 research by identifying the latent topics and analyzing the temporal evolution of the extracted research themes, publication similarity, and sentiments within the time frame of January-May 2020. Our findings confirm that the types of research available in PubMed and ArXiv differ significantly, with the former exhibiting greater diversity in terms of COVID-19-related issues and the latter focusing more on intelligent systems/tools to predict/diagnose COVID-19. The special attention of the research community to high-risk groups and people with complications was also confirmed.
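
The abstract does not say how publication similarity was measured or tracked over time. As a rough illustration only, the sketch below assumes a bag-of-words cosine similarity averaged over all pairs of abstracts within each month. The function names, the crude tokenizer, and the toy abstracts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: a deliberately crude tokenizer that lowercases the text
    # and treats every non-alphabetic character as a separator.
    cleaned = "".join(c.lower() if c.isalpha() else " " for c in text)
    return cleaned.split()


def cosine_similarity(doc_a, doc_b):
    # Cosine similarity between the term-frequency vectors of two documents.
    counts_a, counts_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    shared_terms = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared_terms)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(docs):
    # Average similarity over all unordered pairs; 0.0 if fewer than two docs.
    pairs = [(i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))]
    if not pairs:
        return 0.0
    total = sum(cosine_similarity(docs[i], docs[j]) for i, j in pairs)
    return total / len(pairs)


# Hypothetical toy data: a few made-up abstract snippets grouped by month.
abstracts_by_month = {
    "2020-01": [
        "novel coronavirus outbreak transmission",
        "clinical features of patients with pneumonia",
    ],
    "2020-04": [
        "deep learning model to diagnose covid from chest images",
        "machine learning model to predict covid severity",
    ],
}

# Mean within-month similarity, in chronological order.
monthly_similarity = {
    month: mean_pairwise_similarity(docs)
    for month, docs in sorted(abstracts_by_month.items())
}

for month, score in monthly_similarity.items():
    print(month, round(score, 3))
```

A rising monthly score under this measure would suggest that abstracts are converging on shared vocabulary, although a real analysis would likely use stop-word removal and a weighted or embedding-based representation rather than raw term counts.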

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fe4a/7676411/0678c1ca7af6/11192_2020_3744_Fig1_HTML.jpg
