School of Economics & Management, Beijing Forestry University, Beijing, P.R. China.
College of Economics and Management, Beijing University of Technology, Beijing, P.R. China.
PLoS One. 2022 Sep 16;17(9):e0273725. doi: 10.1371/journal.pone.0273725. eCollection 2022.
To build a full picture of previous studies on the origins of SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), this paper uses an active learning-based approach to screen scholarly articles about the origins of SARS-CoV-2 from a large body of scientific publications. Specifically, six seed articles were used to manually curate 170 relevant and 300 nonrelevant articles. An active learning-based approach with three query strategies and three base classifiers was then trained to screen articles about the origins of SARS-CoV-2. Extensive experimental results show that the active learning-based approach outperforms its traditional counterparts, and that uncertainty sampling performs best among the three query strategies. By manually checking the top 1,000 articles from each base classifier, we ultimately screened 715 unique scholarly articles to create a publicly available peer-reviewed literature corpus, COVID-Origin. This indicates that our approach for screening articles about the origins of SARS-CoV-2 is feasible.
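The screening loop summarized in the abstract can be illustrated with a minimal pool-based active learning sketch. It is not the authors' implementation: the abstract does not name the three base classifiers or the feature representation, so this example assumes a small logistic regression over binary bag-of-words features, and it shows only the uncertainty sampling strategy (querying the article whose predicted relevance is closest to 0.5). The article titles, the `oracle` function standing in for a human annotator, and all function names are hypothetical.

```python
import math

def featurize(text, vocab):
    """Binary bag-of-words vector over a fixed vocabulary."""
    words = set(text.lower().split())
    return [1.0 if w in words else 0.0 for w in vocab]

def predict_proba(x, w, b):
    """Predicted probability that an article is relevant."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with plain batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, w, b) - yi
            grad_w = [g + err * xij for g, xij in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def uncertainty_query(pool_X, w, b):
    """Index of the pool item whose probability is closest to 0.5."""
    return min(range(len(pool_X)),
               key=lambda i: abs(predict_proba(pool_X[i], w, b) - 0.5))

def active_learning(seed, pool, oracle, rounds):
    """Repeatedly train, query the most uncertain article, and label it."""
    texts = [t for t, _ in seed] + list(pool)
    vocab = sorted({w for t in texts for w in t.lower().split()})
    labeled, pool = list(seed), list(pool)
    for _ in range(rounds):
        if not pool:
            break
        X = [featurize(t, vocab) for t, _ in labeled]
        y = [label for _, label in labeled]
        w, b = train_logreg(X, y)
        idx = uncertainty_query([featurize(t, vocab) for t in pool], w, b)
        text = pool.pop(idx)
        labeled.append((text, oracle(text)))  # human labeling step
    return labeled, pool

# Hypothetical toy data: 1 = about SARS-CoV-2 origins, 0 = not.
SEED = [
    ("zoonotic origin of sars-cov-2 in bats", 1),
    ("vaccine efficacy against covid-19 infection", 0),
]
POOL_LABELS = {
    "pangolin coronavirus and the origin of sars-cov-2": 1,
    "mask mandates and covid-19 transmission": 0,
    "bats as a natural reservoir for coronavirus origin": 1,
    "vaccine side effects in adults": 0,
}
POOL = list(POOL_LABELS)

def oracle(text):
    """Stand-in for the manual relevance check by a human annotator."""
    return POOL_LABELS[text]

labeled, remaining = active_learning(SEED, POOL, oracle, rounds=3)
print(len(labeled), "labeled,", len(remaining), "left in the pool")
```

In a realistic setting the pool would hold many thousands of candidate articles, and the final classifier would rank the unlabeled pool so that only the top-scoring articles need manual checking.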