Department of Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, CA 94720, USA
Pac Symp Biocomput. 2021;26:107-118.
How has the focus of research papers on a given disease changed over time? Identifying the papers at the cusps of change can help highlight the emergence of a new topic or a change in the direction of research. We present a generally applicable unsupervised approach to this question based on semantic changepoints within a given collection of research papers. We illustrate the approach with a range of examples drawn from a nascent corpus of literature on COVID-19 and from subsets of PubMed papers on the diseases in the World Health Organization list of neglected tropical diseases. The software is freely available at: https://github.com/pdddinakar/SemanticChangepointDetection.