Yang Yunrong, Cao Zhidong, Zhao Pengfei, Zeng Dajun Daniel, Zhang Qingpeng, Luo Yin
School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100190, China.
The State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China.
J Saf Sci Resil. 2021 Sep;2(3):146-156. doi: 10.1016/j.jnlssr.2021.08.002. Epub 2021 Aug 13.
The need to mitigate the COVID-19 epidemic has prompted policymakers to make public health decisions under scientific guidance. The large volume of unstructured COVID-19 publications makes it challenging for policymakers to obtain relevant evidence. Knowledge graphs (KGs) can formalize unstructured knowledge into a structured form and have recently been used to support decision-making. Here, we introduce a novel framework that extracts the COVID-19 public health evidence knowledge graph (CPHE-KG) from modelling study papers. We select a corpus of 3096 COVID-19 modelling study papers through a literature assessment process. We define a novel annotation schema to construct a COVID-19 modelling study-related information extraction (IE) dataset (CPHIE). Based on this dataset, we also propose a novel multi-task document-level information extraction model, SS-DYGIE++. Applying the model to the new corpus, we construct CPHE-KG, which contains 60,967 entities and 51,140 relations. Finally, we seek to apply our KG to support evidence querying and evidence mapping visualization. Our SS-DYGIE++(SpanBERT) model achieves F1 scores of 0.77 and 0.55 in the document-level entity recognition and coreference resolution tasks, respectively. It also shows high performance in the relation identification task. With evidence querying, our KG can present the transmission dynamics of the COVID-19 pandemic in different countries and regions. The evidence mapping of our KG can show the impacts of various non-pharmacological interventions on the COVID-19 pandemic. Our analysis demonstrates the quality of the KG and shows that it has the potential to support COVID-19 policy making in public health.
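As a rough illustration of what evidence querying over an entity-relation graph might look like, the following minimal Python sketch stores extracted triples and looks them up by head entity. It is not the authors' implementation: the `Triple` and `EvidenceGraph` names, the relation labels, and all entity and paper identifiers are hypothetical placeholders, and no real CPHE-KG data is used.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    """One extracted fact (hypothetical layout, not the CPHE-KG schema)."""
    head: str
    relation: str
    tail: str
    source: str  # identifier of the paper the triple was extracted from


class EvidenceGraph:
    """Toy in-memory triple store indexed by head entity."""

    def __init__(self) -> None:
        self._by_head = defaultdict(list)

    def add(self, triple: Triple) -> None:
        self._by_head[triple.head].append(triple)

    def query(self, head: str, relation: Optional[str] = None) -> list:
        # Return every triple for this head, optionally filtered by relation.
        # .get() avoids creating empty entries for unknown entities.
        return [
            t for t in self._by_head.get(head, [])
            if relation is None or t.relation == relation
        ]


# Placeholder triples for illustration only; none come from the paper.
kg = EvidenceGraph()
kg.add(Triple("intervention_A", "affects", "transmission_rate", "paper_1"))
kg.add(Triple("intervention_A", "studied_in", "region_X", "paper_1"))
kg.add(Triple("intervention_B", "affects", "transmission_rate", "paper_2"))

hits = kg.query("intervention_A", relation="affects")
for t in hits:
    print(f"{t.head} -[{t.relation}]-> {t.tail} (source: {t.source})")
```

Keeping the source paper on each triple is one way to let a query result be traced back to the publication that supports it, which is the point of an evidence-oriented graph.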