School of Life Sciences, Arizona State University, Tempe, AZ, USA.
Ronin Institute for Independent Scholarship, Montclair, NJ, USA; Cheadle Center for Biodiversity and Ecological Restoration, University of California Santa Barbara, Santa Barbara, CA, USA.
Lancet Planet Health. 2021 Oct;5(10):e746-e750. doi: 10.1016/S2542-5196(21)00196-0. Epub 2021 Sep 23.
Connecting basic data about bats and other potential hosts of SARS-CoV-2 with their ecological context is crucial to the understanding of the emergence and spread of the virus. However, when lockdowns in many countries started in March, 2020, the world's bat experts were locked out of their research laboratories, which in turn impeded access to large volumes of offline ecological and taxonomic data. Pandemic lockdowns have brought to attention the long-standing problem of so-called biological dark data: data that are published, but disconnected from digital knowledge resources and thus unavailable for high-throughput analysis. Knowledge of host-to-virus ecological interactions will be biased until this challenge is addressed. In this Viewpoint, we outline two viable solutions: first, in the short term, to interconnect published data about host organisms, viruses, and other pathogens; and second, to shift the publishing framework beyond unstructured text (the so-called PDF prison) to labelled networks of digital knowledge. As the indexing system for biodiversity data, biological taxonomy is foundational to both solutions. Building digitally connected knowledge graphs of host-pathogen interactions will establish the agility needed to quickly identify reservoir hosts of novel zoonoses, allow for more robust predictions of emergence, and thereby strengthen human and planetary health systems.