Department of Statistics and Department of Food Science and Technology, University of Nebraska-Lincoln, 340 Hardin Hall North Wing, Lincoln, NE 68583, USA.
Department of Botany and Plant Pathology, Oregon State University, 2503 Cordley Hall, Corvallis, OR 97331, USA.
Database (Oxford). 2023 Nov 15;2023. doi: 10.1093/database/baad076.
Over the last couple of decades, there has been rapid growth in the number and scope of agricultural genetics, genomics and breeding (GGB) databases and resources. The AgBioData Consortium (https://www.agbiodata.org/) currently represents 44 databases and resources (https://www.agbiodata.org/databases) covering model or crop plant and animal GGB data, ontologies, pathways, genetic variation and breeding platforms (referred to as 'databases' throughout). One of the goals of the Consortium is to facilitate FAIR (Findable, Accessible, Interoperable, and Reusable) data management and the integration of datasets, which requires data sharing along with structured vocabularies and/or ontologies. Two AgBioData working groups, focused on Data Sharing and Ontologies, respectively, conducted a Consortium-wide survey to assess the current status and future needs of the members in those areas. A total of 33 researchers responded to the survey, representing 37 databases. Results suggest that data-sharing practices by AgBioData databases are in a fairly healthy state, but it is not clear whether this is true for all metadata and data types across all databases, and that ontology use has not substantially changed since a similar survey was conducted in 2017. Based on our evaluation of the survey results, we recommend (i) providing training for database personnel in specific data-sharing techniques, as well as in ontology use; (ii) further study of what metadata is shared, and how well it is shared among databases; (iii) promoting an understanding of data sharing and ontologies in the stakeholder community; (iv) improving data sharing and ontologies for specific phenotypic data types and formats; and (v) lowering specific barriers to data sharing and ontology use by identifying sustainability solutions and by identifying, promoting, or developing data standards.
Combined, these improvements are likely to help AgBioData databases increase development efforts towards improved ontology use and data sharing via programmatic means. Database URL: https://www.agbiodata.org/databases