Southern Cross Plant Science, Southern Cross University, PO Box 157, Lismore, NSW 2480, Australia.
School of Biosciences, University of Nottingham, Sutton Bonington, Leicestershire, LE12 5RD, UK.
Database (Oxford). 2021 May 15;2021. doi: 10.1093/database/baab028.
Crop phenotypic data underpin many pre-breeding efforts to characterize variation within germplasm collections. Although there has been an increase in the global capacity for accumulating and comparing such data, a lack of consistency in the systematic description of metadata often limits integration and sharing. We therefore aimed to understand some of the challenges facing findable, accessible, interoperable and reusable (FAIR) curation and annotation of phenotypic data from minor and underutilized crops. We used bambara groundnut (Vigna subterranea) as an exemplar underutilized crop to assess the ability of the Crop Ontology (CO) system to facilitate curation of trait datasets, so that they are accessible for comparative analysis. This involved generating a controlled vocabulary Trait Dictionary of 134 terms. Systematic quantification of syntactic and semantic cohesiveness of the full set of 28 crop-specific COs identified inconsistencies between trait descriptor names, a relative lack of cross-referencing to other ontologies and a flat ontological structure for classifying traits. We also evaluated the Minimal Information About a Phenotyping Experiment and FAIR compliance of bambara trait datasets curated within the CropStoreDB schema. We discuss specifications for a more systematic and generic approach to trait controlled vocabularies, which would benefit from representation of terms that adhere to Open Biological and Biomedical Ontologies principles. In particular, we focus on the benefits of reuse of existing definitions within pre- and post-composed axioms from other domains in order to facilitate the curation and comparison of datasets from a wider range of crops. Database URL: https://www.cropstoredb.org/cs_bambara.html.