Behr Alexander S, Borgelt Hendrik, Kockmann Norbert
Laboratory of Equipment Design, Faculty of Biochemical and Chemical Engineering, TU-Dortmund University, Emil-Figge-Strasse 68, 44139, Dortmund, NRW, Germany.
J Cheminform. 2024 Feb 7;16(1):16. doi: 10.1186/s13321-024-00807-2.
As scientific digitization advances, it is imperative to ensure that machine-processable data is Findable, Accessible, Interoperable, and Reusable (FAIR). Ontologies play a vital role in enhancing data FAIRness by explicitly representing knowledge in a machine-understandable format. Research data in catalysis research is often complex and diverse, necessitating a correspondingly broad collection of ontologies. While ontology portals such as EBI OLS and BioPortal aid in ontology discovery, they lack deep classification, and quality metrics for ontology reusability and domain coverage are absent for catalysis research. Thus, this work provides an approach for the systematic collection of ontology metadata with a focus on the catalysis research data value chain. By classifying ontologies by subdomains of catalysis research, the approach enables efficient comparison across ontologies. Furthermore, a workflow and codebase are presented that facilitate representation of the metadata on GitHub. Finally, a method is presented to automatically map the classes contained in the ontologies of the metadata collection against each other, providing further insight into the relatedness of the listed ontologies. The presented methodology is designed for reusability, enabling its adaptation to other ontology collections or domains of knowledge. The ontology metadata collected for this work and the code developed and described in it are available in a GitHub repository at https://github.com/nfdi4cat/Ontology-Overview-of-NFDI4Cat.
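The idea of mapping the classes of several ontologies against each other can be illustrated with a minimal sketch. The abstract does not describe the actual mapping procedure, so everything here is an assumption made for illustration: the matching rule (exact match on normalized class labels), the function names `normalize` and `map_classes`, and the toy ontologies `OntoA`, `OntoB`, and `OntoC` are hypothetical and are not taken from the repository.

```python
from itertools import combinations


def normalize(label: str) -> str:
    """Lowercase a label and treat '_' and '-' as spaces (assumed matching rule)."""
    cleaned = label.replace("_", " ").replace("-", " ").lower()
    return " ".join(cleaned.split())


def map_classes(ontologies: dict[str, list[str]]) -> dict[tuple[str, str], list[str]]:
    """For each pair of ontologies, return the sorted normalized labels they share."""
    normalized = {
        name: {normalize(label) for label in labels}
        for name, labels in ontologies.items()
    }
    return {
        (a, b): sorted(normalized[a] & normalized[b])
        for a, b in combinations(sorted(normalized), 2)
    }


# Hypothetical toy data: ontology name -> class labels.
example = {
    "OntoA": ["Catalyst", "Reactor", "Temperature"],
    "OntoB": ["catalyst", "Fixed-Bed Reactor", "temperature"],
    "OntoC": ["Reactor", "Pressure"],
}

mappings = map_classes(example)
for pair, shared in mappings.items():
    print(pair, len(shared), shared)
```

The number of shared labels per pair gives a rough indication of how closely two ontologies are related. A real workflow would probably read class labels from the ontology files themselves and might use fuzzier matching than the exact comparison assumed here.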