Departamento de Estadística e Investigación Operativa, Universidad de Murcia, Murcia, Spain.
Center of Operations Research (CIO), Miguel Hernández University of Elche, Elche, Spain.
Brief Bioinform. 2020 Mar 23;21(2):473-485. doi: 10.1093/bib/bbz009.
The development and application of biological ontologies have increased significantly in recent years. These ontologies can be retrieved from different repositories, which provide little information about the quality of the ontologies. In recent years, some ontology structural metrics have been proposed, but their validity as measurement instruments has not been sufficiently studied to date. In this work, we evaluate a set of reproducible and objective ontology structural metrics. Given the lack of standard methods for this purpose, we have applied an evaluation method based on the stability and goodness of the classifications of ontologies produced by each metric on an ontology corpus. The evaluation was carried out using ontology repositories as corpora. Specifically, we used 119 ontologies from the OBO Foundry repository and 78 ontologies from AgroPortal. First, we study the correlations between the metrics. Second, we study whether the clusters for a given metric are stable and have a good structure. The results show that the existing correlations do not bias the evaluation, that no metric generates unstable clusterings, and that all the evaluated metrics provide at least a reasonable clustering structure. Furthermore, our work makes it possible to review and recommend the most reliable ontology structural metrics in terms of the stability and goodness of their classifications. Availability: http://sele.inf.um.es/ontology-metrics.
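The per-metric evaluation described in the abstract (cluster the ontologies by one metric, then check how stable and how well structured the clusters are) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not name the clustering algorithm or the indices, so one-dimensional k-means, the average silhouette width (goodness) and a bootstrap Jaccard index (stability) are assumptions, as are all function names and the sample metric values.

```python
import random
from statistics import mean


def kmeans_1d(values, k=2, iters=50):
    """Lloyd's algorithm on a single metric; returns one cluster label per value."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers over the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(mean(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def silhouette(values, labels):
    """Average silhouette width (assumed goodness index); closer to 1 is better."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        return 0.0
    scores = []
    for i, (v, lab) in enumerate(zip(values, labels)):
        own = [abs(v - w) for j, (w, m) in enumerate(zip(values, labels))
               if m == lab and j != i]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = mean(own)
        b = min(mean([abs(v - w) for w, m in zip(values, labels) if m == c])
                for c in clusters if c != lab)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)


def jaccard_stability(values, k=2, n_boot=100, seed=0):
    """Mean bootstrap Jaccard similarity per original cluster (assumed stability index)."""
    rng = random.Random(seed)
    n = len(values)
    base = kmeans_1d(values, k)
    base_sets = [{i for i in range(n) if base[i] == c} for c in range(k)]
    per_cluster = [[] for _ in range(k)]
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boot = kmeans_1d([values[i] for i in idx], k)
        boot_sets = [{i for i, lab in zip(idx, boot) if lab == c} for c in range(k)]
        drawn = set(idx)
        for c, original in enumerate(base_sets):
            restricted = original & drawn  # compare only on resampled ontologies
            if not restricted:
                continue
            best = max(len(restricted & b) / len(restricted | b)
                       for b in boot_sets if b)
            per_cluster[c].append(best)
    return [mean(s) if s else 0.0 for s in per_cluster]


# Hypothetical values of one structural metric for 12 ontologies.
metric_values = [0.10, 0.12, 0.15, 0.11, 0.14, 0.13,
                 0.80, 0.85, 0.82, 0.90, 0.88, 0.84]
labels = kmeans_1d(metric_values)
print("goodness (silhouette):", round(silhouette(metric_values, labels), 3))
print("stability (Jaccard per cluster):", jaccard_stability(metric_values))
```

In this reading, a metric whose clusters keep a high Jaccard similarity across resamples would count as stable, and a high silhouette width would indicate a good clustering structure; the thresholds used to make those calls are not given in the abstract and are left out here.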