Formative evaluation of ontology learning methods for entity discovery by using existing ontologies as reference standards.

Authors

Liu K, Mitchell K J, Chapman W W, Savova G K, Sioutos N, Rubin D L, Crowley R S

Affiliation

Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA.

Publication

Methods Inf Med. 2013;52(4):308-16. doi: 10.3414/ME12-01-0029. Epub 2013 May 13.

Abstract

OBJECTIVE

To develop a two-step method for the formative evaluation of statistical Ontology Learning (OL) algorithms that leverages existing biomedical ontologies as reference standards.

METHODS

In the first step, optimum parameters are established. A 'gap list' of entities is generated by finding the entities that are present in a later version of the ontology but absent from an earlier version. A named entity recognition system then identifies the entities in a corpus of biomedical documents that appear in the 'gap list', generating a reference standard. The output of the algorithm (new entity candidates), produced by statistical methods, is subsequently compared against this reference standard. An OL method that performs perfectly would learn all of the terms in this reference standard. Using evaluation metrics and precision-recall curves for different thresholds and parameters, we compute the optimum parameters for each method. In the second step, human judges with expertise in ontology development evaluate each candidate suggested by the algorithm configured with the optimum parameters established in the first step. These judgments are used to compute two performance metrics developed in our previous work: Entity Suggestion Rate (ESR) and Entity Acceptance Rate (EAR).
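The first step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy term lists, the lowercase string matching, and the use of F1 to choose the "optimum" threshold are all assumptions made for illustration. The abstract only says that evaluation metrics and precision-recall curves were used, and it does not specify the named entity recognition system, which is stood in for here by a plain list of terms found in the corpus.

```python
from typing import Dict, Iterable, Sequence, Set, Tuple


def gap_list(earlier: Iterable[str], later: Iterable[str]) -> Set[str]:
    """Entities in the later ontology version that are absent from the earlier one."""
    return {t.lower() for t in later} - {t.lower() for t in earlier}


def reference_standard(gap: Set[str], corpus_entities: Iterable[str]) -> Set[str]:
    """Gap-list entities that were also found in the corpus.

    `corpus_entities` stands in for the output of an NER system (hypothetical here).
    """
    return gap & {t.lower() for t in corpus_entities}


def precision_recall(candidates: Iterable[str], reference: Set[str]) -> Tuple[float, float]:
    """Precision and recall of candidate entities against the reference standard."""
    cand = {c.lower() for c in candidates}
    true_pos = len(cand & reference)
    precision = true_pos / len(cand) if cand else 0.0
    recall = true_pos / len(reference) if reference else 0.0
    return precision, recall


def best_threshold(
    scored: Dict[str, float], reference: Set[str], thresholds: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (threshold, precision, recall) with the highest F1 (assumed criterion)."""

    def evaluate(t: float) -> Tuple[float, float, float]:
        kept = [term for term, score in scored.items() if score >= t]
        p, r = precision_recall(kept, reference)
        f1 = 2 * p * r / (p + r) if (p + r) else 0.0
        return f1, p, r

    results = {t: evaluate(t) for t in thresholds}
    best = max(thresholds, key=lambda t: results[t][0])
    return best, results[best][1], results[best][2]


# Toy data, invented for illustration only.
earlier_version = ["carcinoma", "adenoma"]
later_version = ["carcinoma", "adenoma", "lymphoma", "sarcoma", "glioma"]
found_in_corpus = ["Lymphoma", "sarcoma", "carcinoma"]
candidate_scores = {"lymphoma": 0.9, "fibrosis": 0.7, "sarcoma": 0.6, "necrosis": 0.2}

gap = gap_list(earlier_version, later_version)
reference = reference_standard(gap, found_in_corpus)
threshold, precision, recall = best_threshold(candidate_scores, reference, [0.1, 0.5, 0.8])
print(sorted(gap), sorted(reference), threshold, precision, recall)
```

On this toy data the gap list contains three terms, two of which occur in the corpus and so form the reference standard. The middle threshold gives the best balance of precision and recall. The second step (expert review yielding ESR and EAR) is not sketched because the abstract does not define those metrics.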

RESULTS

Using this method, we evaluated two statistical OL methods in two medical domains. For the pathology domain, we obtained 49% ESR and 28% EAR with the Lin method, and 52% ESR and 39% EAR with the Church method. For the radiology domain, we obtained 87% ESR and 9% EAR with the Lin method, and 96% ESR and 16% EAR with the Church method.

CONCLUSION

This method is sufficiently general and flexible to permit comparison of any OL methods for a specific corpus and ontology of interest.
