School of Biomedical Informatics, University of Texas Health Science Center, Fannin Street, Houston, USA.
BMC Med Inform Decis Mak. 2017 Jul 5;17(Suppl 2):73. doi: 10.1186/s12911-017-0465-x.
Knowledge engineering for ontological knowledgebases is resource- and time-intensive. To alleviate these costs, especially for novices, automated tools from the natural language processing domain can assist in the ontology development process. We focus on developing ontologies for the public health domain and use patient-centric sources from MedlinePlus related to cancers caused by HPV.
This paper demonstrates the use of a lightweight open information extraction (OIE) tool to derive accurate knowledge triples that can seed an ontological knowledgebase. We developed a custom application that interfaces with an information extraction software library to facilitate producing knowledge triples from textual sources.
Our approach yielded extractions with precision ranging from 80% to 89%. These triples can later be transformed into an OWL/RDF representation for our planned ontological knowledgebase.
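As an illustration of the two ideas in the results, the following minimal sketch shows how extraction precision could be computed from manual judgments and how extracted triples might be serialized as RDF N-Triples. It is not the authors' application: the namespace `http://example.org/hpv#`, the helper names `to_iri`, `to_ntriples`, and `precision`, and the sample triples are all hypothetical and chosen only for illustration.

```python
from urllib.parse import quote

# Hypothetical namespace; the paper does not specify one.
BASE = "http://example.org/hpv#"


def to_iri(term):
    """Turn a free-text term into an N-Triples IRI under the assumed namespace."""
    slug = "_".join(term.strip().split())
    return f"<{BASE}{quote(slug)}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) string triples as N-Triples lines."""
    return "\n".join(
        f"{to_iri(s)} {to_iri(p)} {to_iri(o)} ." for s, p, o in triples
    )


def precision(judgments):
    """Precision = correct extractions / all extractions.

    `judgments` holds one boolean per extracted triple (True if a reviewer
    judged the triple correct).
    """
    if not judgments:
        raise ValueError("no extractions to evaluate")
    return sum(judgments) / len(judgments)


# Illustrative triples only, not data from the study.
sample_triples = [
    ("HPV", "can cause", "cervical cancer"),
    ("HPV vaccine", "protects against", "HPV infection"),
]

if __name__ == "__main__":
    print(to_ntriples(sample_triples))
    print(precision([True, True, True, True, False]))
```

In this sketch every term, including the object, is mapped to an IRI. A real conversion to OWL would also need to decide which terms become classes, individuals, object properties, or literals.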
OIE offers an effective and accessible method for developing ontologies.