Institute of Computational Linguistics, University of Zurich, Binzmühlestrasse 14, Zurich 8050, Switzerland.
Database (Oxford). 2013 Feb 9;2013:bas053. doi: 10.1093/database/bas053. Print 2013.
In this article, we describe the architecture of the OntoGene relation mining pipeline and its application to the triage task of BioCreative 2012. The aim of the task is to support the triage of abstracts relevant to the curation of the Comparative Toxicogenomics Database. We use a conventional information retrieval system (Lucene) to provide a baseline ranking, which we then combine with information provided by our relation mining system to produce an optimized ranking. Our approach additionally delivers the domain entities mentioned in each input document, as well as candidate relationships, both ranked according to a confidence score computed by the system. This information is presented to the user through an advanced interface designed to support interactive curation. Owing in particular to its high-quality entity recognition, the OntoGene system achieved the best overall results in the task.
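The abstract does not say how the baseline ranking and the relation mining output are combined, so the following is only a minimal sketch of one plausible scheme: a linear mix of a retrieval score with the strongest entity confidence found in a document. The names `Document`, `combined_score`, `rerank`, and the `weight` parameter are hypothetical, and the scores are assumed to be normalized to the range 0 to 1; none of this is taken from the OntoGene system itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Document:
    """Hypothetical container for one abstract to be triaged."""
    doc_id: str
    baseline_score: float  # assumed: normalized score from the IR baseline
    entity_confidences: List[float] = field(default_factory=list)  # assumed: one per detected entity


def combined_score(doc: Document, weight: float = 0.5) -> float:
    """Linearly mix the baseline score with the best entity confidence.

    This mixing rule is an illustrative assumption, not the published method.
    Documents with no detected entities contribute 0.0 as entity evidence.
    """
    entity_evidence = max(doc.entity_confidences, default=0.0)
    return (1.0 - weight) * doc.baseline_score + weight * entity_evidence


def rerank(docs: List[Document], weight: float = 0.5) -> List[Document]:
    """Return documents sorted by combined score, highest first."""
    return sorted(docs, key=lambda d: combined_score(d, weight), reverse=True)


if __name__ == "__main__":
    docs = [
        Document("a", 0.9),
        Document("b", 0.6, [0.95]),
        Document("c", 0.2, [0.3]),
    ]
    for doc in rerank(docs):
        print(doc.doc_id, round(combined_score(doc), 3))
```

With `weight=0.0` the function reproduces the baseline ordering, which makes it easy to check how much the entity evidence changes the ranking as the weight grows.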