Couto Francisco M, Silva Mário J, Coutinho Pedro M
Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, Portugal.
BMC Bioinformatics. 2005;6 Suppl 1(Suppl 1):S21. doi: 10.1186/1471-2105-6-S1-S21. Epub 2005 May 24.
Developing text mining systems that annotate biological entities with their properties from the scientific literature is an important recent research topic. These systems must first recognize the biological entities and properties in the text, and then decide which entity-property pairs represent valid annotations.
This document introduces a novel unsupervised method for recognizing biological properties in unstructured text, based on the evidence content of their names.
This document presents the results of applying our method to BioCreative tasks 2.1 and 2.2, where it identified Gene Ontology annotations and their evidence in a set of articles.
From the performance obtained in BioCreative, we concluded that an automatic annotation system can effectively use our method to identify biological properties in unstructured text.