Blaschke Christian, Hirschman Lynette, Valencia Alfonso
Protein Design Group, National Center for Biotechnology, CNB-CSIC, Cantoblanco, Madrid, Spain.
Brief Bioinform. 2002 Jun;3(2):154-65. doi: 10.1093/bib/3.2.154.
Information extraction has recently become a very active field in bioinformatics, and a number of interesting papers have been published. Most efforts have concentrated on a few specific problems, such as the detection of protein-protein interactions and the analysis of DNA expression arrays, although there are clearly many other promising areas of application (document retrieval, protein functional description, and detection of disease-related genes, to name a few). Paradoxically, these exciting developments have not yet crystallised into general agreement on a set of standard evaluation criteria, like those developed in fields such as protein structure prediction, which makes it very difficult to compare performance across the different systems. In this review we introduce the general field of information extraction, outline the status of its applications in molecular biology, and then discuss some ideas about possible evaluation standards needed for the future development of the field.