Automatic annotation for biological sequences by extraction of keywords from MEDLINE abstracts. Development of a prototype system.

Author information

Andrade M A, Valencia A

Affiliation

European Bioinformatics Institute, Hinxton, Cambridge, UK.

Publication information

Proc Int Conf Intell Syst Mol Biol. 1997;5:25-32.

PMID: 9322011

Abstract

We have developed a prototype for the automatic annotation of functional characteristics in protein families. The system extracts biological information directly from scientific literature in the form of MEDLINE abstracts. The criterion for selecting relevant keywords is the difference between their frequency in the abstracts associated with the protein family under study and their frequency in the abstracts of other, unrelated protein families. The concept of functional information associated with protein families is the key feature of our system and brings evolutionary information into the problem of functional annotation of biological sequences. The system has been tested in two different scenarios: first, a large set of protein families with a small number of abstracts per family, and second, selected protein families with a large number of abstracts attached to each one. In both cases the performance is compared with annotations provided by human experts, showing a clear relation between the amount of information provided to the system and the quality of the annotations. The automatic annotations are in many cases of similar quality to those contained in current databases. The possibilities and difficulties to be encountered during the development of a full system for automatic annotation are discussed.
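
The keyword selection criterion described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact scoring formula, so this example assumes the simplest reading: for each word, take the fraction of the family's abstracts that contain it and subtract the fraction of unrelated (background) abstracts that contain it. The function names `word_fractions` and `rank_keywords`, the tokenization rule, and the toy abstracts are all hypothetical and are not taken from the paper.

```python
from collections import Counter
import re


def word_fractions(abstracts):
    """Return, for each word, the fraction of abstracts that contain it."""
    counts = Counter()
    for text in abstracts:
        # Count each word once per abstract (assumed tokenization: letters only).
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    total = len(abstracts)
    return {word: n / total for word, n in counts.items()}


def rank_keywords(family_abstracts, background_abstracts, top_n=5):
    """Rank words by (fraction in family) minus (fraction in background).

    Assumption: a plain difference of fractions; the paper's actual
    score may be defined differently.
    """
    family = word_fractions(family_abstracts)
    background = word_fractions(background_abstracts)
    scores = {w: f - background.get(w, 0.0) for w, f in family.items()}
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Toy data, invented for illustration only.
family = [
    "The kinase phosphorylates the substrate protein.",
    "Kinase activity requires ATP binding to the protein.",
]
background = [
    "The protein binds DNA.",
    "This protein transports the ions across the membrane.",
]

print(rank_keywords(family, background, top_n=3))
```

In this toy example, "kinase" appears in every family abstract and in no background abstract, so it ranks first, while common words such as "the" and "protein" score zero because they are equally frequent in both sets. This is the intuition behind using a frequency difference rather than raw frequency.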
