Linking entries in protein interaction database to structured text: the FEBS Letters experiment.

Authors

Ceol Arnaud, Chatr-Aryamontri Andrew, Licata Luana, Cesareni Gianni

Affiliation

Department of Biology, University of Rome, Tor Vergata, Rome, Italy.

Publication information

FEBS Lett. 2008 Apr 9;582(8):1171-7. doi: 10.1016/j.febslet.2008.02.071. Epub 2008 Mar 6.

Abstract

The corpus of the scientific literature has reached such a size that a lot of useful data, dispersed throughout millions of different articles, are now hard to recover. For instance, many articles in the biological domain describe relationships between entities (genes, proteins, small molecules, etc.), yet this crucial information cannot be efficiently used because of the difficulties in retrieving it automatically from unstructured text. Databases are striving to capture this valuable information and to organize it in a structured format ready for automatic analysis. However, the current database model, based on manual curation, is not sustainable because the limited support is not compatible with complete and accurate coverage of published information. Several proposals have been put forward to increase the efficiency and accuracy of the curation process. Here we present an experiment, designed by the editorial board of FEBS Letters, aimed at integrating each manuscript with a structured summary precisely reporting, with database identifiers and predefined controlled vocabularies, the protein interactions reported in the manuscript. The authors play an important role in this process as they are requested to provide structured information to be appended, in the form of human-readable paragraphs, at the end of traditional summaries. It is envisaged that the structured text will become an integral part of Medline abstracts. In six months' time, the experience gained with this experiment will form the basis for a community discussion to propose a widely accepted strategy for information storage and retrieval.
