Krallinger Martin, Malik Rainer, Valencia Alfonso

Dep. Struct. Comp. Biology, Spanish National Cancer Centre (CNIO), Melchor Fernández Almagro, 3, E-28029 Madrid, Spain.

Genome Inform. 2006;17(2):121-30.

Existing biological knowledge stored as structured database records has been extracted manually by database curators analyzing the scientific literature. Most of this information was derived from sentences which describe biologically relevant aspects of genes and gene products. We introduce the Protein description sentence (Prodisen) corpus, a useful resource for the automatic identification and construction of text-based protein and gene description records using information extraction and text classification techniques. Basic guidelines and criteria relevant for the construction of a text corpus of functional descriptions of genes and proteins are proposed. The steps used for the corpus construction and its features are presented. Moreover, some of the potential applications of the Prodisen corpus for biomedical text mining purposes are explored and the obtained results are presented.

Text mining and protein annotations: the construction and use of protein description sentences.
