Statistically rigorous automated protein annotation.

Authors

Krebs Werner G, Bourne Philip E

Affiliation

San Diego Supercomputer Center, San Diego, California, USA.

Publication information

Bioinformatics. 2004 May 1;20(7):1066-73. doi: 10.1093/bioinformatics/bth039. Epub 2004 Feb 5.

Abstract

MOTIVATION

Assignment of putative protein functional annotation by comparative analysis using pre-defined experimental annotations is performed routinely by molecular biologists. The number and statistical significance of these assignments remain a challenge in this era of high-throughput proteomics. A combined statistical method that enables robust, automated protein annotation by reliably expanding existing annotation sets is described. An existing clustering scheme, based on relevant experimental information (e.g. sequence identity, keywords or gene expression data), is required. The method assigns new proteins to these clusters with a measure of reliability. It can also provide human reviewers with a reliability score for both new and previously classified proteins.

RESULTS

A dataset of 27 000 annotated Protein Data Bank (PDB) polypeptide chains (of 36 000 chains currently in the PDB) was generated from 23 000 chains classified a priori.

AVAILABILITY

PDB annotations and sample software implementation are freely accessible on the Web at http://pmr.sdsc.edu/go
