Picasso: generating a covering set of protein family profiles.

Authors

Heger A, Holm L

Affiliation

Structural Genomics Group, EMBL-EBI, Cambridge CB10 1SD, UK.

Publication

Bioinformatics. 2001 Mar;17(3):272-9. doi: 10.1093/bioinformatics/17.3.272.

Abstract

MOTIVATION

Evolutionary classification leads to an economical description of protein sequence data because attributes of function and structure are inherited in protein families. This paper presents Picasso, a procedure for deriving a minimal set of protein family profiles that cover all known protein sequences.

RESULTS

Picasso starts from highly overlapping sequence neighbourhoods revealed by all-on-all pairwise Blast alignment. Overlaps are reduced by merging sequences or parts of sequences into multiple alignments. For maximum unification, the multiple alignments must reach into the twilight zone of sequence similarity. Sensitive and selective profile-profile comparison allows unification down to about 15% pairwise sequence identity. Families unified through a short conserved sequence motif are associated with multiple full-length alignments describing different subfamilies. Domains that are mobile modules are identified based on their association with different sets of neighbours. The result is 10000 unified domain families (excluding singletons) representing functionally related proteins and recovering classical prolific domain types in high numbers. The classification is useful, for example, in developing strategies for efficient database searching and for selecting targets to complete the map of all 3-D structures.
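
ILLUSTRATION

The two ideas above that are easiest to make concrete are merging overlapping sequence neighbourhoods and measuring pairwise sequence identity against a threshold such as 15%. The following Python sketch is not Picasso's method: it substitutes plain single-linkage grouping over whole sequences, whereas the abstract describes merging sequences or parts of sequences into multiple alignments and unifying them by profile-profile comparison. The function names `merge_neighbourhoods` and `percent_identity`, the input format (pairs of sequence identifiers), and the convention of ignoring gapped columns are all assumptions made for illustration.

```python
from collections import defaultdict


def merge_neighbourhoods(hits):
    """Group sequence IDs connected by pairwise hits (single linkage).

    `hits` is an iterable of (id_a, id_b) pairs, e.g. significant
    pairwise alignments. Hypothetical input format, not Picasso's.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in hits:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two neighbourhoods

    groups = defaultdict(set)
    for seq_id in list(parent):
        groups[find(seq_id)].add(seq_id)
    return sorted(sorted(members) for members in groups.values())


def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap.

    Assumes two rows of the same alignment, with '-' as the gap symbol.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(merge_neighbourhoods([("A", "B"), ("B", "C"), ("D", "E")]))
    print(percent_identity("ACDE-G", "ACQEFG"))
```

Single linkage over whole sequences tends to chain unrelated multi-domain proteins together through shared mobile modules, which is one reason a domain-level procedure such as the one described in the abstract is needed.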
