Exploiting maximal dependence decomposition to identify conserved motifs from a group of aligned signal sequences.

Affiliation

Department of Computer Science and Engineering, Yuan Ze University, Taoyuan 320, Taiwan.

Publication information

Bioinformatics. 2011 Jul 1;27(13):1780-7. doi: 10.1093/bioinformatics/btr291. Epub 2011 May 6.

Abstract

Bioinformatics research often requires conservation analyses of a group of sequences associated with a specific biological function (e.g. transcription factor binding sites, microRNA target sites or protein post-translational modification sites). Because conserved motifs are difficult to explore in large-scale sequence data involving various signals, a new method, MDDLogo, was developed. MDDLogo applies maximal dependence decomposition (MDD) to cluster a group of aligned signal sequences into subgroups containing statistically significant motifs. To extract motifs that reflect a conserved biochemical property of amino acids in protein sequences, the set of 20 amino acids is further categorized according to physicochemical properties, e.g. hydrophobicity, charge or molecular size. MDDLogo has been demonstrated to accurately identify the kinase-specific substrate motifs in 1221 human phosphorylation sites associated with seven well-known kinase families from Phospho.ELM. Moreover, in a set of plant phosphorylation data lacking kinase information, MDDLogo has been applied to help investigate the substrate motifs of potential kinases and to improve the identification of plant phosphorylation sites with various substrate specificities. In this study, MDDLogo is comparable to another well-known motif discovery tool, Motif-X.
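As a rough illustration of the clustering idea described above, the following Python sketch performs a single MDD-style split on aligned peptides. It is a simplified variant and not the MDDLogo implementation: the four-way amino acid grouping in `GROUPS`, the function names, and the use of a full group-by-group chi-square table are all assumptions made for this example, and the significance testing and recursive splitting that a complete method would need are omitted.

```python
from collections import Counter
from itertools import product

# Hypothetical reduced alphabet (an assumption for illustration only;
# the grouping used by MDDLogo is not given in the abstract).
GROUPS = {
    "acidic": "DE",
    "basic": "KRH",
    "polar": "STNQCY",
    "hydrophobic": "AVLIMFWPG",
}
AA_TO_GROUP = {aa: name for name, members in GROUPS.items() for aa in members}


def encode(seq):
    """Map each residue of an aligned sequence to its property group."""
    return [AA_TO_GROUP.get(aa, "other") for aa in seq]


def chi_square(encoded, i, j):
    """Pearson chi-square statistic between the groups at positions i and j."""
    n = len(encoded)
    row = Counter(e[i] for e in encoded)
    col = Counter(e[j] for e in encoded)
    joint = Counter((e[i], e[j]) for e in encoded)
    stat = 0.0
    for a, b in product(row, col):
        expected = row[a] * col[b] / n
        stat += (joint[(a, b)] - expected) ** 2 / expected
    return stat


def mdd_split(seqs):
    """Split aligned sequences once at the position with maximal dependence.

    Returns (position, group, sequences with that group, remaining sequences).
    """
    encoded = [encode(s) for s in seqs]
    length = len(encoded[0])
    # Score each position by its summed dependence on all other positions.
    scores = [
        sum(chi_square(encoded, i, j) for j in range(length) if j != i)
        for i in range(length)
    ]
    best = max(range(length), key=scores.__getitem__)
    # Split on the most frequent property group at the chosen position.
    top_group = Counter(e[best] for e in encoded).most_common(1)[0][0]
    with_group = [s for s, e in zip(seqs, encoded) if e[best] == top_group]
    rest = [s for s, e in zip(seqs, encoded) if e[best] != top_group]
    return best, top_group, with_group, rest
```

Calling `mdd_split` on a list of equal-length peptide windows returns the chosen position, the property group used for the split, and the two resulting subgroups. A fuller procedure would presumably apply the split recursively to each subgroup until no position shows significant dependence or a subgroup becomes too small.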

CONTACT

francis@saturn.yzu.edu.tw
