Phylogenetic inference in protein superfamilies: analysis of SH2 domains.

Author information

Sjölander K

Affiliation

Molecular Applications Group, Palo Alto, CA 94303-1110, USA.

Publication information

Proc Int Conf Intell Syst Mol Biol. 1998;6:165-74.

PMID:9783222
Abstract

This work focuses on the inference of evolutionary relationships in protein superfamilies, and the uses of these relationships to identify key positions in the structure, to infer attributes on the basis of evolutionary distance, and to identify potential errors in sequence annotations. Relative entropy, a distance metric from information theory, is used in combination with Dirichlet mixture priors to estimate a phylogenetic tree for a set of proteins. This method infers key structural or functional positions in the molecule, and guides the tree topology to preserve these important positions within subtrees. Minimum-description-length principles are used to determine a cut of the tree into subtrees, to identify the subfamilies in the data. This method is demonstrated on SH2-domain containing proteins, resulting in a new subfamily assignment for Src2-drome and a suggested evolutionary relationship between Nck_human and Drk_drome, Sem5_caeel, Grb2_human and Grb2_chick.
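The relative-entropy distance mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. Several simplifications are assumptions made here: a single uniform Dirichlet component stands in for the Dirichlet mixture priors, the divergence is symmetrized by summing both directions, and the helper names (`posterior_profile`, `subfamily_distance`) and toy sequences are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_counts(sequences, col):
    """Count amino acids observed in one alignment column (gaps are skipped)."""
    counts = {}
    for seq in sequences:
        residue = seq[col]
        if residue in AMINO_ACIDS:
            counts[residue] = counts.get(residue, 0) + 1
    return counts


def posterior_profile(counts, prior_alpha):
    """Posterior mean distribution under ONE Dirichlet prior.

    Assumption: the paper uses a mixture of Dirichlet components;
    a single component keeps this sketch short.
    """
    total = sum(counts.get(a, 0) for a in AMINO_ACIDS) + sum(prior_alpha.values())
    return {a: (counts.get(a, 0) + prior_alpha[a]) / total for a in AMINO_ACIDS}


def relative_entropy(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(p[a] * math.log(p[a] / q[a]) for a in AMINO_ACIDS if p[a] > 0)


def subfamily_distance(seqs_a, seqs_b, prior_alpha):
    """Sum of symmetrized relative entropies over all alignment columns.

    Assumption: symmetrizing by adding both directions is an illustrative
    choice, not something stated in the abstract.
    """
    distance = 0.0
    for col in range(len(seqs_a[0])):
        p = posterior_profile(column_counts(seqs_a, col), prior_alpha)
        q = posterior_profile(column_counts(seqs_b, col), prior_alpha)
        distance += relative_entropy(p, q) + relative_entropy(q, p)
    return distance


if __name__ == "__main__":
    uniform_prior = {a: 1.0 for a in AMINO_ACIDS}  # hypothetical prior
    group_a = ["ACDE", "ACDF"]  # toy aligned sequences, not real SH2 data
    group_b = ["WCDE", "WCDY"]
    print(subfamily_distance(group_a, group_b, uniform_prior))
```

In an agglomerative setting, a distance of this kind could be used to pick which two subtrees to merge next. Columns where the two groups disagree strongly contribute most to the sum, which is one way such a method can highlight candidate key positions.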

