Prediction of contact residue pairs based on co-substitution between sites in protein structures.

Affiliation

Graduate School of Engineering, Gunma University, Kiryu, Gunma, Japan.

Publication information

PLoS One. 2013;8(1):e54252. doi: 10.1371/journal.pone.0054252. Epub 2013 Jan 16.

Abstract

Residue-residue interactions that fold a protein into a unique three-dimensional structure and enable it to perform a specific function impose structural and functional constraints, to varying degrees, on each residue site. Selective constraints on residue sites are recorded in the amino acid order of homologous sequences and also in the evolutionary trace of amino acid substitutions. The challenge is to extract direct dependences between residue sites by removing phylogenetic correlations and indirect dependences that arise through other residues within a protein or even through other molecules. The rapid growth in the number of protein families with unknown folds calls for an accurate de novo method of protein structure prediction. Recent attempts to disentangle direct from indirect dependences of amino acid types between residue positions in multiple sequence alignments have revealed that inferred residue-residue proximities can provide sufficient information to predict a protein fold without the use of known three-dimensional structures. Here, we propose an alternative method that infers coevolving site pairs from concurrent and compensatory substitutions between sites in each branch of a phylogenetic tree. The substitution probability and the physico-chemical changes (volume, charge, hydrogen-bonding capability, and others) that accompany substitutions at each site in each branch of a phylogenetic tree are estimated with the likelihood of each substitution, and their direct correlations between sites are used to detect concurrent and compensatory substitutions. To extract direct dependences between sites, partial correlation coefficients of the characteristic changes along branches between sites, in which linear multiple dependences on feature vectors at other sites are removed, are calculated and used to rank coevolving site pairs. For 15 protein families taken from Pfam release 26.0, the accuracy of contact prediction based on the present coevolution score is comparable to that achieved by a maximum entropy model of protein sequences. Moreover, this accuracy indicates that compensatory substitutions are significant in protein evolution.
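
The step of removing indirect dependences with partial correlation coefficients can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a single scalar change per site per branch (the abstract describes feature vectors), obtains partial correlations from the inverse of the covariance matrix, and adds a small ridge term for numerical stability. The function name `partial_correlations`, the `ridge` parameter, and the synthetic three-site chain are all hypothetical and chosen only for illustration.

```python
import numpy as np


def partial_correlations(changes, ridge=1e-6):
    """Partial correlation between every pair of sites.

    changes: array of shape (n_branches, n_sites); entry [b, i] is an
             assumed scalar change at site i along branch b.
    ridge:   small diagonal term (an assumption of this sketch) that keeps
             the covariance matrix invertible.
    """
    changes = np.asarray(changes, dtype=float)
    cov = np.cov(changes, rowvar=False)           # sites are columns
    cov = cov + ridge * np.eye(cov.shape[0])
    precision = np.linalg.inv(cov)                # inverse covariance
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)           # r_ij given all other sites
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


# Synthetic example: site a influences b, and b influences c, so a and c
# are correlated only indirectly, through b.
rng = np.random.default_rng(0)
n_branches = 2000
a = rng.normal(size=n_branches)
b = a + 0.5 * rng.normal(size=n_branches)
c = b + 0.5 * rng.normal(size=n_branches)
changes = np.column_stack([a, b, c])

plain = np.corrcoef(changes, rowvar=False)        # ordinary correlations
partial = partial_correlations(changes)           # direct dependences only

print("corr(a, c)         =", round(float(plain[0, 2]), 3))
print("partial corr(a, c) =", round(float(partial[0, 2]), 3))
```

In this toy chain the ordinary correlation between sites a and c is large, while their partial correlation is close to zero once the dependence on b is removed. Ranking site pairs by the partial correlation rather than the ordinary one therefore favors the directly coupled pairs (a, b) and (b, c).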

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0e51/3546969/46252d3eefe0/pone.0054252.g001.jpg
