Inference of Functionally-Relevant N-acetyltransferase Residues Based on Statistical Correlations

Authors

Neuwald Andrew F, Altschul Stephen F

Affiliations

Institute for Genome Sciences and Department of Biochemistry & Molecular Biology, University of Maryland School of Medicine, BioPark II, Room 617, Baltimore, MD, United States of America.

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, United States of America.

Publication information

PLoS Comput Biol. 2016 Dec 21;12(12):e1005294. doi: 10.1371/journal.pcbi.1005294. eCollection 2016 Dec.

DOI: 10.1371/journal.pcbi.1005294

PMID: 28002465

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5225019/

Abstract

Over evolutionary time, members of a superfamily of homologous proteins sharing a common structural core diverge into subgroups filling various functional niches. At the sequence level, such divergence appears as correlations that arise from residue patterns distinct to each subgroup. Such a superfamily may be viewed as a population of sequences corresponding to a complex, high-dimensional probability distribution. Here we model this distribution as hierarchical interrelated hidden Markov models (hiHMMs), which describe these sequence correlations implicitly. By characterizing such correlations one may hope to obtain information regarding functionally-relevant properties that have thus far evaded detection. To do so, we infer a hiHMM distribution from sequence data using Bayes' theorem and Markov chain Monte Carlo (MCMC) sampling, which is widely recognized as the most effective approach for characterizing a complex, high-dimensional distribution. Other routines then map correlated residue patterns to available structures with a view to hypothesis generation. When applied to N-acetyltransferases, this reveals sequence and structural features indicative of functionally important, yet generally unknown, biochemical properties. Even for sets of proteins for which nothing is known beyond unannotated sequences and structures, this can lead to helpful insights. We describe, for example, a putative coenzyme-A-induced-fit substrate binding mechanism mediated by arginine residue switching between salt bridge and π-π stacking interactions. A suite of programs implementing this approach is available (psed.igs.umaryland.edu).
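The claim that subgroup-specific residue patterns show up as correlations at the sequence level can be illustrated with a toy calculation. The sketch below is not the hiHMM/MCMC method of the paper; it is a minimal illustration that uses mutual information between alignment columns as a stand-in measure of correlation. The sequences, the subgroup names, and the `mutual_information` helper are all invented for this example.

```python
from collections import Counter
from math import log2


def mutual_information(seqs, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(seqs)
    col_i = Counter(s[i] for s in seqs)            # residue counts in column i
    col_j = Counter(s[j] for s in seqs)            # residue counts in column j
    pairs = Counter((s[i], s[j]) for s in seqs)    # joint residue-pair counts
    return sum(
        (c / n) * log2((c / n) / ((col_i[a] / n) * (col_j[b] / n)))
        for (a, b), c in pairs.items()
    )


# Hypothetical aligned sequences; each subgroup has its own fixed pattern
# at columns 0 and 2, a shared residue at column 1, and noise at column 3.
subgroup_a = ["RKDA", "RKDA", "RKDG", "RKDA"]  # pattern: R at 0, D at 2
subgroup_b = ["QKEA", "QKEG", "QKEA", "QKEA"]  # pattern: Q at 0, E at 2
superfamily = subgroup_a + subgroup_b

# Pooled over the superfamily, columns 0 and 2 are perfectly correlated.
mi_pooled = mutual_information(superfamily, 0, 2)
# Within one subgroup both columns are constant, so there is no correlation.
mi_within_a = mutual_information(subgroup_a, 0, 2)
# A column conserved across the whole superfamily carries no correlation either.
mi_conserved = mutual_information(superfamily, 0, 1)

print(f"pooled: {mi_pooled:.3f} bits")
print(f"within subgroup A: {mi_within_a:.3f} bits")
print(f"against conserved column: {mi_conserved:.3f} bits")
```

In this toy case the correlation between columns 0 and 2 exists only because two subgroups with different patterns are pooled, which is the sense in which the abstract says such correlations arise from residue patterns distinct to each subgroup.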
