A new parameter-rich structure-aware mechanistic model for amino acid substitution during evolution.

Author information

Chi Peter B, Kim Dohyup, Lai Jason K, Bykova Nadia, Weber Claudia C, Kubelka Jan, Liberles David A

Affiliations

Department of Biology and Center for Computational Genetics and Genomics, Temple University, Philadelphia, Pennsylvania, 19122.

Department of Mathematics and Computer Science, Ursinus College, Collegeville, Pennsylvania, 19426.

Publication information

Proteins. 2018 Feb;86(2):218-228. doi: 10.1002/prot.25429. Epub 2017 Dec 12.

Abstract

Improvements in the description of amino acid substitution are required to develop better pseudo-energy-based, structure-aware protein models for use in phylogenetic studies. These models characterize the probabilities of amino acid substitution and enable better simulation of protein sequences over a phylogeny. Better characterization of these probabilities in turn enables numerous downstream applications, such as detecting positive selection, ancestral sequence reconstruction, and evolutionarily motivated protein engineering. Many existing Markov models for amino acid substitution in molecular evolution disregard molecular structure and describe the substitution process over longer evolutionary periods poorly. Here, we present a new model with a site-specific parameterization of pseudo-energy terms in a coarse-grained force field. It describes local heterogeneity in physical constraints on amino acid substitution better than a previous pseudo-energy-based model, at minimal cost in runtime. The importance of each weight term parameterization in characterizing underlying features of the site, including contact number, solvent accessibility, and secondary structural elements, was evaluated, returning both expected and biologically reasonable relationships between model parameters. As a result, the proposed amino acid substitutions that are accepted more closely resemble the site-specific frequencies observed in gene family alignments. The modular site-specific pseudo-energy function is available for download at: https://liberles.cst.temple.edu/Software/CASS/index.html.
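The idea of weighting pseudo-energy terms separately at each site can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the term names (`contact`, `solvation`), the weight values, and the Metropolis-style acceptance rule are hypothetical and are not taken from the paper or from the CASS software.

```python
import math
import random


def site_pseudo_energy(terms, weights):
    """Weighted sum of pseudo-energy terms for a single site.

    `terms` maps a term name to its value for the residue at this site;
    `weights` maps the same names to site-specific weights.
    Both dictionaries are hypothetical stand-ins for a real force field.
    """
    return sum(weights[name] * value for name, value in terms.items())


def acceptance_probability(e_current, e_proposed, temperature=1.0):
    """Metropolis-style acceptance probability (an assumed rule).

    Proposals that lower the pseudo-energy are always accepted;
    proposals that raise it are accepted with exp(-delta / temperature).
    """
    delta = e_proposed - e_current
    if delta <= 0:
        return 1.0
    return math.exp(-delta / temperature)


def accept_substitution(e_current, e_proposed, rng=random, temperature=1.0):
    """Draw a uniform random number and decide whether to accept."""
    return rng.random() < acceptance_probability(e_current, e_proposed, temperature)


# Hypothetical weights for one buried site: contacts matter more here.
weights = {"contact": 1.5, "solvation": 0.5}
current = site_pseudo_energy({"contact": -2.0, "solvation": -1.0}, weights)
proposed = site_pseudo_energy({"contact": -1.0, "solvation": -1.0}, weights)
print(current, proposed, acceptance_probability(current, proposed))
```

Because the weights are stored per site, a surface site could use a different `weights` dictionary with the same two functions, which is one simple way to express local heterogeneity in constraints.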
