Predicting Amino Acid Substitution Probabilities Using Single Nucleotide Polymorphisms.

Author information

Rizzato Francesca, Rodriguez Alex, Biarnés Xevi, Laio Alessandro

Affiliations

Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy.

Laboratory of Biochemistry, Institut Químic de Sarrià (IQS), Universitat Ramon Llull (URL), 08017 Barcelona, Spain.

Publication information

Genetics. 2017 Oct;207(2):643-652. doi: 10.1534/genetics.117.300078. Epub 2017 Jul 28.

Abstract

Fast genome sequencing offers invaluable opportunities for building updated and improved models of protein sequence evolution. We here show that Single Nucleotide Polymorphisms (SNPs) can be used to build a model capable of predicting the probability of substitution between amino acids in variants of the same protein in different species. The model is based on a substitution matrix inferred from the frequency of codon interchanges observed in a suitably selected subset of human SNPs, and predicts the substitution probabilities observed in alignments between Homo sapiens and related species at 85-100% of sequence identity better than any other approach we are aware of. The model gradually loses its predictive power at lower sequence identity. Our results suggest that SNPs can be employed, together with multiple sequence alignment data, to model protein sequence evolution. The SNP-based substitution matrix developed in this work can be exploited to better align protein sequences of related organisms, to refine the estimate of the evolutionary distance between protein variants from related species in phylogenetic trees and, in perspective, might become a useful tool for population analysis.
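As a rough illustration of the idea in the abstract, the following Python sketch tallies the amino acid interchanges implied by SNP codon pairs and normalizes them into conditional substitution frequencies. It is not the authors' procedure: the function name `substitution_probabilities`, the symmetric counting of each interchange, the exclusion of synonymous and stop-codon changes, and the toy input are all assumptions made for this example.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def substitution_probabilities(snp_codon_pairs):
    """Estimate P(b | a, a substitution occurred) from (codon, codon) SNP pairs.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for ref, alt in snp_codon_pairs:
        a, b = CODON_TABLE[ref], CODON_TABLE[alt]
        if a == b or "*" in (a, b):
            continue  # assumption: ignore synonymous and stop-codon changes
        # Assumption: the ancestral allele is unknown, so count both directions.
        counts[(a, b)] += 1
        counts[(b, a)] += 1

    # Row totals: how often each amino acid appears as the starting residue.
    totals = Counter()
    for (a, _), n in counts.items():
        totals[a] += n

    return {(a, b): n / totals[a] for (a, b), n in counts.items()}


# Toy data, invented for this example (not taken from any SNP database).
toy_pairs = [("GAT", "GAA"), ("GAT", "GAA"), ("GAT", "AAT"),
             ("GCT", "GCC"), ("TGG", "TGA")]
probs = substitution_probabilities(toy_pairs)
for (a, b), p in sorted(probs.items()):
    print(f"{a} -> {b}: {p:.3f}")
```

Each row of the resulting table sums to 1 over the observed target residues, which is the form a substitution probability matrix takes; a real analysis would additionally need the SNP selection step and the treatment of sequence identity described in the paper.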
