SIMPROT: using an empirically determined indel distribution in simulations of protein evolution.

Authors

Pang Andy, Smith Andrew D, Nuin Paulo A S, Tillier Elisabeth R M

Affiliation

Ontario Cancer Institute, University Health Network, Toronto, Ontario, Canada.

Publication

BMC Bioinformatics. 2005 Sep 27;6:236. doi: 10.1186/1471-2105-6-236.

Abstract

BACKGROUND

General protein evolution models help determine the baseline expectations for the evolution of sequences, and they have been widely used in sequence analysis and in the computer simulation of artificial sequence data sets.

RESULTS

We have developed a new method of simulating protein sequence evolution, including insertion and deletion (indel) events in addition to amino-acid substitutions. The simulation generates both the simulated sequence family and a true sequence alignment that captures the evolutionary relationships between amino acids from different sequences. Our statistical model for indel evolution is based on the empirical indel distribution determined by Qian and Goldstein. We have parameterized this distribution so that it applies to sequences diverged by varying evolutionary times and generalized it to provide flexibility in simulation conditions. Our method uses a Monte-Carlo simulation strategy, and has been implemented in a C++ program named Simprot.
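The Monte-Carlo strategy described above can be illustrated with a minimal sketch that evolves one ancestor into one descendant and records the true pairwise alignment as it goes. Everything specific here is an assumption made for illustration, not Simprot's implementation: the truncated power-law indel length distribution and its `exponent` stand in for the Qian and Goldstein distribution (whose fitted parameters are not given in this abstract), the uniform residue replacement stands in for a real substitution matrix, and the names `sample_indel_length`, `evolve`, and `indel_rate` are hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sample_indel_length(rng, max_len=20, exponent=1.8):
    """Draw an indel length from a truncated power law.

    Placeholder distribution (assumption): NOT the empirical
    Qian-Goldstein fit, only something with a similar heavy tail.
    """
    lengths = list(range(1, max_len + 1))
    weights = [k ** -exponent for k in lengths]
    return rng.choices(lengths, weights=weights, k=1)[0]


def evolve(ancestor, branch_length, rng, indel_rate=0.03):
    """Evolve `ancestor` along one branch; return the two aligned rows.

    Gaps ('-') in the ancestor row mark insertions; gaps in the
    descendant row mark deletions, so the pair is the true alignment.
    """
    # Per-site event probabilities grow with evolutionary time (assumed form).
    p_sub = 1.0 - math.exp(-branch_length)
    p_indel = 1.0 - math.exp(-indel_rate * branch_length)

    anc_row, desc_row = [], []
    i = 0
    while i < len(ancestor):
        if rng.random() < p_indel:
            length = sample_indel_length(rng)
            if rng.random() < 0.5:
                # Insertion: new residues appear before position i.
                for _ in range(length):
                    anc_row.append("-")
                    desc_row.append(rng.choice(AMINO_ACIDS))
            else:
                # Deletion: up to `length` ancestral residues are lost.
                for residue in ancestor[i:i + length]:
                    anc_row.append(residue)
                    desc_row.append("-")
                i += length
                continue
        # Homologous column; a substitution may redraw the same residue.
        residue = ancestor[i]
        if rng.random() < p_sub:
            residue = rng.choice(AMINO_ACIDS)
        anc_row.append(ancestor[i])
        desc_row.append(residue)
        i += 1
    return "".join(anc_row), "".join(desc_row)


rng = random.Random(0)
ancestor = "".join(rng.choice(AMINO_ACIDS) for _ in range(60))
anc_row, desc_row = evolve(ancestor, 0.5, rng)
print(anc_row)
print(desc_row)
```

Because each column is recorded at the moment the event is simulated, the alignment is known exactly rather than inferred. Simulating a whole sequence family would repeat this step along every branch of a tree, which this sketch does not attempt.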

CONCLUSION

Simprot will be useful for testing methods of analysis of protein sequence families, particularly alignment methods, phylogenetic tree building, detection of recombination and horizontal gene transfer, and homology detection, where knowing the true course of sequence evolution is essential.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/646c/1261159/53ce8afcf984/1471-2105-6-236-1.jpg
