Computational Biochemistry Research Group, Department of Computer Science, ETH Zurich, Universitätstrasse 6, Zürich, Switzerland.
Mol Biol Evol. 2012 Apr;29(4):1115-23. doi: 10.1093/molbev/msr268. Epub 2011 Dec 8.
In computational evolutionary biology, verification and benchmarking are challenging because the evolutionary history of the biological entities under study is usually unknown. Computer programs that simulate sequence evolution in silico have been shown to be viable test beds for verifying newly developed methods and for comparing different algorithms. However, current simulation packages tend to focus either on gene-level aspects of genome evolution, such as character substitutions as well as insertions and deletions (indels), or on genome-level aspects, such as genome rearrangement and speciation events. Here, we introduce the Artificial Life Framework (ALF), which aims to simulate the entire range of evolutionary forces that act on genomes: nucleotide, codon, or amino acid substitution (under simple or mixture models), indels, GC-content amelioration, gene duplication, gene loss, gene fusion, gene fission, genome rearrangement, lateral gene transfer (LGT), and speciation. The other distinctive feature of ALF is its user-friendly yet powerful web interface. We illustrate the utility of ALF with two possible applications: 1) we reanalyze data from a study of selection after globin gene duplication and test the statistical significance of the original conclusions, and 2) we demonstrate that LGT can dramatically decrease the accuracy of two well-established orthology inference methods. ALF is available as a stand-alone application or via a web interface at http://www.cbrg.ethz.ch/alf.
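The benchmarking idea in the abstract, that a simulated history is known exactly and can therefore serve as ground truth, can be illustrated with a minimal sketch. This is not ALF's code or interface. The function names `evolve` and `simulate`, the fixed per-site substitution probability, and the lineage naming scheme are all assumptions made for illustration, and the model covers only nucleotide substitution and speciation, none of the other forces ALF simulates.

```python
import random

ALPHABET = "ACGT"


def evolve(seq, sub_prob, rng):
    """Return a copy of seq in which each site is substituted with
    probability sub_prob (hypothetical, highly simplified model)."""
    out = []
    for base in seq:
        if rng.random() < sub_prob:
            # Substitute with one of the three other nucleotides.
            out.append(rng.choice([b for b in ALPHABET if b != base]))
        else:
            out.append(base)
    return "".join(out)


def simulate(root_seq, n_speciations, sub_prob, seed=0):
    """Simulate a bifurcating history starting from root_seq.

    Returns (leaves, parent): the sequences of the surviving lineages
    and the true child-to-parent map, i.e. the known history that an
    inference method could be scored against.
    """
    rng = random.Random(seed)
    lineages = {"root": root_seq}
    parent = {}
    for i in range(n_speciations):
        # Pick one current lineage and split it into two descendants.
        name = rng.choice(sorted(lineages))
        seq = lineages.pop(name)
        for side in ("a", "b"):
            child = f"{name}.{i}{side}"
            lineages[child] = evolve(seq, sub_prob, rng)
            parent[child] = name
    return lineages, parent


if __name__ == "__main__":
    leaves, parent = simulate("ACGT" * 10, n_speciations=3, sub_prob=0.05)
    for name, seq in sorted(leaves.items()):
        print(name, seq)
```

Because `parent` records every speciation event, a tree reconstructed from the leaf sequences alone could be compared against it directly, which is the kind of check that is not possible with real data whose history is unknown.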