Suchard Marc A
Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, 90095-1766, USA.
Genetics. 2005 May;170(1):419-31. doi: 10.1534/genetics.103.025692. Epub 2005 Mar 21.
Horizontal gene transfer (HGT) plays a critical role in evolution across all domains of life with important biological and medical implications. I propose a simple class of stochastic models to examine HGT using multiple orthologous gene alignments. The models function in a hierarchical phylogenetic framework. The top level of the hierarchy is based on a random walk process in "tree space" that allows for the development of a joint probabilistic distribution over multiple gene trees and an unknown, but estimable species tree. I consider two general forms of random walks. The first form is derived from the subtree prune and regraft (SPR) operator that mirrors the observed effects that HGT has on inferred trees. The second form is based on walks over complete graphs and offers numerically tractable solutions for an increasing number of taxa. The bottom level of the hierarchy utilizes standard phylogenetic models to reconstruct gene trees given multiple gene alignments conditional on the random walk process. I develop a well-mixing Markov chain Monte Carlo algorithm to fit the models in a Bayesian framework. I demonstrate the flexibility of these stochastic models to test competing ideas about HGT by examining the complexity hypothesis. Using 144 orthologous gene alignments from six prokaryotes previously collected and analyzed, Bayesian model selection finds support for (1) the SPR model over the alternative form, (2) the 16S rRNA reconstruction as the most likely species tree, and (3) increased HGT of operational genes compared to informational genes.
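To illustrate why walks over complete graphs stay numerically tractable as the number of taxa grows, the following is a minimal sketch and not the model fitted in the paper. It assumes, purely for illustration, that the states are unrooted binary topologies and that the walk is a simple discrete-time random walk that jumps to any other topology with equal probability. Under those assumptions the k-step transition probabilities have a closed form, so no matrix over the full tree space is needed. The function names are hypothetical.

```python
from math import prod


def num_unrooted_topologies(n_taxa):
    """Count unrooted binary topologies, (2n-5)!!, for n_taxa >= 3."""
    return prod(range(1, 2 * n_taxa - 4, 2))


def complete_graph_walk(n_states, n_steps):
    """Closed-form k-step probabilities for a simple random walk on K_n.

    Returns (p_same, p_other): the probability of sitting on the starting
    vertex, and on any one specific other vertex, after n_steps moves.
    Assumption: each move picks one of the other n_states - 1 vertices
    uniformly at random.
    """
    # The transition matrix has eigenvalues 1 and -1/(n-1).
    r = (-1.0 / (n_states - 1)) ** n_steps
    p_same = 1.0 / n_states + (1.0 - 1.0 / n_states) * r
    p_other = 1.0 / n_states - r / n_states
    return p_same, p_other


def brute_force_walk(n_states, n_steps):
    """Propagate the full distribution step by step, as a check."""
    dist = [1.0] + [0.0] * (n_states - 1)
    for _ in range(n_steps):
        total = sum(dist)
        # Mass arriving at a vertex comes equally from every other vertex.
        dist = [(total - p) / (n_states - 1) for p in dist]
    return dist


if __name__ == "__main__":
    n_trees = num_unrooted_topologies(6)  # six taxa, as in the data set
    for k in (1, 2, 5):
        p_same, p_other = complete_graph_walk(n_trees, k)
        print(k, n_trees, p_same, p_other)
```

The closed form costs the same regardless of how many topologies there are, whereas an SPR-based walk needs the neighborhood structure of tree space, which is the trade-off the abstract points to.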