Minin Vladimir N, Dorman Karin S, Fang Fang, Suchard Marc A
Department of Biomathematics, David Geffen School of Medicine, University of California Los Angeles, CA 90095-1766, USA.
Bioinformatics. 2005 Jul 1;21(13):3034-42. doi: 10.1093/bioinformatics/bti459. Epub 2005 May 24.
We introduce a dual multiple change-point (MCP) model for recombination detection among aligned nucleotide sequences. The dual MCP model is an extension of the model introduced previously by Suchard and co-workers. In the original single MCP model, one change-point process is used to model spatial phylogenetic variation. Here, we show that using two change-point processes, one for spatial variation of tree topologies and the other for spatial variation of substitution process parameters, increases recombination detection accuracy. Statistical analysis is done in a Bayesian framework using reversible jump Markov chain Monte Carlo sampling to approximate the joint posterior distribution of all model parameters.
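The idea of two separate change-point processes can be pictured with a small sketch. It is only a toy illustration, not the authors' implementation: the function names `segment_index` and `dual_partition`, the alignment length, and the break-point positions are all hypothetical. It assumes each process is described by a list of sites at which a new segment starts, and shows how the two independent partitions combine into blocks of sites that share both a tree topology and a set of substitution parameters.

```python
from bisect import bisect_right


def segment_index(breakpoints, site):
    """Return the index of the segment that contains `site`.

    `breakpoints` lists the sites at which a new segment starts.
    """
    return bisect_right(sorted(breakpoints), site)


def dual_partition(n_sites, topology_breaks, parameter_breaks):
    """Combine two independent change-point partitions of an alignment.

    Returns (start, end, topology_segment, parameter_segment) tuples,
    with `end` exclusive. Break-points are assumed to lie strictly
    between 0 and n_sites.
    """
    cuts = sorted(set(topology_breaks) | set(parameter_breaks))
    starts = [0] + cuts
    ends = cuts + [n_sites]
    return [
        (s, e, segment_index(topology_breaks, s), segment_index(parameter_breaks, s))
        for s, e in zip(starts, ends)
    ]


# Hypothetical example: one topology change, two parameter changes.
blocks = dual_partition(1000, topology_breaks=[400], parameter_breaks=[250, 700])
for block in blocks:
    print(block)
```

In this toy setting only the change at site 400 alters the topology index, so only that site would count as a candidate recombination break-point. The changes at sites 250 and 700 affect the substitution parameters alone. A single change-point process would have to treat all three positions alike.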
We use primate mitochondrial DNA data with simulated recombination break-points at specific locations to compare the two models. We also analyze two real HIV sequences to identify recombination break-points using the dual MCP model.