Holmes Ian H
Dept of Bioengineering, University of California, Berkeley, 94720, USA.
BMC Bioinformatics. 2017 May 12;18(1):255. doi: 10.1186/s12859-017-1665-1.
Despite the long-anticipated possibility of putting sequence alignment on the same footing as statistical phylogenetics, theorists have struggled to develop time-dependent evolutionary models for indels that are as tractable as the analogous models for substitution events.
This paper discusses progress in the area of insertion-deletion models, in view of recent work by Ezawa (BMC Bioinformatics 17:304, 2016; BMC Bioinformatics 17:397, 2016; BMC Bioinformatics 17:457, 2016) on the calculation of time-dependent gap length distributions in pairwise alignments, and current approaches for extending these methods from ancestor-descendant pairs to phylogenetic trees.
While approximations that use finite-state machines (Pair HMMs and transducers) currently represent the most practical approach to problems such as sequence alignment and phylogeny, more rigorous approaches that work directly with the matrix exponential of the underlying continuous-time Markov chain also show promise, especially in view of recent advances.
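As a point of reference for what "working directly with the matrix exponential" means, the following is a minimal sketch of the tractable substitution case that indel models aim to match. It is not a method from the paper. It assumes a Jukes-Cantor rate matrix with a hypothetical overall rate `mu`, and all function names are illustrative. The transition probabilities P(t) = exp(Qt) are computed numerically and can be checked against the known closed form for this model.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, squarings=10, terms=12):
    """Approximate exp(m) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scaled = [[x / 2 ** squarings for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term holds scaled^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def jukes_cantor_rate_matrix(mu):
    """Assumed parametrization: every off-diagonal rate is mu/4, rows sum to zero."""
    return [[-0.75 * mu if i == j else 0.25 * mu for j in range(4)]
            for i in range(4)]

def transition_matrix(mu, t):
    """P(t) = exp(Q t) for the continuous-time Markov chain with rate matrix Q."""
    q = jukes_cantor_rate_matrix(mu)
    return mat_exp([[x * t for x in row] for row in q])

def jukes_cantor_closed_form(mu, t):
    """Closed-form probabilities (same nucleotide, different nucleotide)."""
    decay = math.exp(-mu * t)
    return 0.25 + 0.75 * decay, 0.25 - 0.25 * decay
```

For a four-state substitution model the state space is finite, so the exponential is cheap to compute. For indel models the states are whole sequences or alignments, so the state space is unbounded, and the same calculation is much harder to carry out.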