David R Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada.
BMC Bioinformatics. 2011 May 17;12:168. doi: 10.1186/1471-2105-12-168.
Identifying recombinations in HIV is important for studying the epidemiology of the virus and helps in the design of potential vaccines and treatments. The widely used existing tool for this task applies the Viterbi algorithm to a hidden Markov model (HMM) of recombinant sequences.
We apply a new decoding algorithm for this HMM that improves prediction accuracy. Locating breakpoints exactly is usually impossible, since different subtypes are highly conserved in some sequence regions. Our algorithm therefore identifies recombination breakpoints up to a certain error tolerance, and it predicts their locations more accurately than the previous approach. Our implementation of the algorithm is available at http://www.cs.uwaterloo.ca/~jmtruszk/jphmm_balls.tar.gz.
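To illustrate why a breakpoint can often be placed only up to a tolerance, the following is a minimal sketch of plain Viterbi decoding on a toy two-subtype HMM. It is not the new decoding algorithm described above, and it is not the released implementation. The state names, symbols, probabilities, and the helper names `viterbi`, `breakpoints`, and `within_tolerance` are all assumptions made for this example. The symbol `c` stands for a conserved site that both subtypes emit with equal probability, so every breakpoint placed inside the run of `c` symbols scores the same.

```python
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return one most probable state path for obs (standard Viterbi)."""
    score = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = []
    for symbol in obs[1:]:
        prev = score[-1]
        col, ptr = {}, {}
        for s in states:
            # Best predecessor of state s at this position.
            best = max(states, key=lambda p: prev[p] + log_trans[p][s])
            col[s] = prev[best] + log_trans[best][s] + log_emit[s][symbol]
            ptr[s] = best
        score.append(col)
        back.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path

def breakpoints(path):
    """Indices i at which the state differs from the state at i - 1."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]

def within_tolerance(predicted, true, tol):
    """True if every true breakpoint has a prediction at distance <= tol."""
    return all(any(abs(p - t) <= tol for p in predicted) for t in true)

# Hypothetical toy model: subtypes A and B, with conserved symbol "c".
states = ["A", "B"]
log_start = {s: math.log(0.5) for s in states}
log_trans = {
    "A": {"A": math.log(0.95), "B": math.log(0.05)},
    "B": {"A": math.log(0.05), "B": math.log(0.95)},
}
log_emit = {
    "A": {"a": math.log(0.6), "b": math.log(0.1), "c": math.log(0.3)},
    "B": {"a": math.log(0.1), "b": math.log(0.6), "c": math.log(0.3)},
}

obs = "aaaaccccbbbb"  # conserved run at indices 4 to 7
path = viterbi(obs, states, log_start, log_trans, log_emit)
bps = breakpoints(path)
print("".join(path), bps)
```

In this toy example any breakpoint between indices 4 and 8 explains the sequence equally well, so which one Viterbi reports depends only on tie-breaking. Scoring a prediction with `within_tolerance(bps, [6], 2)` rather than by exact match reflects that ambiguity.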
By explicitly accounting for uncertainty in breakpoint positions, our algorithm offers more reliable predictions of recombination breakpoints in HIV-1. This work also documents a new application domain for our HMM decoding approach.