Covidien, 60 Middletown Avenue, North Haven, CT 06473, USA.
Bioinformatics. 2011 Jan 1;27(1):31-7. doi: 10.1093/bioinformatics/btq621. Epub 2010 Nov 24.
Sequence alignment is one of the most popular tools of modern biology. NCBI's PSI-BLAST uses iterative model building to detect distant homologs with greater sensitivity than non-iterative BLAST. However, PSI-BLAST's performance is limited by its reliance on deterministic alignments. A semi-probabilistic alignment scheme such as hybrid alignment should allow better-informed model building and improved identification of homologous sequences, particularly remote homologs.
We have built a new version of PSI-BLAST, Hybrid PSI-BLAST, in which the Smith-Waterman alignment core is replaced by the hybrid alignment algorithm. The favorable statistical properties of the hybrid algorithm allow the introduction of position-specific gap penalties. This improves the position-specific modeling of protein families and results in an overall improvement in performance.
Source code, implemented in C and supported on Linux, is freely available for download at http://bioserv.mps.ohio-state.edu/HybridPSI.