Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
Nucleic Acids Res. 2010 Jul;38(Web Server issue):W23-8. doi: 10.1093/nar/gkq443. Epub 2010 May 23.
Evaluating the accuracy of a multiple sequence alignment (MSA) is critical for virtually every comparative sequence analysis that uses an MSA as input. Here we present the GUIDANCE web-server, a user-friendly, open-access tool for identifying unreliable alignment regions. The web-server accepts a set of unaligned sequences as input, aligns them, and provides a simple color-coded visualization of the confidence score of each column, residue and sequence in the alignment. The method is generic: the user can choose the alignment algorithm (ClustalW, MAFFT and PRANK are supported) and can submit any type of molecular sequence (nucleotide, protein or codon). The server implements two algorithms for computing confidence scores: (i) the heads-or-tails (HoT) method, which measures alignment uncertainty due to co-optimal solutions; and (ii) the GUIDANCE method, which measures the robustness of the alignment to guide-tree uncertainty. The server projects the confidence scores onto the MSA and flags columns and sequences that are unreliably aligned. These can be removed automatically in preparation for downstream analyses. GUIDANCE is freely available at http://guidance.tau.ac.il.
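The idea of scoring each column and then removing unreliable ones can be illustrated with a minimal Python sketch. It is not the server's implementation: the abstract does not give the scoring formula, so the sketch assumes a simple pair-support score, namely the fraction of aligned residue pairs in a column of the base MSA that are also aligned in a set of alternative MSAs (for example, alignments built from perturbed guide trees, which are taken here as given). The function names and the `cutoff` parameter are hypothetical.

```python
from itertools import combinations


def residue_pairs_by_column(msa):
    """Return, for each column, the set of residue pairs aligned in it.

    A residue is identified as (sequence index, ungapped position), so the
    same residue can be recognized across different alignments.
    """
    n_cols = len(msa[0])
    positions = [0] * len(msa)  # next ungapped position in each sequence
    columns = []
    for c in range(n_cols):
        residues = []
        for s, row in enumerate(msa):
            if row[c] != "-":
                residues.append((s, positions[s]))
                positions[s] += 1
        columns.append(set(combinations(residues, 2)))
    return columns


def column_scores(base_msa, alt_msas):
    """Assumed score: mean fraction of a column's residue pairs that are
    also aligned in each alternative MSA. Columns with fewer than two
    residues have no pairs and get None."""
    base_cols = residue_pairs_by_column(base_msa)
    alt_pair_sets = [set().union(*residue_pairs_by_column(a)) for a in alt_msas]
    scores = []
    for pairs in base_cols:
        if not pairs or not alt_pair_sets:
            scores.append(None)
            continue
        support = sum(len(pairs & alt) for alt in alt_pair_sets)
        scores.append(support / (len(pairs) * len(alt_pair_sets)))
    return scores


def filter_columns(msa, scores, cutoff):
    """Keep only columns whose score is at least `cutoff` (hypothetical
    parameter); unscored columns are dropped as well."""
    keep = [i for i, s in enumerate(scores) if s is not None and s >= cutoff]
    return ["".join(row[i] for i in keep) for row in msa]


# Toy data: two sequences (ACG and AG) and two alternative alignments.
base = ["ACG", "A-G"]
alternatives = [["ACG", "AG-"], ["ACG", "A-G"]]
scores = column_scores(base, alternatives)
trimmed = filter_columns(base, scores, cutoff=0.9)
```

In this toy case the first column is supported by both alternative alignments, the second has no residue pair to score, and the third is supported by only one alternative, so a strict cutoff keeps just the first column.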