An alignment confidence score capturing robustness to guide tree uncertainty.

Author information

Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel.

Publication information

Mol Biol Evol. 2010 Aug;27(8):1759-67. doi: 10.1093/molbev/msq066. Epub 2010 Mar 5.

Abstract

Multiple sequence alignment (MSA) is the basis for a wide range of comparative sequence analyses from molecular phylogenetics to 3D structure prediction. Sophisticated algorithms have been developed for sequence alignment, but in practice, many errors can be expected and extensive portions of the MSA are unreliable. Hence, it is imperative to understand and characterize the various sources of errors in MSAs and to quantify site-specific alignment confidence. In this paper, we show that uncertainties in the guide tree used by progressive alignment methods are a major source of alignment uncertainty. We use this insight to develop a novel method for quantifying the robustness of each alignment column to guide tree uncertainty. We build on the widely used bootstrap method for perturbing the phylogenetic tree. Specifically, we generate a collection of trees and use each as a guide tree in the alignment algorithm, thus producing a set of MSAs. We next test the consistency of every column of the MSA obtained from the unperturbed guide tree with respect to the set of MSAs. We name this measure the "GUIDe tree based AligNment ConfidencE" (GUIDANCE) score. Using the Benchmark Alignment data BASE benchmark as well as simulation studies, we show that GUIDANCE scores accurately identify errors in MSAs. Additionally, we compare our results with the previously published Heads-or-Tails score and show that the GUIDANCE score is a better predictor of unreliably aligned regions.
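The column-consistency step described in the abstract can be illustrated with a minimal sketch. It assumes the base MSA and the perturbed MSAs have already been produced by an aligner run with different guide trees, and that all of them list the same sequences in the same row order. The scoring rule used here (the fraction of perturbed MSAs that reproduce a column exactly) is a simplified stand-in chosen for illustration, not the scoring scheme published by the authors; the function names are hypothetical.

```python
from typing import List, Optional, Tuple

# A column is encoded by, for each row, the index of the residue within the
# ungapped sequence, or None where the row has a gap.
Column = Tuple[Optional[int], ...]


def columns_as_residue_indices(msa: List[str]) -> List[Column]:
    """Encode every column of an MSA so columns can be compared across MSAs."""
    counters = [0] * len(msa)  # next ungapped residue index for each row
    columns: List[Column] = []
    for col in zip(*msa):
        encoded: List[Optional[int]] = []
        for row, char in enumerate(col):
            if char == "-":
                encoded.append(None)
            else:
                encoded.append(counters[row])
                counters[row] += 1
        columns.append(tuple(encoded))
    return columns


def column_consistency(base_msa: List[str],
                       perturbed_msas: List[List[str]]) -> List[float]:
    """Score each base column by the fraction of perturbed MSAs containing it.

    Assumption: an exact-match rule, used only to illustrate the idea of
    testing base columns against a set of alternative alignments.
    """
    if not perturbed_msas:
        raise ValueError("at least one perturbed MSA is required")
    base_cols = columns_as_residue_indices(base_msa)
    perturbed_sets = [set(columns_as_residue_indices(m)) for m in perturbed_msas]
    return [
        sum(col in cols for cols in perturbed_sets) / len(perturbed_sets)
        for col in base_cols
    ]


# Toy example: two sequences, one base alignment, two alternative alignments.
base = ["AC-GT", "ACCGT"]
alt_same = ["AC-GT", "ACCGT"]     # identical to the base alignment
alt_shifted = ["A-CGT", "ACCGT"]  # gap placed one position earlier
scores = column_consistency(base, [alt_same, alt_shifted])
print(scores)
```

In this toy case the two columns around the gap are reproduced by only one of the two alternative alignments, so they receive a lower score than the flanking columns, which both alternatives agree on.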
