Automatic detection of anchor points for multiple sequence alignment.

Affiliation

Partner Institute for Computational Biology, CAS-MPG, 320 Yue Yang Rd, 200031 Shanghai, China.

Publication information

BMC Bioinformatics. 2010 Sep 2;11:445. doi: 10.1186/1471-2105-11-445.

Abstract

BACKGROUND

Determining beforehand specific positions to align (anchor points) has proved valuable for the accuracy of automated multiple sequence alignment (MSA) software. This feature can be used manually, to incorporate biological expertise, or automatically, usually through pairwise similarity searches. Multiple local similarities are expected to be more suitable, because they are more biologically relevant. However, even good multiple local similarities can prove incompatible with the ordering of an alignment.

RESULTS

We use a recently developed algorithm to detect multiple local similarities, which returns subsets of positions in the sequences sharing similar contexts of appearance. In this paper, we first describe how to obtain, with the help of this method, subsets of positions that could form partial columns in an alignment. We then introduce a graph-theoretic algorithm to detect (and remove) positions in the partial columns that are inconsistent with a multiple alignment. For the time being, partial columns can be used as a guide by only a few MSA programs: ClustalW 2.0, DIALIGN 2 and T-Coffee. We test the effect of introducing these columns on the popular benchmark BAliBASE 3.
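To illustrate what "inconsistent with a multiple alignment" can mean, the following is a minimal sketch and not the algorithm of the paper, which the abstract does not specify. It assumes each partial column is given as a mapping from a sequence identifier to a position in that sequence, and that no position belongs to two columns. The function names (`precedence_graph`, `conflicting_columns`) and the example data are hypothetical. The idea is that column A must lie to the left of column B whenever some sequence has an earlier position in A than in B; if these constraints form a cycle, the columns involved cannot all appear in one alignment.

```python
def precedence_graph(columns):
    """Build succ[a] = set of columns that must lie to the right of column a.

    columns: list of dicts {sequence_id: position} (assumed input format).
    """
    succ = [set() for _ in columns]
    for a, col_a in enumerate(columns):
        for b, col_b in enumerate(columns):
            if a == b:
                continue
            shared = col_a.keys() & col_b.keys()
            # a precedes b if any shared sequence has its a-position first
            if any(col_a[s] < col_b[s] for s in shared):
                succ[a].add(b)
    return succ


def reachable_from(start, succ):
    """Return all columns reachable from `start` by following >= 1 edge."""
    seen = set()
    stack = list(succ[start])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(succ[node])
    return seen


def conflicting_columns(columns):
    """Return sorted indices of columns that lie on a precedence cycle."""
    succ = precedence_graph(columns)
    return sorted(i for i in range(len(columns))
                  if i in reachable_from(i, succ))


# Hypothetical example: column 1 comes after column 0 in s1
# but before it in s2, so the two cannot coexist.
example = [
    {"s1": 10, "s2": 12, "s3": 9},
    {"s1": 25, "s2": 5},
    {"s1": 40, "s2": 30, "s3": 33},
]
print(conflicting_columns(example))
```

This sketch only flags whole columns that take part in a conflict. The paper describes removing individual positions from partial columns, which is a finer-grained repair than what is shown here.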

CONCLUSIONS

We show that including our partial alignment columns as anchor points improves, on the whole, the accuracy of the aligner ClustalW on the benchmark BAliBASE 3.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0333/2942857/1a1feae8685d/1471-2105-11-445-1.jpg
