ReAlign-N: an integrated realignment approach for multiple nucleic acid sequence alignment, combining global and local realignments.

Author information

Zhai Yixiao, Zhou Tong, Wei Yanming, Zou Quan, Wang Yansu

Affiliations

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No.2006, Xiyuan Avenue, Pidu Zone, Chengdu 610054, China.

Institute of Digital Health, Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, No.1, Chengdian Road, Kecheng Zone, Quzhou 324003, China.

Publication information

NAR Genom Bioinform. 2024 Dec 18;6(4):lqae170. doi: 10.1093/nargab/lqae170. eCollection 2024 Dec.

Abstract

Accurate multiple sequence alignment (MSA) is essential for comprehensive biological sequence analysis. However, the complexity of evolutionary relationships often produces variations that generic alignment tools may not adequately address, and realignment is crucial to remedy this. Currently, realignment methods tailored for nucleic acid sequences are lacking, particularly for long sequences, so there is an urgent need for methods better suited to these challenges. This study presents ReAlign-N, a realignment method explicitly designed for multiple nucleic acid sequence alignment. ReAlign-N integrates global and local realignment strategies to improve accuracy. In the global realignment phase, ReAlign-N incorporates K-Band and an innovative memory-saving technique into the dynamic programming approach, ensuring high efficiency and minimal memory requirements for large-scale realignment tasks. The local realignment stage uses full matching and entropy scoring to identify low-quality regions and realigns them with MAFFT. Experimental results demonstrate that ReAlign-N consistently improves on the initial alignments for both simulated and real datasets. Furthermore, compared with ReformAlign, the only existing realignment tool for multiple nucleic acid sequences, ReAlign-N runs faster and uses less memory. The source code and test data for ReAlign-N are available on GitHub (https://github.com/malabz/ReAlign-N).
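To make the idea of entropy scoring concrete, the following is a minimal Python sketch of how per-column Shannon entropy could flag low-quality regions of an alignment. It is an illustration only, not ReAlign-N's implementation: the function names `column_entropy` and `low_quality_regions`, the threshold of 0.5 bits, the minimum region length, and the choice to count gaps as an ordinary symbol are all assumptions made for this example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: the gap character '-' is counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def low_quality_regions(alignment, threshold=0.5, min_len=2):
    """Return half-open (start, end) column ranges of consecutive
    high-entropy columns.

    threshold and min_len are hypothetical values chosen for illustration.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    regions = []
    start = None
    for j in range(n_cols):
        noisy = column_entropy([row[j] for row in alignment]) > threshold
        if noisy and start is None:
            start = j  # a candidate region opens here
        elif not noisy and start is not None:
            if j - start >= min_len:
                regions.append((start, j))
            start = None  # region closed by a well-conserved column
    if start is not None and n_cols - start >= min_len:
        regions.append((start, n_cols))
    return regions


# Toy alignment: columns 4 and 5 disagree, the rest match fully.
alignment = [
    "ACGTA-GTAC",
    "ACGTTAGTAC",
    "ACGTC-GTAC",
    "ACGTGCGTAC",
]
regions = low_quality_regions(alignment)
print(regions)
```

In this sketch a fully matched column has entropy zero and acts as a boundary, so each flagged range is bracketed by conserved columns. The sequences within a flagged range could then be extracted and passed to an external aligner such as MAFFT.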
