Fast noisy long read alignment with multi-level parallelism.

Author information

Xia Zeyu, Yang Canqun, Peng Chenchen, Guo Yifei, Guo Yufei, Tang Tao, Cui Yingbo

Affiliations

College of Computer Science and Technology, National University of Defense Technology, 410073, Changsha, China.

National Supercomputer Center in Tianjin, 300457, Tianjin, China.

Publication information

BMC Bioinformatics. 2025 May 2;26(1):118. doi: 10.1186/s12859-025-06129-w.

Abstract

BACKGROUND

The advent of Single Molecule Real-Time (SMRT) sequencing has overcome many limitations of second-generation sequencing, such as limited read lengths and PCR amplification biases. However, longer reads increase data volume exponentially, and high error rates make many existing alignment tools inapplicable. Additionally, the performance bottleneck of a single CPU restricts the effectiveness of alignment algorithms for SMRT sequencing.

RESULTS

To address these challenges, we introduce ParaHAT, a parallel alignment algorithm for noisy long reads. ParaHAT exploits four levels of parallelism: vector-level, thread-level, process-level, and heterogeneous. We redesign the layouts of the dynamic programming matrices to eliminate data dependencies in base-level alignment, enabling effective vectorization. We further increase computational speed through heterogeneous parallelism and implement the algorithm for multi-node computing using MPI, overcoming the computational limits of a single node.
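The abstract does not say which matrix layout ParaHAT uses. As a rough illustration of how a layout change can remove data dependencies, the sketch below assumes an anti-diagonal (wavefront) ordering, a common technique for this kind of problem, applied to plain unit-cost edit distance rather than ParaHAT's actual scoring scheme. The function names are hypothetical. In the row-wise version each cell needs its left neighbour from the same row, so the inner loop is sequential; in the anti-diagonal version every cell reads only the two previous diagonals, so the inner loop iterations are independent and could in principle be mapped to SIMD lanes.

```python
def edit_distance_rowwise(a: str, b: str) -> int:
    """Classic row-by-row DP; cur[j] depends on cur[j - 1] (same row)."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
        prev = cur
    return prev[len(b)]


def edit_distance_antidiagonal(a: str, b: str) -> int:
    """Same recurrence, but cells are visited by anti-diagonal d = i + j.

    Each diagonal is stored in a list indexed by i. A cell on diagonal d
    reads only diagonals d - 1 and d - 2, never its own diagonal.
    """
    n, m = len(a), len(b)
    inf = n + m + 1                 # larger than any possible distance
    d2 = [inf] * (n + 1)            # diagonal d - 2
    d1 = [inf] * (n + 1)            # diagonal d - 1
    for d in range(n + m + 1):
        cur = [inf] * (n + 1)
        lo, hi = max(0, d - m), min(n, d)
        # Iterations of this loop do not depend on each other.
        for i in range(lo, hi + 1):
            j = d - i
            if i == 0:
                cur[i] = j          # first row: j insertions
            elif j == 0:
                cur[i] = i          # first column: i deletions
            else:
                cost = 0 if a[i - 1] == b[j - 1] else 1
                cur[i] = min(
                    d1[i - 1] + 1,      # cell (i - 1, j)
                    d1[i] + 1,          # cell (i, j - 1)
                    d2[i - 1] + cost,   # cell (i - 1, j - 1)
                )
        d2, d1 = d1, cur
    return d1[n]                    # last diagonal holds cell (n, m)


if __name__ == "__main__":
    print(edit_distance_rowwise("kitten", "sitting"))
    print(edit_distance_antidiagonal("kitten", "sitting"))
```

Both functions return the same distance for any pair of strings; only the traversal order differs. A production aligner would additionally use affine gap penalties, banding, and real vector instructions, none of which are shown here.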

CONCLUSIONS

Performance evaluations show that ParaHAT achieves a 10.03x speedup in base-level alignment and, on 128 nodes, a parallel speedup ratio of 94.61 and a weak-scalability metric of 98.98%.
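To put the 128-node figures in context, the sketch below uses the textbook definitions of parallel efficiency (speedup divided by node count) and weak-scaling efficiency (single-node time divided by N-node time when the workload grows in proportion to N). The abstract does not state how its metrics were computed, so treating 94.61 as a speedup relative to one node is an assumption, and the function names are hypothetical.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Strong-scaling efficiency: achieved speedup over ideal speedup."""
    return speedup / nodes


def weak_scaling_efficiency(t_one_node: float, t_n_nodes: float) -> float:
    """Weak-scaling efficiency: runtime on 1 node over runtime on N nodes,
    with the per-node workload held constant."""
    return t_one_node / t_n_nodes


if __name__ == "__main__":
    # Assumption: 94.61 is the speedup on 128 nodes relative to 1 node.
    eff = parallel_efficiency(94.61, 128)
    print(f"{eff:.2%}")  # prints 73.91%
```

Under that reading, the reported speedup corresponds to roughly 74% of ideal linear scaling on 128 nodes, while the 98.98% weak-scalability figure would mean runtime stays nearly flat as nodes and data grow together.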

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/81b2/12049014/d01513e426d6/12859_2025_6129_Fig1_HTML.jpg
