An improved distance matrix computation algorithm for multicore clusters.

Authors

Al-Neama Mohammed W, Reda Naglaa M, Ghaleb Fayed F M

Affiliations

Department of Mathematics, Faculty of Science, Al-Azhar University, Cairo, Egypt; Education College for Girls, Mosul University, Mosul, Iraq.

Department of Mathematics, Faculty of Science, Ain Shams University, Cairo, Egypt.

Publication information

Biomed Res Int. 2014;2014:406178. doi: 10.1155/2014/406178. Epub 2014 Jun 12.

DOI:10.1155/2014/406178
PMID:25013779
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4074972/
Abstract

The distance matrix has diverse uses in different research areas. Its computation is typically an essential task in most bioinformatics applications, especially in multiple sequence alignment. The explosive growth of biological sequence databases creates an urgent need to accelerate these computations. The DistVect algorithm was introduced in the paper of Al-Neama et al. (in press) as a recent approach for vectorizing distance matrix computation. It showed efficient performance in both sequential and parallel computing. However, the multicore cluster systems now available, with their scalability and performance/cost ratio, meet the need for more powerful and efficient performance. This paper proposes DistVect1, a highly efficient parallel vectorized algorithm for computing the distance matrix on multicore clusters. It reformulates the DistVect1 vectorized algorithm in terms of cluster primitives and derives an efficient approach to partitioning and scheduling computations that suits this type of architecture. The implementations use both the MPI and OpenMP libraries. Experimental results show that the proposed method achieves around a 3-fold speedup over SSE2. It also achieves speedups of more than 9 orders of magnitude compared with the publicly available parallel implementation used in ClustalW-MPI.
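To make the partitioning idea concrete, the following is a minimal Python sketch of computing a pairwise distance matrix by splitting the upper-triangle pairs across workers. It is not the DistVect1 algorithm: the normalized edit distance, the round-robin split, the thread pool standing in for MPI ranks and OpenMP threads, and all function names (`edit_distance`, `pair_distance`, `partition_pairs`, `distance_matrix`) are assumptions chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via row-by-row dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def pair_distance(a: str, b: str) -> float:
    """Placeholder metric (an assumption): edit distance over the longer length."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def partition_pairs(n: int, workers: int):
    """Deal the n*(n-1)/2 upper-triangle index pairs round-robin to workers."""
    pairs = list(combinations(range(n), 2))
    return [pairs[w::workers] for w in range(workers)]


def distance_matrix(seqs, workers: int = 4):
    """Fill a symmetric matrix, one chunk of pairs per worker."""
    n = len(seqs)
    matrix = [[0.0] * n for _ in range(n)]

    def run_chunk(chunk):
        # Each worker only reads the sequences and returns its own results.
        return [(i, j, pair_distance(seqs[i], seqs[j])) for i, j in chunk]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for results in pool.map(run_chunk, partition_pairs(n, workers)):
            for i, j, d in results:
                matrix[i][j] = matrix[j][i] = d  # the distance is symmetric
    return matrix


if __name__ == "__main__":
    for row in distance_matrix(["ACGT", "ACGA", "TTTT"], workers=2):
        print(row)
```

Only the upper triangle is computed because the distance is symmetric, which halves the work. The round-robin deal keeps the number of pairs per worker within one of each other; it does not balance by sequence length, which a real scheduler for this workload would likely need to consider.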

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/d63859cc1e24/BMRI2014-406178.alg.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/9656e005cd37/BMRI2014-406178.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/be57e866fe03/BMRI2014-406178.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/f4c0b0fa0e9d/BMRI2014-406178.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/d73a7c8c1ba3/BMRI2014-406178.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/6513d0e28de0/BMRI2014-406178.005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/e1968f611fc8/BMRI2014-406178.006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/3f6f96d5283d/BMRI2014-406178.007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/1b8a183ac9a1/BMRI2014-406178.008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3274/4074972/d07dbf6861ea/BMRI2014-406178.009.jpg