Efficient computation of motif discovery on Intel Many Integrated Core (MIC) Architecture.

Author information

College of Computer Science and Electronic Engineering & National Supercomputing Centre in Changsha, Hunan University, Changsha, 410082, China.

School of Computer Science, National University of Defense Technology, Changsha, 410073, China.

Publication information

BMC Bioinformatics. 2018 Aug 13;19(Suppl 9):282. doi: 10.1186/s12859-018-2276-1.

DOI:10.1186/s12859-018-2276-1

PMID:30367570

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6101076/

Abstract

BACKGROUND

Detection of novel sequence motifs is becoming increasingly essential in computational biology. However, high computational cost greatly constrains the efficiency of most motif discovery algorithms.

RESULTS

In this paper, we accelerate the MEME algorithm on the Intel Many Integrated Core (MIC) Architecture and present a parallel implementation of MEME, called MIC-MEME, based on a hybrid CPU/MIC computing framework. Our method focuses on parallelizing the starting-point search and improving the iteration update strategy of the algorithm. Benchmarked on an experimental platform with two Xeon Phi 3120 coprocessors, MIC-MEME achieved average overall-runtime speedups of 26.6 for the ZOOPS model and 30.2 for the OOPS model.

CONCLUSIONS

MIC-MEME has also been compared with state-of-the-art methods, and it shows good scalability with respect to dataset size and the number of MICs. Source code: https://github.com/hkwkevin28/MIC-MEME
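
The abstract describes distributing the starting-point search across a host CPU and MIC coprocessors and reports results as runtime speedups. The following is a minimal Python sketch of those two ideas only, not the MIC-MEME implementation: the function names `partition_starting_points` and `speedup`, the device labels, and the throughput weights are all hypothetical, and the abstract does not say how the actual load balancing is done.

```python
from typing import Dict


def partition_starting_points(n_candidates: int,
                              weights: Dict[str, float]) -> Dict[str, range]:
    """Split candidate indices 0..n_candidates-1 into contiguous chunks,
    one per device, sized in proportion to each device's weight.

    The weights are hypothetical relative throughputs, not measured values.
    """
    if n_candidates < 0:
        raise ValueError("n_candidates must be non-negative")
    if not weights or any(w <= 0 for w in weights.values()):
        raise ValueError("weights must be non-empty and strictly positive")

    total = sum(weights.values())
    devices = list(weights)
    chunks: Dict[str, range] = {}
    start = 0
    cumulative = 0.0
    for i, dev in enumerate(devices):
        cumulative += weights[dev]
        if i == len(devices) - 1:
            # The last device takes the remainder, so every candidate
            # is assigned exactly once.
            end = n_candidates
        else:
            end = round(n_candidates * cumulative / total)
        chunks[dev] = range(start, end)
        start = end
    return chunks


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup as the ratio of serial runtime to parallel runtime."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("runtimes must be positive")
    return serial_seconds / parallel_seconds


if __name__ == "__main__":
    # Hypothetical platform: one host CPU and two coprocessors,
    # each coprocessor assumed to be twice as fast as the host.
    plan = partition_starting_points(1000, {"cpu": 1.0, "mic0": 2.0, "mic1": 2.0})
    for device, chunk in plan.items():
        print(device, chunk.start, chunk.stop)
```

A static proportional split like this is the simplest choice; a dynamic work queue is a common alternative when per-candidate cost varies.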


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/86d6/6101076/c2d988981fe9/12859_2018_2276_Fig1_HTML.jpg

Similar articles

Heterogeneous computing architecture for fast detection of SNP-SNP interactions.

BMC Bioinformatics. 2014 Jun 25;15:216. doi: 10.1186/1471-2105-15-216.

A CPU/MIC Collaborated Parallel Framework for GROMACS on Tianhe-2 Supercomputer.

IEEE/ACM Trans Comput Biol Bioinform. 2019 Mar-Apr;16(2):425-433. doi: 10.1109/TCBB.2017.2713362. Epub 2017 Jun 16.

MCtandem: an efficient tool for large-scale peptide identification on many integrated core (MIC) architecture.

BMC Bioinformatics. 2019 Jul 17;20(1):397. doi: 10.1186/s12859-019-2980-5.

MEME SUITE: tools for motif discovery and searching.

Nucleic Acids Res. 2009 Jul;37(Web Server issue):W202-8. doi: 10.1093/nar/gkp335. Epub 2009 May 20.

Parallel algorithms for large-scale biological sequence alignment on Xeon-Phi based clusters.

BMC Bioinformatics. 2016 Jul 19;17 Suppl 9(Suppl 9):267. doi: 10.1186/s12859-016-1128-0.

EXTREME: an online EM algorithm for motif discovery.

Bioinformatics. 2014 Jun 15;30(12):1667-73. doi: 10.1093/bioinformatics/btu093. Epub 2014 Feb 14.

YAMDA: thousandfold speedup of EM-based motif discovery using deep learning libraries and GPU.

Bioinformatics. 2018 Oct 15;34(20):3578-3580. doi: 10.1093/bioinformatics/bty396.

The value of position-specific priors in motif discovery using MEME.

BMC Bioinformatics. 2010 Apr 9;11:179. doi: 10.1186/1471-2105-11-179.

An Algorithm for Motif Discovery with Iteration on Lengths of Motifs.

IEEE/ACM Trans Comput Biol Bioinform. 2015 Jan-Feb;12(1):136-41. doi: 10.1109/TCBB.2014.2351793.

Cited by

Grx4, Fep1, and Php4: analysis and expression response to different iron concentrations.

Front Genet. 2022 Dec 7;13:1069068. doi: 10.3389/fgene.2022.1069068. eCollection 2022.

HSMotifDiscover: identification of motifs in sequences composed of non-single-letter elements.

Bioinformatics. 2022 Aug 10;38(16):4036-4038. doi: 10.1093/bioinformatics/btac437.

Basis for using thioredoxin as an electron donor by Schizosaccharomyces pombe Gpx1 and Tpx1.

AMB Express. 2022 Apr 11;12(1):41. doi: 10.1186/s13568-022-01381-2.

Perspectives of Bioinformatics in Big Data Era.

Curr Genomics. 2019 Feb;20(2):79-80.

MCtandem: an efficient tool for large-scale peptide identification on many integrated core (MIC) architecture.

BMC Bioinformatics. 2019 Jul 17;20(1):397. doi: 10.1186/s12859-019-2980-5.
