ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems.

Affiliation

Grupo de Arquitectura de Computadores, Universidade da Coruña, A Coruña, Spain.

Publication information

PLoS One. 2018 Apr 2;13(4):e0194361. doi: 10.1371/journal.pone.0194361. eCollection 2018.

Abstract

Biclustering techniques are gaining attention in the analysis of large-scale datasets because they identify two-dimensional submatrices in which both rows and columns are correlated. In this work we present ParBiBit, a parallel tool that accelerates the search for interesting biclusters in binary datasets, which are very popular in fields such as genetics, marketing, and text mining. It is based on the state-of-the-art sequential Java tool BiBit, which several studies have shown to be accurate, especially in scenarios that result in many large biclusters. ParBiBit uses the same methodology as BiBit (grouping the binary information into patterns) and provides the same results. Nevertheless, our tool significantly improves performance thanks to an efficient C++11 implementation that supports threads and MPI processes in order to exploit the compute capabilities of modern distributed-memory systems, which consist of several multicore CPU nodes interconnected through a network. Our performance evaluation with 18 representative input datasets on two different eight-node systems shows that our tool is significantly faster than the original BiBit. The C++ and MPI source code, which runs on Linux systems, and a reference manual are available at https://sourceforge.net/projects/parbibit/.
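
The pattern-grouping idea mentioned above can be illustrated with a minimal sequential C++11 sketch. It is not taken from the ParBiBit or BiBit sources: the names `Row`, `Bicluster`, `popcount64`, and `find_biclusters`, the restriction to at most 64 columns per row, and the `min_rows`/`min_cols` thresholds are assumptions made for illustration. The sketch ANDs every pair of rows to obtain a candidate column pattern, skips patterns that are too small or already seen, and collects all rows that contain the pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// One binary row packed into a 64-bit word (assumption: at most 64 columns).
using Row = std::uint64_t;

// Hypothetical result type: a column pattern plus the rows that contain it.
struct Bicluster {
    Row pattern;                    // bit set = column belongs to the bicluster
    std::vector<std::size_t> rows;  // indices of rows containing the pattern
};

// Portable population count (Kernighan's method): number of set bits.
inline int popcount64(Row x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// Pattern-based search in the spirit of the methodology described above.
// min_rows and min_cols are illustrative size thresholds.
std::vector<Bicluster> find_biclusters(const std::vector<Row>& data,
                                       int min_rows, int min_cols) {
    assert(min_rows >= 2 && min_cols >= 1);
    std::vector<Bicluster> out;
    std::set<Row> seen;  // patterns already processed
    for (std::size_t i = 0; i < data.size(); ++i) {
        for (std::size_t j = i + 1; j < data.size(); ++j) {
            const Row p = data[i] & data[j];          // columns shared by rows i and j
            if (popcount64(p) < min_cols) continue;   // too few columns
            if (!seen.insert(p).second) continue;     // pattern seen before
            Bicluster b{p, {}};
            for (std::size_t k = 0; k < data.size(); ++k) {
                if ((data[k] & p) == p) b.rows.push_back(k);  // row k contains p
            }
            if (static_cast<int>(b.rows.size()) >= min_rows) {
                out.push_back(std::move(b));
            }
        }
    }
    return out;
}
```

Calling `find_biclusters` on the four rows `0xE`, `0x6`, `0x7`, `0x8` with `min_rows = 2` and `min_cols = 2` yields a single bicluster with pattern `0x6` and rows 0, 1, and 2. In this sketch each value of `i` in the outer loop can be processed independently apart from the shared `seen` set, which suggests one plausible way to divide the work among threads or MPI processes; whether ParBiBit distributes work this way is not stated here.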
