Jožef Stefan Institute , Jamova 39, SI-1000 Ljubljana, Slovenia.
J Chem Inf Model. 2013 Sep 23;53(9):2217-28. doi: 10.1021/ci4002525. Epub 2013 Sep 5.
A new exact parallel maximum clique algorithm MaxCliquePara, which finds the maximum clique (the fully connected subgraph) in undirected general and protein graphs, is presented. First, a new branch and bound algorithm for finding a maximum clique on a single computer core, which builds on ideas presented in two published state of the art sequential algorithms is implemented. The new sequential MaxCliqueSeq algorithm is faster than the reference algorithms on both DIMACS benchmark graphs as well as on protein-derived product graphs used for protein structural comparisons. Next, the MaxCliqueSeq algorithm is parallelized by splitting the branch-and-bound search tree to multiple cores, resulting in MaxCliquePara algorithm. The ability to exploit all cores efficiently makes the new parallel MaxCliquePara algorithm markedly superior to other tested algorithms. On a 12-core computer, the parallelization provides up to 2 orders of magnitude faster execution on the large DIMACS benchmark graphs and up to an order of magnitude faster execution on protein product graphs. The algorithms are freely accessible on http://commsys.ijs.si/~matjaz/maxclique.
本文提出了一种新的精确并行最大团算法 MaxCliquePara,用于在无向一般图和蛋白质图中寻找最大团(完全连通子图)。首先,实现了一种新的基于单计算机核心上最大团的分支定界算法,该算法建立在两个已发表的最先进的序列算法中提出的思想之上。新的序列 MaxCliqueSeq 算法在 DIMACS 基准图以及用于蛋白质结构比较的蛋白质衍生产物图上都比参考算法更快。接下来,通过将分支定界搜索树划分为多个核心,对 MaxCliqueSeq 算法进行并行化,得到 MaxCliquePara 算法。新的并行 MaxCliquePara 算法能够有效地利用所有核心,因此明显优于其他测试算法。在 12 核计算机上,并行化在大型 DIMACS 基准图上提供了高达 2 个数量级的更快执行速度,在蛋白质产物图上提供了高达 1 个数量级的更快执行速度。这些算法可在 http://commsys.ijs.si/~matjaz/maxclique 上免费获得。