García-Calvo Raúl, Guisado J L, Diaz-Del-Rio Fernando, Córdoba Antonio, Jiménez-Morales Francisco
Department of Computer Architecture and Technology, University of Seville, Seville, Spain.
Department of Condensed Matter Physics, University of Seville, Seville, Spain.
Evol Bioinform Online. 2018 Apr 10;14:1176934318767889. doi: 10.1177/1176934318767889. eCollection 2018.
Understanding the regulation of gene expression is one of the key problems in current biology. A promising method for that purpose is the determination of the temporal dynamics between known initial and ending network states, by using simple acting rules. The huge number of rule combinations and the inherently nonlinear nature of the problem make genetic algorithms an excellent candidate for finding optimal solutions. As this is a computationally intensive problem that requires long runtimes on conventional architectures for realistic network sizes, it is essential to accelerate this task. In this article, we study how to develop efficient parallel implementations of this method for the fine-grained parallel architecture of graphics processing units (GPUs) using the compute unified device architecture (CUDA) platform. An exhaustive and methodical study of various parallel genetic algorithm schemes (master-slave, island, cellular, and hybrid models) and various individual selection methods (roulette, elitist) is carried out for this problem. Several procedures that optimize the use of the GPU's resources are presented. We conclude that the implementation producing the best results, both from the performance and the genetic algorithm fitness perspectives, is to simulate a few thousand individuals grouped into a few islands using elitist selection. This model combines two powerful factors for discovering the best solutions: finding good individuals in a small number of generations, and introducing genetic diversity via relatively frequent and sizable migrations. As a result, we have even found the optimal solution for the analyzed gene regulatory network (GRN). In addition, a comparative study of the performance obtained by the different parallel implementations on GPU versus a sequential application on CPU is carried out.
In our tests, a multifold speedup was obtained for our optimized parallel implementation of the method on a medium-class GPU over an equivalent sequential single-core implementation running on a recent Intel i7 CPU. This work can provide useful guidance to researchers in biology, medicine, or bioinformatics on how to take advantage of parallelization on massively parallel devices and GPUs to apply nature-inspired metaheuristic algorithms to real-world problems (such as the method to solve the temporal dynamics of GRNs).
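The winning scheme described above (a few islands of elitist populations exchanging migrants at regular intervals) can be sketched in a few lines. The following is a minimal, generic island-model illustration, not the authors' CUDA implementation: the toy OneMax fitness function and all parameter values (number of islands, population size, migration interval and count) are illustrative assumptions chosen only to show the mechanics of elitist selection plus ring migration.

```python
import random

random.seed(42)

GENOME_LEN = 32          # toy OneMax problem: maximize the number of 1s
N_ISLANDS = 4            # "a few islands" (assumed value)
POP_PER_ISLAND = 20
GENERATIONS = 60
MIGRATION_INTERVAL = 5   # "relatively frequent" migration (assumed)
N_MIGRANTS = 4           # "sizable" migration per exchange (assumed)

def fitness(ind):
    return sum(ind)

def random_individual():
    return [random.randint(0, 1) for _ in range(GENOME_LEN)]

def crossover(a, b):
    cut = random.randrange(1, GENOME_LEN)
    return a[:cut] + b[cut:]

def mutate(ind, rate=1.0 / GENOME_LEN):
    return [1 - g if random.random() < rate else g for g in ind]

def evolve_island(pop):
    """One generation with elitist selection: parents are drawn from the
    fitter half, and the best individual survives unchanged."""
    pop = sorted(pop, key=fitness, reverse=True)
    elite, parents = pop[0], pop[: len(pop) // 2]
    children = [elite]  # elitism: keep the current best
    while len(children) < len(pop):
        a, b = random.sample(parents, 2)
        children.append(mutate(crossover(a, b)))
    return children

islands = [[random_individual() for _ in range(POP_PER_ISLAND)]
           for _ in range(N_ISLANDS)]

for gen in range(1, GENERATIONS + 1):
    islands = [evolve_island(pop) for pop in islands]
    if gen % MIGRATION_INTERVAL == 0:
        # Ring migration: each island sends copies of its best N_MIGRANTS
        # to the next island, replacing that island's worst individuals.
        bests = [sorted(pop, key=fitness, reverse=True)[:N_MIGRANTS]
                 for pop in islands]
        for i, pop in enumerate(islands):
            pop.sort(key=fitness)  # worst individuals first
            pop[:N_MIGRANTS] = [ind[:] for ind in bests[(i - 1) % N_ISLANDS]]

best = max((ind for pop in islands for ind in pop), key=fitness)
print("best fitness:", fitness(best))
```

In a GPU version along the lines studied in the article, the per-island evolution loop is the part mapped onto the device (for example, one thread block per island with migration through global memory), while this sketch keeps everything sequential for clarity.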