Solar Roberto, Sepulveda Victor, Inostrosa-Psijas Alonso, Rojas Oscar, Gil-Costa Veronica, Marin Mauricio
1 Departamento de Ingeniería Informática, Centro de Innovación en Tecnologías de la Información para Aplicaciones Sociales, University of Santiago, Santiago de Chile, Chile.
2 Centre for Biotechnology and Bioengineering, Departamento de Ingeniería Informática, University of Santiago, Santiago de Chile, Chile.
J Comput Biol. 2019 Mar;26(3):266-279. doi: 10.1089/cmb.2018.0217. Epub 2019 Jan 9.
Approximate Bayesian computation (ABC) is a technique for performing Bayesian inference without requiring an explicit likelihood function. In population genetics, it is widely used to extract information about the evolutionary history of genetic data. ABC compares summary statistics computed on simulated and observed data sets. Typically, a forward-in-time approach is used to simulate the genetic material of a population, starting from an initial ancestral population and following the evolution of the individuals generation by generation under various demographic and genetic forces. This approach is computationally expensive, which makes high-performance computing crucial for reducing overall response times. In this work, we propose a fully distributed, web service-oriented platform for ABC based on forward-in-time simulations. Our proposal follows a client-server approach. The client enables users to define simulation scenarios. The server enables efficient and scalable population simulations and can be deployed on a distributed cluster of processors or in the cloud. It is composed of four services: a workload generator, a simulation controller, a simulation results analyzer, and a result builder. The server performs multithreaded simulations by executing a simulation kernel encapsulated in the proposed libgdrift library. We present and evaluate three libgdrift approaches whose algorithms aim to reduce execution time and memory consumption.
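The ABC procedure described in the abstract (draw parameters from a prior, simulate forward in time, compare summary statistics with the observed ones) can be illustrated with a minimal rejection-sampling sketch. It is not the libgdrift kernel or the platform's API: the function names, the Wright-Fisher drift model for a single biallelic locus, the uniform prior on population size, and the choice of mean heterozygosity as the summary statistic are all assumptions made for illustration.

```python
import random


def simulate_locus(pop_size, generations, init_freq, rng):
    """Forward-in-time Wright-Fisher drift at one biallelic locus.

    Each generation, every one of the pop_size gene copies picks its
    allele from the parental generation at random. Returns the final
    allele frequency. (Hypothetical helper, not part of libgdrift.)
    """
    count = round(init_freq * pop_size)
    for _ in range(generations):
        freq = count / pop_size
        count = sum(rng.random() < freq for _ in range(pop_size))
    return count / pop_size


def mean_heterozygosity(pop_size, generations, n_loci, rng):
    """Summary statistic: mean 2p(1-p) over independent loci starting at p = 0.5."""
    total = 0.0
    for _ in range(n_loci):
        p = simulate_locus(pop_size, generations, 0.5, rng)
        total += 2.0 * p * (1.0 - p)
    return total / n_loci


def abc_rejection(observed_stat, n_draws, tolerance, rng,
                  prior_range=(10, 100), generations=10, n_loci=10):
    """ABC rejection sampler for the population size.

    Draws a candidate size from a uniform prior (an assumption), runs the
    forward simulation, and keeps the candidate when the simulated summary
    statistic lies within `tolerance` of the observed one.
    """
    accepted = []
    for _ in range(n_draws):
        pop_size = rng.randint(*prior_range)
        simulated_stat = mean_heterozygosity(pop_size, generations, n_loci, rng)
        if abs(simulated_stat - observed_stat) <= tolerance:
            accepted.append(pop_size)
    return accepted


if __name__ == "__main__":
    rng = random.Random(42)
    # Stand-in for real data: a statistic simulated with a known size of 50.
    observed = mean_heterozygosity(50, 10, 10, rng)
    posterior = abc_rejection(observed, n_draws=300, tolerance=0.02, rng=rng)
    if posterior:
        mean_size = sum(posterior) / len(posterior)
        print(f"accepted {len(posterior)} draws, mean population size {mean_size:.1f}")
```

Each draw in `abc_rejection` is independent of the others, which is why this kind of workload lends itself to the multithreaded and distributed execution the abstract describes. The accepted values approximate the posterior only as well as the tolerance and the summary statistic allow.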