IEEE/ACM Trans Comput Biol Bioinform. 2022 May-Jun;19(3):1807-1816. doi: 10.1109/TCBB.2020.3037090. Epub 2022 Jun 3.
We present PALLAS, a practical method for gene regulatory network (GRN) inference from time series data, which employs penalized maximum likelihood and particle swarms for optimization. PALLAS is based on the Partially-Observable Boolean Dynamical System (POBDS) model and thus does not require ad-hoc binarization of the data. The penalty in the likelihood is a LASSO regularization term, which encourages the resulting network to be sparse. PALLAS scales to networks of realistic size without prior knowledge, by virtue of a novel continuous-discrete Fish School Search particle swarm algorithm that efficiently maximizes the penalized likelihood simultaneously over the discrete space of networks and the continuous space of observational parameters. The performance of PALLAS is demonstrated by a comprehensive set of experiments using synthetic data generated from real and artificial networks, as well as real time-series microarray and RNA-seq data, where it is compared to several other well-known methods for gene regulatory network inference. The results show that PALLAS can infer GRNs more accurately than the other methods, while working directly on gene expression data, without the need for ad-hoc binarization. PALLAS is a fully fledged program, written in Python, and available on GitHub (https://github.com/yukuntan92/PALLAS).
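To illustrate how a LASSO term favors sparse networks, the following is a minimal sketch of a penalized log-likelihood score. It is not taken from the PALLAS code base: the function name `penalized_log_likelihood`, the encoding of a network as a matrix with entries in {-1, 0, +1}, the penalty weight `lam`, and the log-likelihood values are all assumptions chosen for illustration.

```python
import numpy as np

def penalized_log_likelihood(log_likelihood, edges, lam):
    """Return the log-likelihood minus an L1 (LASSO) penalty on the edges.

    `edges` is a hypothetical regulation matrix: +1 for activation,
    -1 for inhibition, 0 for no interaction (an assumed encoding).
    """
    edges = np.asarray(edges)
    return log_likelihood - lam * np.abs(edges).sum()

# Two hypothetical candidate networks over three genes.
dense = np.array([[0, 1, -1],
                  [1, 0, 1],
                  [-1, 1, 0]])   # 6 edges
sparse = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [-1, 0, 0]])  # 3 edges

# Illustrative log-likelihood values, not results from the paper.
dense_score = penalized_log_likelihood(-100.0, dense, lam=2.0)
sparse_score = penalized_log_likelihood(-103.0, sparse, lam=2.0)

print(dense_score, sparse_score)
```

In this toy case the sparser network fits the data slightly worse but obtains the higher penalized score, which is the trade-off the LASSO term is meant to encode. In a full method, a score of this kind would be the objective that the particle swarm search maximizes.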