Mathematical and Statistical Methods Group - Biometris, Wageningen University and Research, Wageningen, The Netherlands.
Laboratory of Plant Physiology, Wageningen University and Research, Wageningen, The Netherlands.
BMC Bioinformatics. 2024 May 30;25(1):202. doi: 10.1186/s12859-024-05778-7.
In systems biology, an organism is viewed as a system of interconnected molecular entities. To understand how organisms function, it is essential to integrate information about the variation in the concentrations of those molecular entities. This information can be structured as a set of networks with interconnections and with some hierarchical relations between them. Few methods exist for the reconstruction of integrative networks.
In this work, we propose an integrative network reconstruction method in which the network organization for a particular type of omics data is guided by the network structure of a related type of omics data upstream in the omics cascade. The structure of these guiding data can either be known in advance or be estimated from the guiding data themselves.
The method consists of three steps. First, a network structure for the guiding data is provided. Next, responses in the target set are regressed on the full set of predictors in the guiding data, with a Lasso penalty to reduce the number of predictors and an L2 penalty on the differences between coefficients of predictors that share an edge in the network for the guiding data. Finally, a network is reconstructed on the fitted target responses, which are functions of the predictors in the guiding data. In this way, the target network is conditioned on the network of the guiding data.
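The three steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names (`edge_incidence`, `network_guided_lasso`, `fitted_response_network`), the scaling of the objective, the proximal-gradient solver, the penalty values, and the synthetic data are all assumptions made for illustration. The abstract does not say how the final network is reconstructed, so a thresholded correlation of the fitted responses is used here purely as a stand-in.

```python
import numpy as np


def edge_incidence(edges, p):
    """Edge-by-predictor matrix D with +1/-1 per edge, so that
    ||D b||^2 equals the sum of (b_j - b_k)^2 over guiding-network edges."""
    D = np.zeros((len(edges), p))
    for row, (j, k) in enumerate(edges):
        D[row, j] = 1.0
        D[row, k] = -1.0
    return D


def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage towards zero)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def network_guided_lasso(X, y, edges, lam1, lam2, n_iter=2000):
    """Step 2 (assumed scaling): minimise
        (1/2n)||y - X b||^2 + lam1 * ||b||_1
        + (lam2/2) * sum over edges (j, k) of (b_j - b_k)^2
    by proximal gradient descent."""
    n, p = X.shape
    D = edge_incidence(edges, p)
    # Hessian of the smooth part: least squares plus graph Laplacian D^T D.
    H = X.T @ X / n + lam2 * (D.T @ D)
    step = 1.0 / np.linalg.eigvalsh(H).max()  # 1 / Lipschitz constant
    Xty = X.T @ y / n
    b = np.zeros(p)
    for _ in range(n_iter):
        b = soft_threshold(b - step * (H @ b - Xty), step * lam1)
    return b


def fitted_response_network(X, B, threshold=0.5):
    """Step 3 (stand-in): connect targets whose fitted values X @ B
    have absolute correlation of at least `threshold`."""
    Y_hat = X @ B
    with np.errstate(invalid="ignore", divide="ignore"):
        C = np.corrcoef(Y_hat, rowvar=False)
    C = np.nan_to_num(C)  # targets with all-zero coefficients get no edges
    return (np.abs(C) >= threshold) & ~np.eye(B.shape[1], dtype=bool)


# Synthetic illustration: 6 guiding variables in two chains, 4 targets.
rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.standard_normal((n, p))
X = X - X.mean(axis=0)

# Step 1: the guiding network is taken as known here (hypothetical edges).
guide_edges = [(0, 1), (1, 2), (3, 4), (4, 5)]

signal_a = X[:, :3].sum(axis=1)  # drives targets 0 and 1
signal_b = X[:, 3:].sum(axis=1)  # drives targets 2 and 3
Y = np.column_stack([signal_a, 0.8 * signal_a, signal_b, 1.2 * signal_b])
Y = Y + 0.5 * rng.standard_normal((n, 4))
Y = Y - Y.mean(axis=0)

# Step 2: one penalised regression per target response.
B = np.column_stack(
    [network_guided_lasso(X, Y[:, t], guide_edges, lam1=0.05, lam2=0.1)
     for t in range(Y.shape[1])]
)

# Step 3: network on the fitted target responses.
adjacency = fitted_response_network(X, B)
print(adjacency.astype(int))
```

In this toy setting, targets that depend on the same connected group of guiding variables end up linked, which mirrors the idea of finding groups of target variables with a similar upstream basis. In practice the two penalty weights would need to be tuned, for example by cross-validation.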
We illustrate our approach on two examples in Arabidopsis. The method detects groups of metabolites that have a similar genetic or transcriptomic basis.