NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

Authors

Ruyssinck Joeri, Huynh-Thu Vân Anh, Geurts Pierre, Dhaene Tom, Demeester Piet, Saeys Yvan

Affiliations

Department of Information Technology, Ghent University - iMinds, Gent, Belgium.

Department of Electrical Engineering and Computer Science & GIGA-R, Systems and Modeling, University of Liège, Liège, Belgium; School of Informatics, University of Edinburgh, Edinburgh, United Kingdom.

Publication information

PLoS One. 2014 Mar 25;9(3):e92709. doi: 10.1371/journal.pone.0092709. eCollection 2014.

DOI:10.1371/journal.pone.0092709

PMID:24667482

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3965471/

Abstract

One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene expression measurements. In these challenges the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into separate regression problems for each gene in the network in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure for each predictor gene is calculated with respect to the target gene and a high feature importance is considered as putative evidence of a regulatory link existing between both genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach which allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that this approach outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
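
The three ideas in the abstract (one regression problem per target gene, a subsampling wrapper that turns any feature ranker into an ensemble importance score, and rank-wise averaging across methods) can be illustrated with a minimal sketch. This is not the authors' implementation. The two base rankers (absolute correlation and ridge coefficients) are simple stand-ins for the support vector regression, elastic net, random forest and symbolic regression methods named above. The scoring rule (credit each time a feature lands in the top fraction of a subsample's ranking), the function names, and the parameters `n_iter`, `sample_frac` and `top_frac` are assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np

def abs_corr_ranker(X, y):
    """Stand-in base ranker: |Pearson correlation| of each predictor with the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc) + 1e-12
    return np.abs(Xc.T @ yc) / denom

def ridge_ranker(X, y, alpha=1.0):
    """Stand-in base ranker: |coefficients| of a ridge fit on standardized predictors."""
    Xs = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    yc = y - y.mean()
    coef = np.linalg.solve(Xs.T @ Xs + alpha * np.eye(Xs.shape[1]), Xs.T @ yc)
    return np.abs(coef)

def ensemble_importance(ranker, X, y, n_iter=50, sample_frac=0.8,
                        top_frac=0.25, rng=None):
    """Assumed subsampling scheme: rank features on random subsets of the
    samples and report how often each feature lands in the top fraction."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    k = max(1, int(round(top_frac * p)))      # size of the "top" set
    m = max(2, int(round(sample_frac * n)))   # samples per subsample
    counts = np.zeros(p)
    for _ in range(n_iter):
        rows = rng.choice(n, size=m, replace=False)
        scores = ranker(X[rows], y[rows])
        counts[np.argsort(scores)[::-1][:k]] += 1
    return counts / n_iter

def infer_network(expr, ranker, seed=None, **kw):
    """Regression decomposition: one problem per target gene.
    expr has shape (samples, genes); W[i, j] scores the putative edge i -> j."""
    rng = np.random.default_rng(seed)
    n_genes = expr.shape[1]
    W = np.zeros((n_genes, n_genes))
    for j in range(n_genes):
        preds = np.array([i for i in range(n_genes) if i != j])
        W[preds, j] = ensemble_importance(ranker, expr[:, preds], expr[:, j],
                                          rng=rng, **kw)
    return W

def rank_average(score_matrices):
    """Rank-wise averaging: convert each method's edge scores to ranks
    (1 = strongest; ties broken by position) and average the ranks."""
    n = score_matrices[0].shape[0]
    off = ~np.eye(n, dtype=bool)              # ignore self-edges
    ranks = []
    for W in score_matrices:
        s = W[off]
        order = np.argsort(-s, kind="stable")
        r = np.empty(s.size, dtype=float)
        r[order] = np.arange(1, s.size + 1)
        ranks.append(r)
    out = np.full((n, n), np.nan)             # diagonal stays NaN
    out[off] = np.mean(ranks, axis=0)
    return out
```

A typical call would build one score matrix per ranker, for example `infer_network(expr, abs_corr_ranker, seed=1)` and `infer_network(expr, ridge_ranker, seed=1)`, and pass the list to `rank_average`; edges with the lowest mean rank are the ones most consistently supported across methods. Averaging ranks rather than raw scores sidesteps the fact that different rankers produce scores on incomparable scales.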

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2fbd/3965471/248160191012/pone.0092709.g001.jpg
