Multi-model and network inference based on ensemble estimates: avoiding the madness of crowds.

Author information

Michael P. H. Stumpf

Affiliations

School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Parkville, VIC 3010, Australia.

Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK.

Publication information

J R Soc Interface. 2020 Oct;17(171):20200419. doi: 10.1098/rsif.2020.0419. Epub 2020 Oct 21.

DOI:10.1098/rsif.2020.0419

PMID:33081645

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7653378/

Abstract

Recent progress in theoretical systems biology, applied mathematics and computational statistics allows us to compare quantitatively how well different candidate models describe a particular biological system. Model selection has been applied with great success to problems where a small number of models (typically fewer than 10) are compared, but recent studies have started to consider thousands and even millions of candidate models. Often, however, we are left with sets of models that are compatible with the data, and then we can use ensembles of models to make predictions. These ensembles can have very desirable characteristics but, as I show here, are not guaranteed to improve on individual estimators or predictors. I will show, in the cases of model selection and network inference, when we can trust ensembles and when we should be cautious. The analyses suggest that the careful construction of an ensemble (choosing good predictors) is of paramount importance, more than had perhaps been realized before: merely adding different methods does not suffice. The success of ensemble network inference methods is also shown to rest on their ability to suppress false-positive results. A Jupyter notebook that allows an assessment of ensemble estimators to be carried out is provided.
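The claim that ensembles are not guaranteed to improve on individual predictors can be illustrated with a simple majority-vote calculation. The sketch below is not taken from the paper or its notebook: it assumes an idealized ensemble of `n` independent binary predictors that are each correct with the same probability `p`, and the function name `majority_vote_accuracy` is hypothetical.

```python
from math import comb


def majority_vote_accuracy(p: float, n: int) -> float:
    """Probability that a strict majority of n voters is correct.

    Assumption (illustrative only): the n predictors are independent
    and each is correct with the same probability p.
    """
    if n % 2 == 0:
        raise ValueError("use an odd ensemble size so that ties cannot occur")
    # Sum the binomial probabilities of k correct votes for every k
    # that forms a strict majority (k > n / 2).
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


if __name__ == "__main__":
    # Compare weak (p < 0.5), uninformative (p = 0.5) and good (p > 0.5)
    # individual predictors across growing ensemble sizes.
    for p in (0.4, 0.5, 0.6):
        row = [round(majority_vote_accuracy(p, n), 3) for n in (1, 5, 21)]
        print(f"p={p}: {row}")
```

Under these assumptions, growing the ensemble helps only when the individual predictors are better than chance (`p > 0.5`); for `p < 0.5` the majority vote is worse than any single member. Read the other way, with `p` standing for a per-method false-positive rate below 0.5, the same expression gives the chance that a majority of methods reports the same spurious result, which falls as `n` grows. Real predictors are rarely independent, so this is only a rough guide.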

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/20c3/7653378/213f4b08de2a/rsif20200419-g1.jpg
