Modeling Dynamic Systems with Efficient Ensembles of Process-Based Models.

Authors

Simidjievski Nikola, Todorovski Ljupčo, Džeroski Sašo

Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia.

Jožef Stefan International Postgraduate School, Ljubljana, Slovenia.

Publication information

PLoS One. 2016 Apr 14;11(4):e0153507. doi: 10.1371/journal.pone.0153507. eCollection 2016.

DOI:10.1371/journal.pone.0153507

PMID:27078633

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4831761/

Abstract

Ensembles are a well-established machine learning paradigm that leads to accurate and robust models and is predominantly applied to predictive modeling tasks. An ensemble comprises a finite set of diverse predictive models whose combined output is expected to yield better predictive performance than an individual model. In this paper, we propose a new method for learning ensembles of process-based models of dynamic systems. The process-based modeling paradigm employs domain-specific knowledge to automatically learn models of dynamic systems from time-series observational data. Previous work has shown that ensembles based on sampling observational data (i.e., bagging and boosting) significantly improve the predictive performance of process-based models. However, this improvement comes at the cost of a substantial increase in the computational time needed for learning. To address this problem, the paper proposes a method that aims to learn ensembles of process-based models efficiently while maintaining their accurate long-term predictive performance. This is achieved by constructing ensembles by sampling domain-specific knowledge instead of sampling data. We apply the proposed method to, and evaluate its performance on, a set of automated predictive modeling problems in three lake ecosystems, using a library of process-based knowledge for modeling population dynamics. The experimental results identify the optimal design decisions regarding the learning algorithm. The results also show that the proposed ensembles yield significantly more accurate predictions of population dynamics than individual process-based models. Finally, while their predictive performance is comparable to that of ensembles obtained with the state-of-the-art methods of bagging and boosting, they are substantially more efficient.
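
The idea of sampling domain-specific knowledge rather than data can be illustrated with a small sketch. This is not the authors' implementation: the library contents, the function names (`sample_knowledge`, `learn_model`, `learn_ensemble`, `ensemble_predict`), the toy learner, and the averaging combiner are all assumptions made for illustration. Real process-based models are built from ODE process templates, whereas this toy uses simple functions of time. The point it shows is only that every ensemble member is learned from the full data set but from a different random subset of the knowledge library.

```python
import math
import random
from statistics import mean

# Hypothetical toy "knowledge library": candidate process templates, each a
# plain function of time. A real process-based library describes ODE terms.
LIBRARY = {
    "linear_growth": lambda t: t,
    "exponential_growth": lambda t: math.exp(0.5 * t),
    "saturating_growth": lambda t: t / (1.0 + t),
    "seasonal_forcing": lambda t: 1.0 + math.sin(t),
}

def sample_knowledge(library, n_keep, rng):
    """Keep a random subset of templates (sampling knowledge, not data)."""
    return rng.sample(sorted(library), n_keep)

def learn_model(template_names, observations):
    """Toy learner (an assumption): fit one scale factor per template by
    least squares and keep the template with the smallest squared error."""
    best = None
    for name in template_names:
        f = LIBRARY[name]
        num = sum(y * f(t) for t, y in observations)
        den = sum(f(t) ** 2 for t, _ in observations)
        scale = num / den if den else 0.0
        err = sum((y - scale * f(t)) ** 2 for t, y in observations)
        if best is None or err < best[0]:
            best = (err, name, scale)
    _, name, scale = best
    return name, (lambda t, f=LIBRARY[name], a=scale: a * f(t))

def learn_ensemble(observations, n_members=5, n_keep=2, seed=0):
    """Each member sees ALL observations but only part of the library."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        subset = sample_knowledge(LIBRARY, n_keep, rng)
        members.append(learn_model(subset, observations))
    return members

def ensemble_predict(members, t):
    """Combine member predictions by simple averaging (an assumption)."""
    return mean(model(t) for _, model in members)

# Synthetic observations generated from a saturating process with scale 2.
observations = [(t / 2, 2.0 * (t / 2) / (1.0 + t / 2)) for t in range(1, 11)]
ensemble = learn_ensemble(observations)
prediction = ensemble_predict(ensemble, 6.0)
```

In a bagging-style ensemble the loop would instead resample `observations` and search the whole library for every member. Restricting each member to a library subset shrinks the space of candidate models it has to search, which is a plausible reading of where the efficiency gain described in the abstract comes from.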
