Utilizing the Heterogeneity of Clinical Data for Model Refinement and Rule Discovery Through the Application of Genetic Algorithms to Calibrate a High-Dimensional Agent-Based Model of Systemic Inflammation.

Author information

Cockrell Chase, An Gary

Affiliations

Department of Surgery, Larner College of Medicine, The University of Vermont, Burlington, VT, United States.

Publication information

Front Physiol. 2021 May 19;12:662845. doi: 10.3389/fphys.2021.662845. eCollection 2021.

Abstract

Accounting for biological heterogeneity represents one of the greatest challenges in biomedical research. Dynamic computational and mathematical models can be used to enhance the study and understanding of biological systems, but traditional methods for calibration and validation commonly do not account for the heterogeneity of biological data, which may result in overfitting and brittleness of these models. Herein we propose a machine learning approach that utilizes genetic algorithms (GAs) to calibrate and refine an agent-based model (ABM) of acute systemic inflammation, with a focus on accounting for the heterogeneity seen in a clinical data set, thereby avoiding overfitting and increasing the robustness and potential generalizability of the underlying simulation model. Agent-based modeling is a frequently used method for multi-scale mechanistic modeling. However, the same properties that make ABMs well suited to representing biological systems also present significant challenges with respect to their construction and calibration, particularly with respect to the selection of potential mechanistic rules and the large number of associated free parameters. We have proposed that machine learning approaches (such as GAs) can be used to deal more effectively and efficiently with rule selection and parameter space characterization; the current work applies GAs to the challenge of calibrating a complex ABM to a specific data set, while preserving biological heterogeneity reflected in the range and variance of the data. This project uses a GA to augment the rule-set of a previously validated ABM of acute systemic inflammation, the Innate Immune Response ABM (IIRABM), so that it fits clinical time series data of systemic cytokine levels from a population of burn patients. The genome for the GA is a vector generated from the IIRABM's Model Rule Matrix (MRM), which is a matrix representation of not only the constants/parameters associated with the IIRABM's cytokine interaction rules, but also the existence of the rules themselves. Capturing heterogeneity is accomplished by a fitness function that incorporates the sample value range ("error bars") of the clinical data. The GA-enabled parameter space exploration resulted in a set of putative MRM rules and associated parameterizations which closely match the cytokine time course data used to design the fitness function. The number of non-zero elements in the MRM increases significantly as the model parameterizations evolve toward a fitness function minimum, transitioning from a sparse to a dense matrix. This results in a model structure that more closely resembles (at a superficial level) the structure of data generated by a standard differential gene expression experimental study. We present a high-performance computing (HPC)-enabled machine learning/evolutionary computing approach to calibrate a complex ABM to complex clinical data while preserving biological heterogeneity. The integration of machine learning, HPC, and multi-scale mechanistic modeling provides a pathway forward to more effectively representing the heterogeneity of clinical populations and their data.
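The ideas in the abstract (a genome obtained by flattening a rule matrix, a fitness function built from the range of the clinical data, and the count of non-zero matrix entries) can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the function names, the row-major flattening, the absolute-difference penalty on the simulated versus clinical minimum and maximum, and the Gaussian mutation are hypothetical and are not taken from the IIRABM code or from the authors' actual fitness function.

```python
import numpy as np


def mrm_to_genome(mrm):
    """Flatten a rule matrix into a 1-D genome (row-major order, an assumption)."""
    return np.asarray(mrm, dtype=float).ravel()


def genome_to_mrm(genome, shape):
    """Rebuild the rule matrix from a genome vector."""
    return np.asarray(genome, dtype=float).reshape(shape)


def range_fitness(sim_runs, data_min, data_max):
    """Hypothetical range-based fitness (lower is better).

    sim_runs: array of shape (n_replicates, n_timepoints) for one cytokine,
              i.e. repeated stochastic simulations under one parameterization.
    data_min, data_max: clinical lower and upper bounds at each time point.

    The penalty is the summed absolute mismatch between the simulated
    envelope (min and max across replicates) and the clinical range, so a
    candidate is rewarded for reproducing the spread of the data rather
    than only a mean trajectory.
    """
    sim_runs = np.asarray(sim_runs, dtype=float)
    sim_min = sim_runs.min(axis=0)
    sim_max = sim_runs.max(axis=0)
    lower_gap = np.abs(sim_min - np.asarray(data_min, dtype=float))
    upper_gap = np.abs(sim_max - np.asarray(data_max, dtype=float))
    return float(np.sum(lower_gap + upper_gap))


def mutate(genome, rate, sigma, rng):
    """Add Gaussian noise to a random subset of genes (illustrative operator).

    Because zero entries can become non-zero, mutation can switch a rule
    "on", which is one way a sparse matrix could become denser over time.
    """
    genome = np.asarray(genome, dtype=float)
    mask = rng.random(genome.shape) < rate
    noise = rng.normal(0.0, sigma, size=genome.shape)
    return np.where(mask, genome + noise, genome)


def density(mrm):
    """Fraction of non-zero entries in the rule matrix."""
    mrm = np.asarray(mrm)
    return np.count_nonzero(mrm) / mrm.size
```

In this sketch a candidate would be scored by converting its genome back into a matrix with `genome_to_mrm`, running the simulation several times (not shown), and passing the replicate trajectories to `range_fitness` together with the clinical bounds; `density` tracks how many matrix entries are non-zero across generations.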

