An Entropy-Based Approach to Model Selection with Application to Single-Cell Time-Stamped Snapshot Data.

Authors

Stewart William Cl, Jayaprakash Ciriyam, Das Jayajit

Affiliations

GIG Statistical Consulting, LLC.

Ohio State University Department of Physics.

Publication information

bioRxiv. 2025 Jan 4:2025.01.03.631247. doi: 10.1101/2025.01.03.631247.

Abstract

Recent single-cell experiments that measure copy numbers of over 40 proteins in individual cells at different time points [time-stamped snapshot (TSS) data] exhibit cell-to-cell variability. Because the same cells cannot be tracked over time, TSS data provide key information about the time evolution of protein abundances that could yield mechanisms that underlie signaling kinetics. We recently developed an approach based on the generalized method of moments (GMM) that estimates parameters of mechanistic models using TSS data. However, when multiple mechanistic models potentially explain the same TSS data, selecting the best model (i.e., model selection) is often challenging. Popular approaches like Kullback-Leibler divergence and Akaike's Information Criterion are difficult to implement because the distribution that gave rise to the "noisy" data is only known numerically and approximately. To perform model selection in this situation, we introduce an entropy-based approach that incorporates our GMM-based parameter estimation and commonly used estimators in kernel density estimation. Using simulated TSS data, we show that our approach can select the "ground truth" from a set of competing mechanistic models. Furthermore, we use a bootstrap procedure to compute model selection probabilities, which can be useful when measuring the relative support of a candidate model.
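
The abstract does not spell out the estimator, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes one-dimensional data at a single time point, assumes each candidate model has already been fitted (for example by GMM) and simulated, scores each model by a kernel-density estimate of the cross-entropy between the observed data and the model's simulated samples, and uses a bootstrap over the observed cells to obtain selection frequencies. All names (`normal_reference_bandwidth`, `cross_entropy`, `select_model`, `selection_probabilities`, `model_A`, `model_B`) and the toy Gaussian data are hypothetical.

```python
import math
import random


def normal_reference_bandwidth(samples):
    """Normal-reference rule of thumb for a Gaussian kernel: 1.06 * sd * n^(-1/5)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return 1.06 * math.sqrt(var) * n ** (-1 / 5)


def kde_log_density(x, samples, h):
    """Log of a Gaussian kernel density estimate at x (log-sum-exp for stability)."""
    exponents = [-0.5 * ((x - s) / h) ** 2 for s in samples]
    m = max(exponents)
    total = sum(math.exp(e - m) for e in exponents)
    return m + math.log(total) - math.log(len(samples) * h * math.sqrt(2 * math.pi))


def cross_entropy(data, model_samples):
    """Estimate -E_data[log p_model], with p_model replaced by a KDE of model samples."""
    h = normal_reference_bandwidth(model_samples)
    return -sum(kde_log_density(x, model_samples, h) for x in data) / len(data)


def select_model(data, simulated):
    """Return the name of the candidate with the smallest estimated cross-entropy."""
    scores = {name: cross_entropy(data, s) for name, s in simulated.items()}
    return min(scores, key=scores.get)


def selection_probabilities(data, simulated, n_boot=30, seed=0):
    """Bootstrap the observed cells and record how often each model is selected."""
    rng = random.Random(seed)
    counts = {name: 0 for name in simulated}
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        counts[select_model(resample, simulated)] += 1
    return {name: c / n_boot for name, c in counts.items()}


# Toy stand-ins (assumptions): "observed" abundances and samples simulated
# from two already-fitted candidate models.
rng = random.Random(1)
data = [rng.gauss(5.0, 1.0) for _ in range(100)]
simulated = {
    "model_A": [rng.gauss(5.0, 1.0) for _ in range(150)],  # matches the data
    "model_B": [rng.gauss(8.0, 1.0) for _ in range(150)],  # shifted mean
}

best = select_model(data, simulated)
probs = selection_probabilities(data, simulated)
print(best, probs)
```

In this toy setting the selection frequencies play the role of the model selection probabilities mentioned in the abstract: a frequency close to 1 for one candidate indicates strong relative support, while frequencies spread across candidates indicate that the data do not clearly discriminate between them.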

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/871b/11722210/bb23e3172767/nihpp-2025.01.03.631247v1-f0001.jpg
