Department of Applied Mathematics and Institute of Statistics, National Chung Hsing University, Taichung 402, Taiwan.
BMC Bioinformatics. 2012 Apr 12;13 Suppl 5(Suppl 5):S5. doi: 10.1186/1471-2105-13-S5-S5.
Cell populations in a biological sample can undergo gradual or sudden transitions among different states under particular conditions or stimuli, and these transitions can be detected and profiled using flow cytometric time course data. Such temporal profiles often contain features due to transient states that present unique modeling challenges. These features range from asymmetric non-Gaussian distributions to outliers and tail subpopulations, all of which need to be modeled with precision and rigor.
To ensure precision and rigor, we propose StateProfiler, a parametric modeling framework based on finite mixtures of skew t-Normal distributions, which are robust against non-Gaussian features caused by asymmetry and outliers in the data. Further, we present in StateProfiler a new greedy EM algorithm for fast and optimal model selection. The parsimonious approach of our greedy algorithm allows us to detect genuine dynamic variation in the key features as and when they appear in time course data. We also present a procedure to construct a well-fitted profile by merging any redundant model components in a way that minimizes the change in entropy of the resulting model. This allows precise profiling of unusually shaped distributions and of less well-separated features that may appear due to cellular heterogeneity, even within clonal populations.
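To make the model family concrete, the following is a minimal sketch of a univariate skew t-Normal density and a finite mixture of such densities. It assumes the commonly used parameterization f(x) = (2/σ) t_ν(z) Φ(λz) with z = (x − ξ)/σ, where t_ν is the Student-t density and Φ is the standard normal distribution function; the abstract itself does not state the parameterization. The function names (`t_pdf`, `norm_cdf`, `stn_pdf`, `mixture_pdf`) and the example parameter values are hypothetical and are not taken from StateProfiler.

```python
import math


def t_pdf(z, nu):
    """Standard Student-t density with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(z * z / nu))


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def stn_pdf(x, xi, sigma, lam, nu):
    """Skew t-Normal density (assumed parameterization).

    xi: location, sigma: scale (> 0), lam: skewness, nu: degrees of freedom.
    lam = 0 recovers a symmetric Student-t density; small nu gives heavy tails.
    """
    z = (x - xi) / sigma
    return 2.0 / sigma * t_pdf(z, nu) * norm_cdf(lam * z)


def mixture_pdf(x, weights, params):
    """Finite mixture density: weighted sum of skew t-Normal components.

    weights: mixing proportions that sum to 1.
    params: one (xi, sigma, lam, nu) tuple per component.
    """
    return sum(w * stn_pdf(x, *p) for w, p in zip(weights, params))


# Illustrative two-component mixture (made-up parameter values):
# a right-skewed main population and a heavier-tailed, left-skewed one.
example_weights = [0.6, 0.4]
example_params = [(0.0, 1.0, 2.0, 5.0), (4.0, 1.5, -1.0, 8.0)]
density_at_one = mixture_pdf(1.0, example_weights, example_params)
```

In this form the skewness parameter λ accommodates asymmetric populations and the degrees of freedom ν accommodate outliers, which is the sense in which such mixtures are robust to non-Gaussian features. Fitting the parameters (for example, by an EM algorithm) is not shown here.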
By modeling flow cytometric data measured over a time course and across marker space with StateProfiler, specific parametric characteristics of cellular states can be identified. The parameters are then tested statistically to learn global and local patterns of spatio-temporal change. We applied StateProfiler to identify the temporal features of yeast cell cycle progression following knockout of the S-phase triggering cyclins Clb5 and Clb6, and then compared the S-phase delay phenotypes due to differential regulation of the two cyclins. We also used StateProfiler to construct the temporal profile of clonal divergence underlying lineage selection in mammalian hematopoietic progenitor cells.