Bayesian inference for finite mixtures of univariate and multivariate skew-normal and skew-t distributions.

Affiliation

Department of Applied Statistics and Econometrics, Johannes Kepler Universität Linz, Linz, Austria.

Publication information

Biostatistics. 2010 Apr;11(2):317-36. doi: 10.1093/biostatistics/kxp062. Epub 2010 Jan 27.

Abstract

Skew-normal and skew-t distributions have proved to be useful for capturing skewness and kurtosis in data directly without transformation. Recently, finite mixtures of such distributions have been considered as a more general tool for handling heterogeneous data involving asymmetric behaviors across subpopulations. We consider such mixture models for both univariate and multivariate data. This allows robust modeling of high-dimensional multimodal and asymmetric data generated by popular biotechnological platforms such as flow cytometry. We develop Bayesian inference based on data augmentation and Markov chain Monte Carlo (MCMC) sampling. In addition to the latent allocations, data augmentation is based on a stochastic representation of the skew-normal distribution in terms of a random-effects model with truncated normal random effects. For finite mixtures of skew normals, this leads to a Gibbs sampling scheme that draws from standard densities only. This MCMC scheme is extended to mixtures of skew-t distributions based on representing the skew-t distribution as a scale mixture of skew normals. As an important application of our new method, we demonstrate how it provides a new computational framework for automated analysis of high-dimensional flow cytometric data. Using multivariate skew-normal and skew-t mixture models, we could model non-Gaussian cell populations rigorously and directly without transformation or projection to lower dimensions.
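
The stochastic representations mentioned in the abstract can be illustrated with a small univariate forward-sampling sketch. It is not the paper's Gibbs sampler, and the abstract does not give the paper's exact parameterization. The code assumes the standard textbook forms: a skew normal written as a location plus a half-normal (truncated normal) random effect plus Gaussian noise, and a skew-t obtained by dividing a standardized skew normal by the square root of a Gamma(nu/2, rate nu/2) scale variable. All names (xi, omega, alpha, nu, delta, and the function names) are assumptions chosen for illustration.

```python
import numpy as np


def sample_skew_normal(xi, omega, alpha, size, rng):
    """Draw skew-normal variates via the random-effects representation.

    Assumed parameterization: location xi, scale omega, shape alpha,
    with delta = alpha / sqrt(1 + alpha^2).
    """
    delta = alpha / np.sqrt(1.0 + np.square(alpha))
    # Random effect: |N(0, 1)| is a standard normal truncated to [0, inf).
    z = np.abs(rng.standard_normal(size))
    eps = rng.standard_normal(size)
    return xi + omega * (delta * z + np.sqrt(1.0 - delta**2) * eps)


def sample_skew_t(xi, omega, alpha, nu, size, rng):
    """Draw skew-t variates as a scale mixture of skew normals."""
    # Scale variable W ~ Gamma(shape=nu/2, rate=nu/2), i.e. scale=2/nu.
    w = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=size)
    standardized = sample_skew_normal(0.0, 1.0, alpha, size, rng)
    return xi + omega * standardized / np.sqrt(w)


def sample_skew_normal_mixture(weights, xis, omegas, alphas, size, rng):
    """Draw from a finite mixture; also return the latent allocations."""
    xis = np.asarray(xis, dtype=float)
    omegas = np.asarray(omegas, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Latent allocation: which component generated each observation.
    alloc = rng.choice(len(weights), size=size, p=weights)
    y = sample_skew_normal(xis[alloc], omegas[alloc], alphas[alloc], size, rng)
    return y, alloc
```

In a data-augmentation scheme of the kind the abstract describes, the allocations and the truncated normal random effects (and, for the skew-t, the scale variables) would be treated as unobserved quantities and sampled alongside the parameters. The sketch only shows the generative direction.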
