Flexible Bayesian Product Mixture Models for Vector Autoregressions.

Authors

Kundu Suprateek, Lukemire Joshua

Affiliations

Department of Biostatistics, The University of Texas MD Anderson Cancer Center, University of Texas, Houston, TX 77030, USA.

Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322, USA.

Publication

J Mach Learn Res. 2024 Apr;25.

Abstract

Bayesian non-parametric methods based on Dirichlet process mixtures have seen tremendous success in various domains and are appealing because they can borrow information by clustering samples that share identical parameters. However, such methods can face hurdles in heterogeneous settings where objects are expected to cluster only along a subset of axes or where clusters of samples share only a subset of identical parameters. We overcome such limitations by developing a novel class of product of Dirichlet process location-scale mixtures that enables independent clustering at multiple scales, which results in varying levels of information sharing across samples. First, we develop the approach for independent multivariate data. Subsequently, we generalize it to multivariate time-series data under the framework of multi-subject vector autoregressive (VAR) models, which is our primary focus and goes beyond parametric single-subject VAR models. We establish posterior consistency and develop efficient posterior computation for implementation. Extensive numerical studies involving VAR models show distinct advantages over competing methods in terms of estimation, clustering, and feature selection accuracy. Our resting-state fMRI analysis of data from the Human Connectome Project reveals biologically interpretable connectivity differences between distinct intelligence groups, while a second application to air pollution data illustrates superior forecasting accuracy compared with alternative methods.
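The idea of clustering independently along different parameter blocks can be illustrated with a toy sketch. This is not the authors' model or sampler. It only assumes that each block of parameters (for example, one row of a VAR coefficient matrix) receives its own partition of subjects, drawn from a Chinese restaurant process independently of the other blocks. The function names, the block structure, and the concentration value `alpha` are hypothetical choices made for illustration.

```python
import random


def crp_assignments(n, alpha, rng):
    """Draw cluster labels for n items from a Chinese restaurant process."""
    labels, counts = [], []
    for _ in range(n):
        # An existing cluster k has weight counts[k]; a new cluster has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1    # join an existing cluster
        labels.append(k)
    return labels


def product_partition(n_subjects, n_blocks, alpha, seed=0):
    """Hypothetical helper: one independent CRP partition per parameter block."""
    rng = random.Random(seed)
    return [crp_assignments(n_subjects, alpha, rng) for _ in range(n_blocks)]


partitions = product_partition(n_subjects=8, n_blocks=3, alpha=1.0)
for block, labels in enumerate(partitions):
    print(f"block {block}: {labels}")
```

Under these assumptions, two subjects may share a label in one block and differ in another, which is the sense in which samples can share only a subset of parameters. A single Dirichlet process mixture would instead force one partition across all blocks.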
