Probabilistic Parallel Factor Analysis 2

Probabilistic PARAFAC2.

Authors

Jørgensen Philip J H, Nielsen Søren F, Hinrich Jesper L, Schmidt Mikkel N, Madsen Kristoffer H, Mørup Morten

Affiliations

Department of Applied Mathematics and Computer Science, Technical University of Denmark, 2800 Kongens Lyngby, Denmark.

Danish Research Centre for Magnetic Resonance, Centre for Functional and Diagnostic Imaging and Research, Copenhagen University Hospital Amager and Hvidovre, 2650 Hvidovre, Denmark.

Publication information

Entropy (Basel). 2024 Aug 17;26(8):697. doi: 10.3390/e26080697.

Abstract

Parallel Factor Analysis 2 (PARAFAC2) is a multimodal factor analysis model suited to multi-way data in which one of the modes has incomparable observation units, for example because of differences in signal sampling or batch sizes. A fully probabilistic treatment of PARAFAC2 is desirable because it can improve robustness to noise and provide a principled approach for determining the number of factors. It is challenging, however, because direct model fitting requires that the factor loadings be decomposed into two parts: a shared matrix specifying how the components are consistently co-expressed across samples, and sample-specific, orthogonality-constrained component profiles. We develop two probabilistic formulations of the PARAFAC2 model along with variational Bayesian procedures for inference. In the first, the mean values of the factor loadings are orthogonal, which leads to closed-form variational updates; in the second, the factor loadings themselves are orthogonal, modeled with a matrix von Mises-Fisher distribution. We compare our probabilistic formulations with the conventional maximum-likelihood direct fitting algorithm on synthetic data and on real fluorescence spectroscopy and gas chromatography-mass spectrometry data, showing that the probabilistic formulations are more robust to noise and to model order misspecification. The probabilistic PARAFAC2 thus forms a promising framework for modeling multi-way data while accounting for uncertainty.
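The decomposition mentioned in the abstract (a shared matrix plus sample-specific orthogonality-constrained profiles) can be illustrated with a small NumPy sketch. This is not the authors' code, and it shows only the deterministic model structure, not the variational Bayesian inference. The notation is an assumption taken from the common PARAFAC2 convention: each slice is X_k = A D_k B_k^T with B_k = P_k F, where P_k has orthonormal columns and F is shared across slices. The function name `simulate_parafac2` and all parameter names are hypothetical.

```python
import numpy as np


def simulate_parafac2(slice_sizes, n_rows, rank, noise_std=0.0, seed=0):
    """Draw synthetic slices X_k = A D_k B_k^T with B_k = P_k F (assumed notation).

    slice_sizes: number of columns J_k of each slice; they may differ,
                 but each must be >= rank so P_k can have orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n_rows, rank))        # loadings shared by all slices
    F = rng.standard_normal((rank, rank))          # shared matrix in B_k = P_k F
    C = rng.uniform(0.5, 1.5, size=(len(slice_sizes), rank))  # diagonals of D_k

    slices, profiles = [], []
    for k, n_cols in enumerate(slice_sizes):
        # Reduced QR of a random matrix gives P_k with P_k^T P_k = I.
        P_k, _ = np.linalg.qr(rng.standard_normal((n_cols, rank)))
        B_k = P_k @ F                              # sample-specific profile
        X_k = A @ np.diag(C[k]) @ B_k.T            # noise-free slice, n_rows x J_k
        X_k = X_k + noise_std * rng.standard_normal(X_k.shape)
        slices.append(X_k)
        profiles.append(B_k)
    return slices, profiles, A, F, C


slices, profiles, A, F, C = simulate_parafac2([5, 8, 6], n_rows=4, rank=2)

# The PARAFAC2 constraint: B_k^T B_k equals F^T F for every slice,
# even though the slices have different numbers of columns.
for B_k in profiles:
    assert np.allclose(B_k.T @ B_k, F.T @ F)
```

The slices have different widths (5, 8, and 6 columns here), which is the "incomparable observation units" situation the abstract describes; the constant cross-product B_k^T B_k is what ties them together.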


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fcb/11354162/5041a1ce80e9/entropy-26-00697-g001.jpg
