Concise representation of mass spectrometry images by probabilistic latent semantic analysis.

Author information

Hanselmann Michael, Kirchner Marc, Renard Bernhard Y, Amstalden Erika R, Glunde Kristine, Heeren Ron M A, Hamprecht Fred A

Affiliation

Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, Speyerer Strasse 4, Heidelberg, Germany.

Publication information

Anal Chem. 2008 Dec 15;80(24):9649-58. doi: 10.1021/ac801303x.

Abstract

Imaging mass spectrometry (IMS) is a promising technology which allows for detailed analysis of spatial distributions of (bio)molecules in organic samples. In many current applications, IMS relies heavily on (semi)automated exploratory data analysis procedures to decompose the data into characteristic component spectra and corresponding abundance maps, visualizing spectral and spatial structure. The most commonly used techniques are principal component analysis (PCA) and independent component analysis (ICA). Both methods operate in an unsupervised manner. However, their decomposition estimates usually feature negative counts and are not amenable to direct physical interpretation. We propose probabilistic latent semantic analysis (pLSA) for non-negative decomposition and the elucidation of interpretable component spectra and abundance maps. We compare this algorithm to PCA, ICA, and non-negative PARAFAC (parallel factors analysis) and show on simulated and real-world data that pLSA and non-negative PARAFAC are superior to PCA or ICA in terms of complementarity of the resulting components and reconstruction accuracy. We further combine pLSA decomposition with a statistical complexity estimation scheme based on the Akaike information criterion (AIC) to automatically estimate the number of components present in a tissue sample data set and show that this results in sensible complexity estimates.
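
As a rough illustration of the kind of non-negative decomposition described above, the following sketch fits a pLSA model by expectation-maximization with NumPy and computes an AIC score. It is not the authors' implementation: the function names `plsa` and `aic`, the pixel-by-channel layout of the count matrix `X`, the random initialization, the fixed iteration count, and the free-parameter count used in the AIC are all assumptions made for this example.

```python
import numpy as np


def plsa(X, n_components, n_iter=200, seed=0, eps=1e-12):
    """Fit P(channel | pixel) = sum_k P(k | pixel) * P(channel | k) by EM.

    X is a non-negative (n_pixels, n_channels) count matrix (assumed layout).
    Returns A (n_pixels, K), S (K, n_channels), and the log-likelihood.
    Rows of A and rows of S are non-negative and sum to one.
    """
    rng = np.random.default_rng(seed)
    n_pixels, n_channels = X.shape

    # Random non-negative starting point, normalized to probabilities.
    A = rng.random((n_pixels, n_components))
    A /= A.sum(axis=1, keepdims=True)
    S = rng.random((n_components, n_channels))
    S /= S.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Ratio of observed counts to the current model prediction.
        Q = X / (A @ S + eps)
        # Combined E and M steps, written as multiplicative updates
        # that both use the previous A and S.
        A_new = A * (Q @ S.T)
        S_new = S * (A.T @ Q)
        # Renormalize so each row is again a probability distribution.
        A = A_new / (A_new.sum(axis=1, keepdims=True) + eps)
        S = S_new / (S_new.sum(axis=1, keepdims=True) + eps)

    loglik = float(np.sum(X * np.log(A @ S + eps)))
    return A, S, loglik


def aic(loglik, n_pixels, n_channels, n_components):
    """AIC = 2 * (free parameters) - 2 * log-likelihood.

    The parameter count below is an assumption for this sketch:
    each row of A and each row of S loses one degree of freedom
    to its sum-to-one constraint.
    """
    n_params = n_pixels * (n_components - 1) + n_components * (n_channels - 1)
    return 2.0 * n_params - 2.0 * loglik


if __name__ == "__main__":
    # Synthetic stand-in for an image: 50 pixels, 40 mass channels.
    rng = np.random.default_rng(42)
    X = rng.poisson(5.0, size=(50, 40)).astype(float) + 1.0
    for k in (1, 2, 3, 4):
        _, _, ll = plsa(X, k)
        print(k, aic(ll, X.shape[0], X.shape[1], k))
```

Under these assumptions, one would fit the model for several candidate values of `n_components` and prefer the value with the lowest AIC. The rows of `S` play the role of component spectra, and the columns of `A`, reshaped to the image grid, play the role of abundance maps.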

