
Memory efficient principal component analysis for the dimensionality reduction of large mass spectrometry imaging data sets.

Affiliation

Physical Sciences of Imaging in the Biomedical Sciences Doctoral Training Centre, School of Chemistry, University of Birmingham, Edgbaston, Birmingham, United Kingdom.

Publication information

Anal Chem. 2013 Mar 19;85(6):3071-8. doi: 10.1021/ac302528v. Epub 2013 Mar 6.

DOI:10.1021/ac302528v
PMID:23394348
Abstract

A memory efficient algorithm for the computation of principal component analysis (PCA) of large mass spectrometry imaging data sets is presented. Mass spectrometry imaging (MSI) enables two- and three-dimensional overviews of hundreds of unlabeled molecular species in complex samples such as intact tissue. PCA, in combination with data binning or other reduction algorithms, has been widely used in the unsupervised processing of MSI data and as a dimensionality reduction method prior to clustering and spatial segmentation. Standard implementations of PCA require the data to be stored in random access memory. This imposes an upper limit on the amount of data that can be processed, necessitating a compromise between the number of pixels and the number of peaks to include. With increasing interest in multivariate analysis of large 3D multislice data sets and ongoing improvements in instrumentation, the ability to retain all pixels and many more peaks is increasingly important. We present a new method which has no limitation on the number of pixels and allows an increased number of peaks to be retained. The new technique was validated against the MATLAB (The MathWorks Inc., Natick, Massachusetts) implementation of PCA (princomp) and then used to reduce, without discarding peaks or pixels, multiple serial sections acquired from a single mouse brain which was too large to be analyzed with princomp. Then, k-means clustering was performed on the reduced data set. We further demonstrate with simulated data of 83 slices, comprising 20,535 pixels per slice and equaling 44 GB of data, that the new method can be used in combination with existing tools to process an entire organ. MATLAB code implementing the memory efficient PCA algorithm is provided.
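The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common out-of-core approach, not the authors' MATLAB implementation. It assumes the pixels can be streamed in blocks (for example, one tissue section at a time) and accumulates a peaks-by-peaks scatter matrix, so memory use depends on the number of peaks rather than the number of pixels. The function names `chunked_pca` and `project` are hypothetical, and NumPy is used in place of MATLAB.

```python
import numpy as np


def chunked_pca(chunks, n_components):
    """PCA over an iterable of (pixels x peaks) blocks.

    Assumption: only one block is held in memory at a time; the running
    sums below are the only state that grows with the number of peaks.
    """
    n = 0
    total = None    # running sum of spectra, shape (peaks,)
    scatter = None  # running sum of X^T X, shape (peaks, peaks)
    for block in chunks:
        block = np.asarray(block, dtype=np.float64)
        if total is None:
            n_peaks = block.shape[1]
            total = np.zeros(n_peaks)
            scatter = np.zeros((n_peaks, n_peaks))
        n += block.shape[0]
        total += block.sum(axis=0)
        scatter += block.T @ block

    mean = total / n
    # Sample covariance recovered from the accumulated sums.
    cov = (scatter - n * np.outer(mean, mean)) / (n - 1)
    # eigh returns eigenvalues in ascending order; reorder to descending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order], eigvals[order]


def project(block, mean, components):
    """Scores for one block; a second streaming pass reduces all pixels."""
    return (np.asarray(block, dtype=np.float64) - mean) @ components
```

In this sketch a first pass over the blocks yields the loadings, and a second pass calls `project` on each block to obtain the reduced scores, which could then be passed to a clustering step such as k-means. Subtracting `n * outer(mean, mean)` from the raw scatter matrix can lose precision when intensities are very large relative to their variance, so a production implementation would likely use a more numerically careful update.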


Similar articles

1
Memory efficient principal component analysis for the dimensionality reduction of large mass spectrometry imaging data sets.
Anal Chem. 2013 Mar 19;85(6):3071-8. doi: 10.1021/ac302528v. Epub 2013 Mar 6.
2
Exploring three-dimensional matrix-assisted laser desorption/ionization imaging mass spectrometry data: three-dimensional spatial segmentation of mouse kidney.
Anal Chem. 2012 Jul 17;84(14):6079-87. doi: 10.1021/ac300673y. Epub 2012 Jul 5.
3
Dimensionality reduction and visualization in principal component analysis.
Anal Chem. 2008 Jul 1;80(13):4933-44. doi: 10.1021/ac800110w. Epub 2008 Jun 7.
4
Spatial and spectral correlations in MALDI mass spectrometry images by clustering and multivariate analysis.
Anal Chem. 2005 Oct 1;77(19):6118-24. doi: 10.1021/ac051081q.
5
Comparative urine analysis by liquid chromatography-mass spectrometry and multivariate statistics: method development, evaluation, and application to proteinuria.
J Proteome Res. 2007 Jan;6(1):194-206. doi: 10.1021/pr060362r.
6
Evaluation of Distance Metrics and Spatial Autocorrelation in Uniform Manifold Approximation and Projection Applied to Mass Spectrometry Imaging Data.
Anal Chem. 2019 May 7;91(9):5706-5714. doi: 10.1021/acs.analchem.8b05827. Epub 2019 Apr 25.
7
PCA based clustering for brain tumor segmentation of T1w MRI images.
Comput Methods Programs Biomed. 2017 Mar;140:19-28. doi: 10.1016/j.cmpb.2016.11.011. Epub 2016 Nov 24.
8
MALDI imaging combined with hierarchical clustering as a new tool for the interpretation of complex human cancers.
J Proteome Res. 2008 Dec;7(12):5230-6. doi: 10.1021/pr8005777.
9
imzML: Imaging Mass Spectrometry Markup Language: A common data format for mass spectrometry imaging.
Methods Mol Biol. 2011;696:205-24. doi: 10.1007/978-1-60761-987-1_12.
10
Multivariate denoising methods combining wavelets and principal component analysis for mass spectrometry data.
Proteomics. 2010 Jul;10(14):2564-72. doi: 10.1002/pmic.200900185.

Cited by

1
Processing Next-Generation Mass Spectrometry Imaging Data: Principal Component Analysis at Scale.
J Am Soc Mass Spectrom. 2024 Dec 4;35(12):3063-3069. doi: 10.1021/jasms.4c00314. Epub 2024 Oct 28.
2
The Application of a Random Forest Classifier to ToF-SIMS Imaging Data.
J Am Soc Mass Spectrom. 2024 Dec 4;35(12):2801-2814. doi: 10.1021/jasms.4c00324. Epub 2024 Oct 25.
3
A noise-robust deep clustering of biomolecular ions improves interpretability of mass spectrometric images.
Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad067.
4
Peak learning of mass spectrometry imaging data using artificial neural networks.
Nat Commun. 2021 Sep 20;12(1):5544. doi: 10.1038/s41467-021-25744-8.
5
Unsupervised machine learning for exploratory data analysis in imaging mass spectrometry.
Mass Spectrom Rev. 2020 May;39(3):245-291. doi: 10.1002/mas.21602. Epub 2019 Oct 11.
6
Chemometric Strategies for Sensitive Annotation and Validation of Anatomical Regions of Interest in Complex Imaging Mass Spectrometry Data.
J Am Soc Mass Spectrom. 2019 Nov;30(11):2278-2288. doi: 10.1007/s13361-019-02327-y. Epub 2019 Sep 16.
7
Deep data analysis via physically constrained linear unmixing: universal framework, domain examples, and a community-wide platform.
Adv Struct Chem Imaging. 2018;4(1):6. doi: 10.1186/s40679-018-0055-8. Epub 2018 Apr 30.
8
Interactive Visual Exploration of 3D Mass Spectrometry Imaging Data Using Hierarchical Stochastic Neighbor Embedding Reveals Spatiomolecular Structures at Full Data Resolution.
J Proteome Res. 2018 Mar 2;17(3):1054-1064. doi: 10.1021/acs.jproteome.7b00725. Epub 2018 Feb 15.
9
Label-free molecular imaging of the kidney.
Kidney Int. 2017 Sep;92(3):580-598. doi: 10.1016/j.kint.2017.03.052. Epub 2017 Jul 24.
10
Associations between host gene expression, the mucosal microbiome, and clinical outcome in the pelvic pouch of patients with inflammatory bowel disease.
Genome Biol. 2015 Apr 8;16(1):67. doi: 10.1186/s13059-015-0637-x.