Identifying patterns differing between high-dimensional datasets with generalized contrastive PCA.

Authors

de Oliveira Eliezyer Fermino, Garg Pranjal, Hjerling-Leffler Jens, Batista-Brito Renata, Sjulson Lucas

Affiliations

Dominick P. Purpura Department of Neuroscience, Albert Einstein College of Medicine, Bronx, New York, United States of America.

All India Institute of Medical Sciences, Rishikesh, India.

Publication information

PLoS Comput Biol. 2025 Feb 7;21(2):e1012747. doi: 10.1371/journal.pcbi.1012747. eCollection 2025 Feb.

DOI:10.1371/journal.pcbi.1012747
PMID:39919147
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11841894/
Abstract

High-dimensional data have become ubiquitous in the biological sciences, and it is often desirable to compare two datasets collected under different experimental conditions to extract low-dimensional patterns enriched in one condition. However, traditional dimensionality reduction techniques cannot accomplish this because they operate on only one dataset. Contrastive principal component analysis (cPCA) has been proposed to address this problem, but it has seen little adoption because it requires tuning a hyperparameter resulting in multiple solutions, with no way of knowing which is correct. Moreover, cPCA uses foreground and background conditions that are treated differently, making it ill-suited to compare two experimental conditions symmetrically. Here we describe the development of generalized contrastive PCA (gcPCA), a flexible hyperparameter-free approach that solves these problems. We first provide analyses explaining why cPCA requires a hyperparameter and how gcPCA avoids this requirement. We then describe an open-source gcPCA toolbox containing Python and MATLAB implementations of several variants of gcPCA tailored for different scenarios. Finally, we demonstrate the utility of gcPCA in analyzing diverse high-dimensional biological data, revealing unsupervised detection of hippocampal replay in neurophysiological recordings and heterogeneity of type II diabetes in single-cell RNA sequencing data. As a fast, robust, and easy-to-use comparison method, gcPCA provides a valuable resource facilitating the analysis of diverse high-dimensional datasets to gain new insights into complex biological phenomena.
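
The hyperparameter issue described in the abstract can be illustrated with a small NumPy sketch. It is not the gcPCA toolbox and does not use its API. cPCA is taken in its usual form, the leading eigenvectors of C_A - alpha * C_B, where alpha is the contrast hyperparameter. The hyperparameter-free alternative shown here maximizes the normalized ratio x^T (C_A - C_B) x / x^T (C_A + C_B) x; this objective is an assumption made for illustration, since the abstract does not state which objectives the gcPCA variants use. The function names and the synthetic data are hypothetical.

```python
import numpy as np

def covariance(data):
    """Sample covariance of a (samples x features) array."""
    centered = data - data.mean(axis=0)
    return centered.T @ centered / (len(data) - 1)

def cpca(data_a, data_b, alpha):
    """cPCA: eigen-decompose C_A - alpha * C_B; the result depends on alpha."""
    diff = covariance(data_a) - alpha * covariance(data_b)
    values, vectors = np.linalg.eigh(diff)
    order = np.argsort(values)[::-1]          # largest eigenvalue first
    return values[order], vectors[:, order]

def normalized_contrast(data_a, data_b, eps=1e-10):
    """Assumed objective: maximize x'(C_A - C_B)x / x'(C_A + C_B)x, no alpha."""
    cov_a, cov_b = covariance(data_a), covariance(data_b)
    w, v = np.linalg.eigh(cov_a + cov_b)
    keep = w > eps * w.max()                  # drop near-null directions
    whiten = v[:, keep] / np.sqrt(w[keep])    # whiten.T @ (C_A + C_B) @ whiten = I
    reduced = whiten.T @ (cov_a - cov_b) @ whiten
    values, y = np.linalg.eigh(reduced)
    order = np.argsort(values)[::-1]
    loadings = whiten @ y[:, order]           # map back to feature space
    loadings /= np.linalg.norm(loadings, axis=0)
    return values[order], loadings            # values lie in [-1, 1]

# Hypothetical synthetic data: feature 0 is enriched in A, feature 1 in B,
# and feature 2 carries large variance shared by both conditions.
rng = np.random.default_rng(0)
data_a = rng.normal(size=(5000, 5)) * np.array([3.0, 1.0, 10.0, 1.0, 1.0])
data_b = rng.normal(size=(5000, 5)) * np.array([1.0, 3.0, 10.0, 1.0, 1.0])

_, cpca_alpha0 = cpca(data_a, data_b, alpha=0.0)   # reduces to plain PCA on A
_, cpca_alpha1 = cpca(data_a, data_b, alpha=1.0)
ratios, loadings = normalized_contrast(data_a, data_b)

print("cPCA top feature, alpha=0:", np.argmax(np.abs(cpca_alpha0[:, 0])))
print("cPCA top feature, alpha=1:", np.argmax(np.abs(cpca_alpha1[:, 0])))
print("normalized top feature:", np.argmax(np.abs(loadings[:, 0])))
print("normalized bottom feature:", np.argmax(np.abs(loadings[:, -1])))
```

In this sketch the leading cPCA component changes with alpha, whereas the normalized ratio needs no tuning. The ratio is bounded between -1 and 1, and swapping the two datasets only flips its sign, so the top components describe patterns enriched in A and the bottom components describe patterns enriched in B.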

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4140/11841894/4918f972671b/pcbi.1012747.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4140/11841894/d4a8f11d414f/pcbi.1012747.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4140/11841894/d9ab795d8a43/pcbi.1012747.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4140/11841894/fa4bfe4c4a09/pcbi.1012747.g003.jpg
