A whitening approach to probabilistic canonical correlation analysis for omics data integration.

Affiliations

Epidemiology and Biostatistics, School of Public Health, Imperial College London, Norfolk Place, London, W2 1PG, UK.

Statistics Section, Department of Mathematics, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK.

Publication information

BMC Bioinformatics. 2019 Jan 9;20(1):15. doi: 10.1186/s12859-018-2572-9.

DOI: 10.1186/s12859-018-2572-9
PMID: 30626338
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6327589/
Abstract

BACKGROUND

Canonical correlation analysis (CCA) is a classic statistical tool for investigating complex multivariate data. Correspondingly, it has found many diverse applications, ranging from molecular biology and medicine to social science and finance. Intriguingly, despite the importance and pervasiveness of CCA, a probabilistic understanding of it has only recently begun to develop, moving from an algorithmic to a model-based perspective and enabling its application to large-scale settings.

RESULTS

Here, we revisit CCA from the perspective of statistical whitening of random variables and propose a simple yet flexible probabilistic model for CCA in the form of a two-layer latent variable generative model. The advantages of this variant of probabilistic CCA include non-ambiguity of the latent variables, provisions for negative canonical correlations, the possibility of non-normal generative variables, and ease of interpretation on all levels of the model. In addition, we show that it lends itself to computationally efficient estimation in high-dimensional settings using regularized inference. We test our approach to CCA in simulations and apply it to two omics data sets, illustrating the integration of gene expression data, lipid concentrations and methylation levels.
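As a rough illustration of what "CCA from the perspective of whitening" can look like in practice, the following NumPy sketch whitens two data blocks and reads the canonical correlations off a singular value decomposition. It is not the authors' implementation (that is the "whitening" R package): the function names `inv_sqrt` and `whitening_cca`, the simulated data, and the use of plain empirical covariances are all assumptions made here. The sketch does no regularization, so it only applies when the sample size exceeds the number of variables, and because it relies on the SVD convention it returns non-negative canonical correlations, without the sign handling described in the abstract.

```python
import numpy as np

def inv_sqrt(S):
    """Symmetric inverse square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T            # V diag(w^-1/2) V^T

def whitening_cca(X, Y):
    """Hypothetical helper: classical CCA computed through whitening.

    Returns the canonical correlations, the two whitening matrices,
    and the whitened (canonical) variables for X and Y.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)                   # center both blocks
    Yc = Y - Y.mean(axis=0)
    Sx = Xc.T @ Xc / (n - 1)                  # empirical covariances
    Sy = Yc.T @ Yc / (n - 1)
    Sxy = Xc.T @ Yc / (n - 1)
    Sx_is, Sy_is = inv_sqrt(Sx), inv_sqrt(Sy)
    K = Sx_is @ Sxy @ Sy_is                   # cross-covariance after sphering
    U, lam, Vt = np.linalg.svd(K, full_matrices=False)
    Wx = U.T @ Sx_is                          # whitening matrix for X
    Wy = Vt @ Sy_is                           # whitening matrix for Y
    return lam, Wx, Wy, Xc @ Wx.T, Yc @ Wy.T

# Simulated example (assumed data): two blocks sharing a 2-dimensional signal.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 2))
X = z @ rng.normal(size=(2, 4)) + rng.normal(size=(n, 4))
Y = z @ rng.normal(size=(2, 3)) + rng.normal(size=(n, 3))

lam, Wx, Wy, Xw, Yw = whitening_cca(X, Y)
cross = Xw.T @ Yw / (n - 1)                   # should be diagonal, equal to lam
print("canonical correlations:", np.round(lam, 3))
```

After the transformation each block has identity covariance, and the cross-covariance between the two whitened blocks is diagonal with the canonical correlations on its diagonal. In high-dimensional omics settings the empirical covariances used above would be singular, which is where the regularized inference mentioned in the abstract comes in.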

CONCLUSIONS

Our whitening approach to CCA provides a unifying perspective on CCA, linking together sphering procedures, multivariate regression and corresponding probabilistic generative models. Furthermore, we offer an efficient computer implementation in the "whitening" R package available at https://CRAN.R-project.org/package=whitening.
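The link between sphering and regression can be made concrete with the textbook covariance-based formulation of CCA. The notation below is assumed for illustration and is not taken from the abstract: $\Sigma_X$, $\Sigma_Y$ and $\Sigma_{XY}$ are the covariance and cross-covariance matrices of $X$ and $Y$, and $U$, $V$ hold the leading $m = \min(p, q)$ singular vectors of the sphered cross-covariance.

```latex
\begin{aligned}
K &= \Sigma_X^{-1/2}\,\Sigma_{XY}\,\Sigma_Y^{-1/2} = U \Lambda V^{\top},
  \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_m), \\
\tilde{X} &= U^{\top}\Sigma_X^{-1/2}\,(X - \mu_X), \qquad
\tilde{Y} = V^{\top}\Sigma_Y^{-1/2}\,(Y - \mu_Y), \\
\operatorname{Cov}(\tilde{X}) &= I_m, \qquad
\operatorname{Cov}(\tilde{Y}) = I_m, \qquad
\operatorname{Cov}(\tilde{X}, \tilde{Y}) = \Lambda, \\
\hat{\tilde{Y}}_i &= \lambda_i\,\tilde{X}_i
  \qquad \text{(best linear predictor of } \tilde{Y}_i \text{ from } \tilde{X}_i\text{)}.
\end{aligned}
```

In these whitened coordinates each pair $(\tilde{X}_i, \tilde{Y}_i)$ has unit variances and covariance $\lambda_i$, so the regression slope between them equals the canonical correlation. This is one way to read the connection between sphering and multivariate regression that the conclusions describe.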
