GRIA: Graphical Regularization for Integrative Analysis.

Authors

Chang Changgee, Oh Jihwan, Long Qi

Affiliation

Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania.

Publication

Proc SIAM Int Conf Data Min. 2020;2020:604-612. doi: 10.1137/1.9781611976236.68.

Abstract

Integrative analysis jointly analyzes multiple data sets to overcome the curse of dimensionality. It can detect important but weak signals by jointly selecting features for all data sets, but unfortunately the sets of important features are not always the same across data sets. Variants that allow a heterogeneous sparsity structure, in which a subset of data sets can have a zero coefficient for a selected feature, have been proposed, but they compromise the benefit of integrative analysis and reintroduce the problem of losing weak but important signals. We propose a new integrative analysis approach that not only aggregates weak but important signals well in the homogeneous setting but also substantially alleviates the problem of losing such signals in the heterogeneous setting. Our approach exploits an a priori known graphical structure of the features by forcing joint selection of adjacent features. Integrating this information over multiple data sets can increase power while accounting for heterogeneity across data sets. We confirm the problem with existing approaches and demonstrate the superiority of our method through a simulation study and an application to gene expression data from ADNI.
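
The idea of forcing joint selection of adjacent features across data sets can be pictured with an edge-wise group penalty. The sketch below is only an illustration under stated assumptions, not the penalty defined in the paper: the function name `edge_group_penalty`, the tuning parameter `lam`, and the toy coefficients and graph are all hypothetical. Each edge of the feature graph forms one group that pools the coefficients of its two endpoint features over every data set, so a group with any nonzero member is penalized as a whole.

```python
import math


def edge_group_penalty(beta, edges, lam=1.0):
    """Illustrative edge-wise group penalty (an assumption, not the paper's formula).

    beta  : list of coefficient vectors, beta[m][j] is the coefficient of
            feature j in data set m.
    edges : list of (j, k) pairs from an a priori known feature graph.
    lam   : hypothetical tuning parameter scaling the penalty.
    """
    total = 0.0
    for j, k in edges:
        # Pool both endpoint features over all data sets into one group.
        squared_norm = sum(b[j] ** 2 + b[k] ** 2 for b in beta)
        total += math.sqrt(squared_norm)
    return lam * total


# Toy example: two data sets, four features, two edges in the graph.
beta = [
    [0.5, 0.4, 0.0, 0.0],  # data set 1
    [0.3, 0.0, 0.0, 0.0],  # data set 2: feature 1 is zero here (heterogeneity)
]
edges = [(0, 1), (2, 3)]
penalty = edge_group_penalty(beta, edges)
```

In this toy setup the edge (2, 3) contributes nothing because both features are zero in every data set, while the edge (0, 1) is charged once for the whole group even though feature 1 has a zero coefficient in the second data set. A group norm of this kind tends to keep or drop connected features together, which is the general behavior the abstract describes.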
