Decoding Genetic Markers of Multiple Phenotypic Layers Through Biologically Constrained Genome-To-Phenome Bayesian Sparse Regression.

Authors

Deprez Marie, Moreira Julien, Sermesant Maxime, Lorenzi Marco

Affiliations

University of Côte d'Azur, Nice, France.

INRIA, Epione Project-Team, Valbonne, France.

Publication information

Front Mol Med. 2022 Mar 30;2:830956. doi: 10.3389/fmmed.2022.830956. eCollection 2022.

DOI:10.3389/fmmed.2022.830956

PMID:39086978

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11285669/

Abstract

The applicability of multivariate approaches for the joint analysis of genomics and phenomics information is currently limited by a lack of scalability and by the difficulty of interpreting the related findings from a biological perspective. To tackle these limitations, we present Bayesian Genome-to-Phenome Sparse Regression (G2PSR), a novel multivariate regression method based on sparse SNP-gene constraints. The statistical framework of G2PSR is based on a Bayesian neural network, where constraints on SNP-gene associations are integrated by incorporating knowledge linking variants to their respective genes, and the phenotypic data are then reconstructed in the output layer. Interpretability is promoted by inducing sparsity on the genes through variational dropout, which allows the uncertainty associated with each gene, and its related SNPs, to be estimated in the reconstruction task. Ultimately, G2PSR is designed to avoid multiple testing correction and to assess the combined effect of SNPs, thus increasing the statistical power for detecting genome-to-phenome associations. The effectiveness of G2PSR was demonstrated on synthetic and real data, in comparison with state-of-the-art methods based on group-wise sparsity constraints. The application to real data consisted of an imaging-genetics analysis of the Alzheimer's Disease Neuroimaging Initiative data, relating SNPs from more than 3,500 genes to clinical and multivariate brain volumetric information. The experimental results show that our method can provide accurate selection of relevant genes in datasets with a large SNP-to-sample ratio, thus overcoming the main limitations of current genome-to-phenome association methods.
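The architecture described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the single weight per SNP, and the linear output layer are assumptions made for illustration. It shows the two ideas the abstract names: a first layer in which each SNP connects only to its own gene, and gene-level multiplicative Gaussian noise of the kind used in variational dropout, where a noise variance `alpha` corresponds to a dropout rate `alpha / (1 + alpha)`.

```python
import numpy as np


def g2p_forward(X, snp_to_gene, w_snp, W_out, log_alpha, rng):
    """One stochastic forward pass of a gene-constrained regression (illustrative).

    X           : (n_samples, n_snps) genotype matrix
    snp_to_gene : (n_snps,) integer index of the gene each SNP belongs to
    w_snp       : (n_snps,) weight of each SNP within its gene (assumed form)
    W_out       : (n_genes, n_phenotypes) gene-to-phenotype weights
    log_alpha   : (n_genes,) log noise variance per gene
    """
    n_snps = X.shape[1]
    n_genes = W_out.shape[0]

    # Binary SNP-to-gene mask: a SNP may only feed the gene it maps to.
    mask = np.zeros((n_snps, n_genes))
    mask[np.arange(n_snps), snp_to_gene] = 1.0

    # Gene-level representation: constrained linear projection of the SNPs.
    gene_act = X @ (mask * w_snp[:, None])

    # Gene-level multiplicative Gaussian noise with variance alpha.
    alpha = np.exp(log_alpha)
    noise = 1.0 + np.sqrt(alpha) * rng.standard_normal(gene_act.shape)

    # Reconstruct the phenotypes from the noisy gene representation.
    return (gene_act * noise) @ W_out


def gene_dropout_probability(log_alpha):
    """Dropout rate p = alpha / (1 + alpha) implied by each gene's noise variance."""
    alpha = np.exp(log_alpha)
    return alpha / (1.0 + alpha)
```

In a sketch like this, genes whose learned dropout probability approaches 1 contribute mostly noise and would be treated as irrelevant, while genes with a low probability would be retained. Training `log_alpha` requires a variational objective that is not shown here.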

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f750/11285669/ac2fafabcd7c/fmmed-02-830956-g001.jpg
