Decoding Genetic Markers of Multiple Phenotypic Layers Through Biologically Constrained Genome-To-Phenome Bayesian Sparse Regression.

Author information

Deprez Marie, Moreira Julien, Sermesant Maxime, Lorenzi Marco

Affiliations

University of Côte d'Azur, Nice, France.

INRIA, Epione Project-Team, Valbonne, France.

Publication information

Front Mol Med. 2022 Mar 30;2:830956. doi: 10.3389/fmmed.2022.830956. eCollection 2022.

Abstract

The applicability of multivariate approaches to the joint analysis of genomic and phenomic information is currently limited by a lack of scalability and by the difficulty of interpreting the resulting findings from a biological perspective. To tackle these limitations, we present Bayesian Genome-to-Phenome Sparse Regression (G2PSR), a novel multivariate regression method based on sparse SNP-gene constraints. The statistical framework of G2PSR is based on a Bayesian neural network, where constraints on SNP-gene associations are integrated by incorporating knowledge linking variants to their respective genes; the phenotypic data are then reconstructed in the output layer. Interpretability is promoted by inducing sparsity on the genes through variational dropout, which makes it possible to estimate the uncertainty associated with each gene, and its related SNPs, in the reconstruction task. Ultimately, G2PSR is designed to avoid the need for multiple testing correction and to assess the combined effect of SNPs, thus increasing the statistical power to detect genome-to-phenome associations. The effectiveness of G2PSR was demonstrated on synthetic and real data, in comparison with state-of-the-art methods based on group-wise sparsity constraints. The application to real data consisted of an imaging-genetics analysis of the Alzheimer's Disease Neuroimaging Initiative data, relating SNPs from more than 3,500 genes to clinical and multivariate brain volumetric information. The experimental results show that our method can accurately select relevant genes in datasets with a large SNP-to-sample ratio, thus overcoming the main limitations of current genome-to-phenome association methods.
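
The architecture described above (SNPs grouped into genes, a per-gene dropout parameter, then a phenotype output layer) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy weights, the use of multiplicative Gaussian noise with variance alpha, the conversion p = alpha / (1 + alpha), and the 0.5 selection threshold are all assumptions borrowed from the general variational dropout literature, and no training loop is shown.

```python
import math
import random

def gene_scores(snps, snp_to_gene, snp_weights, n_genes):
    """Aggregate one subject's SNP values into per-gene activations.

    snp_to_gene[i] is the index of the gene that SNP i is annotated to,
    so each gene only receives input from its own SNPs (the sparse constraint).
    """
    scores = [0.0] * n_genes
    for value, gene, weight in zip(snps, snp_to_gene, snp_weights):
        scores[gene] += weight * value
    return scores

def predict_phenotypes(snps, snp_to_gene, snp_weights, log_alpha, out_weights, rng):
    """One stochastic forward pass: SNPs -> genes -> phenotypes."""
    genes = gene_scores(snps, snp_to_gene, snp_weights, len(log_alpha))
    # Assumed noise model: multiplicative Gaussian noise with variance alpha per gene.
    noisy = [g * (1.0 + math.sqrt(math.exp(la)) * rng.gauss(0.0, 1.0))
             for g, la in zip(genes, log_alpha)]
    # out_weights holds one row of gene weights per phenotype.
    return [sum(w * g for w, g in zip(row, noisy)) for row in out_weights]

def drop_probability(log_alpha):
    """Convert a gene's log(alpha) into a dropout probability p = alpha / (1 + alpha)."""
    alpha = math.exp(log_alpha)
    return alpha / (1.0 + alpha)

# Toy example: 5 SNPs annotated to 3 genes, 2 phenotypes (all values hypothetical).
rng = random.Random(0)
snp_to_gene = [0, 0, 1, 2, 2]
snps = [0, 1, 2, 1, 0]                      # minor-allele counts for one subject
snp_weights = [0.5, -0.2, 0.8, 0.1, 0.3]
log_alpha = [-20.0, -20.0, 3.0]             # gene 2 is dominated by noise
out_weights = [[1.0, 0.5, 0.0], [0.0, 1.0, 1.0]]

phenotypes = predict_phenotypes(snps, snp_to_gene, snp_weights,
                                log_alpha, out_weights, rng)
# Hypothetical selection rule: keep genes whose dropout probability is below 0.5.
kept = [g for g, la in enumerate(log_alpha) if drop_probability(la) < 0.5]
```

In this sketch a gene with a large alpha contributes mostly noise to the reconstruction, so its dropout probability approaches 1 and it is discarded together with all of its SNPs, which is one way to read the gene-level selection the abstract describes.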


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f750/11285669/ac2fafabcd7c/fmmed-02-830956-g001.jpg
