Generative prediction of causal gene sets responsible for complex traits.

Author information

Kuznets-Speck Benjamin, Ogonor Buduka K, Wytock Thomas P, Motter Adilson E

Affiliations

Department of Physics and Astronomy, Northwestern University, Evanston, IL 60208.

Center for Network Dynamics, Northwestern University, Evanston, IL 60208.

Publication information

Proc Natl Acad Sci U S A. 2025 Jun 17;122(24):e2415071122. doi: 10.1073/pnas.2415071122. Epub 2025 Jun 12.

Abstract

The relationship between genotype and phenotype remains an outstanding question for organism-level traits because these traits are generally complex. The challenge arises from complex traits being determined by a combination of multiple genes (or loci), which leads to an explosion of possible genotype-phenotype mappings. The primary techniques to resolve these mappings are genome/transcriptome-wide association studies, which are limited by their lack of causal inference and statistical power. Here, we develop an approach that combines transcriptional data endowed with causal information and a generative machine learning model designed to strengthen statistical power. Our implementation of the approach, dubbed transcriptome-wide conditional variational autoencoder (TWAVE), includes a variational autoencoder trained on human transcriptional data, which is incorporated into an optimization framework. Given a trait phenotype, TWAVE generates expression profiles, which we dimensionally reduce by identifying independently varying generalized pathways (eigengenes). We then conduct constrained optimization to find causal gene sets, that is, the gene perturbations whose measured transcriptomic responses best explain trait phenotype differences. By considering several complex traits, we show that the approach identifies causal genes that cannot be detected by the primary existing techniques. Moreover, the approach identifies complex diseases caused by distinct sets of genes, meaning that the disease is polygenic and exhibits distinct subtypes driven by different genotype-phenotype mappings. We suggest that the approach will enable the design of tailored experiments to identify multigenic targets to address complex diseases.
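The last two steps of the pipeline described above (reduction to eigengenes, then a constrained search for the perturbations that best explain a phenotype difference) can be illustrated with a minimal sketch. This is not the TWAVE implementation: the function names `eigengenes` and `select_perturbations` are hypothetical, a plain singular value decomposition stands in for the eigengene identification, a greedy least-squares search with a cap on set size stands in for the constrained optimization, and the data are synthetic rather than generated by a variational autoencoder.

```python
import numpy as np


def eigengenes(expr, n_components):
    """Top right-singular vectors of the column-centered expression matrix.

    expr has shape (n_samples, n_genes); the result has shape
    (n_components, n_genes) with orthonormal rows.
    """
    centered = expr - expr.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:n_components]


def select_perturbations(target, responses, max_size):
    """Greedily choose up to max_size columns of `responses` whose
    least-squares combination best approximates `target`.

    Assumption: a greedy search is only a simple stand-in for a
    constrained optimization over gene sets.
    """
    chosen = []
    best_err = np.linalg.norm(target)
    for _ in range(max_size):
        best_j = None
        for j in range(responses.shape[1]):
            if j in chosen:
                continue
            cols = responses[:, chosen + [j]]
            coef, *_ = np.linalg.lstsq(cols, target, rcond=None)
            err = np.linalg.norm(cols @ coef - target)
            if err < best_err - 1e-12:  # accept only a real improvement
                best_err, best_j = err, j
        if best_j is None:  # no remaining column reduces the residual
            break
        chosen.append(best_j)
    return chosen, best_err


# Synthetic demo (all values are made up for illustration).
rng = np.random.default_rng(0)
n_genes, n_perts = 50, 8
gene_responses = rng.normal(size=(n_genes, n_perts))  # response to each perturbation
shift = gene_responses[:, [2, 5]] @ np.array([1.0, -0.7])  # planted trait signal
baseline = rng.normal(size=(200, n_genes))
trait = rng.normal(size=(200, n_genes)) + shift

basis = eigengenes(np.vstack([baseline, trait]), n_components=10)
target = basis @ (trait.mean(axis=0) - baseline.mean(axis=0))
chosen, err = select_perturbations(target, basis @ gene_responses, max_size=2)
print("selected perturbations:", chosen, "residual norm:", err)
```

Both the phenotype difference and the perturbation responses are projected onto the same eigengene basis before the search, so the comparison takes place in the reduced space. Because the projection discards information and the search is greedy, the sketch is not guaranteed to recover the planted pair of perturbations.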

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/83bc/12184495/f10fbf253f02/pnas.2415071122fig01.jpg
