Large sample size and nonlinear sparse models outline epistatic effects in inflammatory bowel disease.

Author information

Department of Electrical Engineering, Katholieke Universiteit Leuven, Leuven, Belgium.

Publication information

Genome Biol. 2023 Oct 5;24(1):224. doi: 10.1186/s13059-023-03064-y.

Abstract

BACKGROUND

Despite clear evidence of nonlinear interactions in the molecular architecture of polygenic diseases, linear models have so far appeared optimal in genotype-to-phenotype modeling. A key bottleneck for such modeling is that genetic data intrinsically suffers from underdetermination ([Formula: see text]). Millions of variants are present in each individual while the collection of large, homogeneous cohorts is hindered by phenotype incidence, sequencing cost, and batch effects.
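To make the underdetermination bottleneck concrete, the following is a minimal NumPy sketch, assuming the usual meaning of the term: far more variants (features) than individuals (samples). The matrix sizes, the random genotype matrix `X`, and the labels `y` are illustrative assumptions, not data from the study. It shows that a linear model can then fit the training labels exactly in more than one way, so the data alone cannot single out one set of variant weights.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_variants = 20, 200  # illustrative sizes only: many more variants than samples

# Hypothetical genotype matrix: minor-allele counts (0, 1 or 2) per variant.
X = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)
# Hypothetical case-control labels.
y = rng.integers(0, 2, size=n_samples).astype(float)

# Minimum-norm least-squares solution of X w = y.
w_min, *_ = np.linalg.lstsq(X, y, rcond=None)

# The rank of X is at most n_samples, so X has a large null space.
rank = np.linalg.matrix_rank(X)
_, _, vt = np.linalg.svd(X)   # full SVD: vt has shape (n_variants, n_variants)
null_direction = vt[-1]       # a unit vector orthogonal to every row of X

# Adding a null-space direction changes the weights but not the predictions.
w_alt = w_min + 5.0 * null_direction

residual_min = np.linalg.norm(X @ w_min - y)
residual_alt = np.linalg.norm(X @ w_alt - y)
print("rank:", rank)
print("training residuals:", residual_min, residual_alt)
print("weights differ:", not np.allclose(w_min, w_alt))
```

Both weight vectors reproduce the training labels equally well, which is why more samples or stronger constraints on model complexity are needed before a more flexible model can be expected to help.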

RESULTS

We demonstrate that when we provide enough training data and control the complexity of nonlinear models, a neural network outperforms additive approaches in whole exome sequencing-based inflammatory bowel disease case-control prediction. To do so, we propose a biologically meaningful sparsified neural network architecture, providing empirical evidence for positive and negative epistatic effects present in the inflammatory bowel disease pathogenesis.
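One common way to control the complexity of a neural network with biological prior knowledge is to mask its first layer so that each variant connects only to the hidden unit of the gene it belongs to. The sketch below illustrates that general idea in NumPy. It is an assumption for illustration, not the architecture used in the paper: the variant-to-gene mapping `variant_gene`, the layer sizes, the `tanh` nonlinearity, and the random weights are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_variants, n_genes = 8, 6, 3  # toy sizes, chosen for readability

# Hypothetical annotation: variant i belongs to gene variant_gene[i].
variant_gene = np.array([0, 0, 1, 1, 2, 2])

# Binary mask that keeps only variant-to-own-gene connections.
mask = np.zeros((n_variants, n_genes))
mask[np.arange(n_variants), variant_gene] = 1.0

# Hypothetical genotypes (minor-allele counts) and random, untrained weights.
X = rng.integers(0, 3, size=(n_samples, n_variants)).astype(float)
W1 = rng.normal(size=(n_variants, n_genes))
w2 = rng.normal(size=n_genes)


def forward(X, W1, w2, mask):
    """Masked forward pass: variants -> gene units -> disease probability."""
    # The nonlinearity lets variants within the same gene interact
    # non-additively; masked weights contribute nothing.
    gene_hidden = np.tanh(X @ (W1 * mask))
    logits = gene_hidden @ w2
    return 1.0 / (1.0 + np.exp(-logits))


probs = forward(X, W1, w2, mask)
dense_params = W1.size          # weights in an unmasked first layer
sparse_params = int(mask.sum())  # weights that remain after masking
print("first-layer weights, dense vs sparse:", dense_params, sparse_params)
print("predicted probabilities:", np.round(probs, 3))
```

In this toy setting the mask cuts the first layer from 18 weights to 6. With real exome data, where variants vastly outnumber genes, the same construction removes most first-layer parameters while keeping a nonlinearity inside each gene unit.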

CONCLUSIONS

In this paper, we show that underdetermination is likely a major driver for the apparent optimality of additive modeling in clinical genetics today.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b7a1/10552306/c76d318a5b90/13059_2023_3064_Fig1_HTML.jpg
