Cheng Wei, Ramachandran Sohini, Crawford Lorin
Department of Computer Science, Brown University, Providence, RI, USA.
Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA.
iScience. 2022 Jun 7;25(7):104553. doi: 10.1016/j.isci.2022.104553. eCollection 2022 Jul 15.
In this paper, we propose a new approach for variable selection using a collection of Bayesian neural networks with a focus on quantifying uncertainty over which variables are selected. Motivated by fine-mapping applications in statistical genetics, we refer to our framework as an "ensemble of single-effect neural networks" (ESNN) which generalizes the "sum of single effects" regression framework by both accounting for nonlinear structure in genotypic data (e.g., dominance effects) and having the capability to model discrete phenotypes (e.g., case-control studies). Through extensive simulations, we demonstrate our method's ability to produce calibrated posterior summaries such as credible sets and posterior inclusion probabilities, particularly for traits with genetic architectures that have significant proportions of non-additive variation driven by correlated variants. Lastly, we use real data to demonstrate that the ESNN framework improves upon the state of the art for identifying true effect variables underlying various complex traits.
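The two posterior summaries named in the abstract, posterior inclusion probabilities (PIPs) and credible sets, can be illustrated with a minimal sketch. It assumes the conventions of the "sum of single effects" regression framework that ESNN generalizes: each single-effect component yields a probability vector over variants, a credible set is the smallest group of variants whose probabilities sum to a target coverage, and a variant's PIP is the probability that at least one component selects it. The function names and toy numbers below are hypothetical and are not taken from the authors' software.

```python
import numpy as np

def credible_set(alpha, coverage=0.95):
    """Smallest set of variant indices whose summed probability reaches `coverage`.

    `alpha` is one single-effect component's probability vector over variants
    (assumed non-negative and summing to 1).
    """
    alpha = np.asarray(alpha, dtype=float)
    # Rank variants from most to least probable (stable sort keeps ties in order).
    order = np.argsort(-alpha, kind="stable")
    cumulative = np.cumsum(alpha[order])
    # First position at which the cumulative probability reaches the target.
    k = int(np.searchsorted(cumulative, coverage, side="left")) + 1
    k = min(k, alpha.size)
    return sorted(order[:k].tolist())

def posterior_inclusion_probabilities(alphas):
    """Combine per-component probabilities (rows) into one PIP per variant.

    Assumes the components are treated as independent, so that
    PIP_j = 1 - prod_l (1 - alpha_lj).
    """
    alphas = np.asarray(alphas, dtype=float)
    return 1.0 - np.prod(1.0 - alphas, axis=0)

# Toy example: 2 single-effect components over 3 variants (made-up numbers).
toy_alphas = [[0.5, 0.5, 0.0],
              [0.0, 0.5, 0.5]]
toy_pips = posterior_inclusion_probabilities(toy_alphas)
toy_set = credible_set([0.05, 0.6, 0.3, 0.05], coverage=0.8)
```

When variants are highly correlated, the probability mass of one component spreads across them, so the credible set grows. That is the sense in which these summaries quantify uncertainty over which variables are selected.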