Du Yinhao, Fan Kun, Lu Xi, Wu Cen
Department of Statistics, Kansas State University, Manhattan, KS 66506, USA.
BioTech (Basel). 2021 Jan 29;10(1):3. doi: 10.3390/biotech10010003.
Gene-environment (G×E) interactions are critical for understanding the genetic basis of complex disease beyond genetic and environmental main effects. In addition to existing tools for interaction studies, penalized variable selection has emerged as a promising alternative for dissecting G×E interactions. Despite its success, variable selection is limited in accounting for multidimensional measurements: published variable selection methods cannot accommodate structured sparsity when integrating multi-omics data for disease outcomes. In this paper, we develop a novel variable selection method to integrate multi-omics measurements in G×E interaction studies. Extensive studies have already shown that analyzing omics data across multiple platforms is not only biologically sensible but also improves identification and prediction performance. Our integrative model can efficiently pinpoint important regulators of gene expression through sparse dimensionality reduction, and link disease outcomes to multiple effects in integrative G×E studies by accommodating a sparse bi-level structure. Simulation studies show that the integrative model identifies G×E interactions and regulators better than alternative methods. In two G×E lung cancer studies with high-dimensional multi-omics data, the integrative model yields improved prediction and findings with important biological implications.