Guo Yingjie, Wu Chenxi, Yuan Zhian, Wang Yansu, Liang Zhen, Wang Yang, Zhang Yi, Xu Lei
Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China.
School of Electronic and Communication Engineering, Shenzhen Polytechnic, Shenzhen, China.
Front Cell Dev Biol. 2021 Dec 16;9:801113. doi: 10.3389/fcell.2021.801113. eCollection 2021.
Among the many statistical methods for identifying gene-gene interactions in genome-wide association studies of qualitative traits, gene-based methods are both statistically powerful and biologically interpretable. However, their detection power is limited because they make assumptions about the form of the association between traits and single nucleotide polymorphisms. In this article, we therefore propose a gene-based method, GGInt-XGBoost, derived from XGBoost. Assuming that the log odds ratio of the disease trait is additive when a pair of genes does not interact, the difference in error between XGBoost models fitted with and without the additive constraint can indicate a gene-gene interaction. We then use a permutation-based statistical test to assess this difference and to provide a p-value representing the significance of the interaction. Experimental results on both simulated and real data showed that our approach outperformed previous methods in detecting gene-gene interactions.
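The scoring step described above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that the additive and the unconstrained model have each already produced predicted case probabilities, that "error" is measured as binary log loss, and that the null statistics come from some permutation scheme. The abstract specifies neither the error measure nor the permutation scheme. The function names `log_loss`, `interaction_statistic`, and `permutation_p_value` are hypothetical.

```python
import math


def log_loss(y_true, p_pred, eps=1e-12):
    """Mean binary log loss (assumed error measure, not stated in the abstract)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def interaction_statistic(y_true, p_additive, p_full):
    """Error of the additive model minus error of the unconstrained model.

    A large positive value means that dropping the additive constraint
    reduced the error, which is taken as a sign of interaction.
    """
    return log_loss(y_true, p_additive) - log_loss(y_true, p_full)


def permutation_p_value(observed, null_stats):
    """One-sided permutation p-value with the usual +1 correction."""
    exceed = sum(1 for s in null_stats if s >= observed)
    return (1 + exceed) / (1 + len(null_stats))


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    y = [1, 0, 1, 0]
    p_add = [0.6, 0.4, 0.6, 0.4]    # hypothetical additive-model predictions
    p_full = [0.9, 0.1, 0.8, 0.2]   # hypothetical unconstrained predictions
    observed = interaction_statistic(y, p_add, p_full)
    null = [0.1, 0.2, 0.6, 0.3]     # hypothetical statistics from permuted data
    print(observed, permutation_p_value(0.5, null))
```

With the +1 correction the p-value can never be exactly zero, so its smallest attainable value is 1 / (number of permutations + 1).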