Department of Computer Science, University of California, Los Angeles, Los Angeles, California 90095, USA.
Genome Res. 2024 Oct 11;34(9):1294-1303. doi: 10.1101/gr.279140.124.
Our knowledge of the contribution of genetic interactions (epistasis) to variation in human complex traits remains limited, partly due to the lack of efficient, powerful, and interpretable algorithms to detect interactions. Recently proposed approaches for set-based association tests show promise in improving the power to detect epistasis by examining the aggregated effects of multiple variants. Nevertheless, these methods either do not scale to large Biobank data sets or lack interpretability. We propose QuadKAST, a scalable algorithm that tests for pairwise interaction effects (quadratic effects) on a trait within small to medium-sized sets of genetic variants (window size ≤100) and provides a quantified interpretation of these effects. Comprehensive simulations show that QuadKAST is well-calibrated. Additionally, QuadKAST is highly sensitive in detecting loci with epistatic signals and accurate in its estimation of quadratic effects. We applied QuadKAST to 52 quantitative phenotypes measured in ≈300,000 unrelated white British individuals in the UK Biobank to test for quadratic effects within each of 9515 protein-coding genes. We detect 32 trait-gene pairs across 17 traits and 29 genes that demonstrate statistically significant signals of quadratic effects (accounting for the number of genes and traits tested). Across these trait-gene pairs, the proportion of trait variance explained by quadratic effects is comparable to that explained by additive effects, with five pairs having a quadratic-to-additive ratio >1. Our method enables the detailed investigation of epistasis on a large scale, offering new insights into its role and importance.
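To make the notion of pairwise (quadratic) effects within a variant set concrete, the following is a minimal sketch and not the QuadKAST test itself. It builds all products of distinct variant pairs in a small window, removes the additive component of a simulated trait by ordinary least squares, and reports the share of the remaining variance captured by the pairwise terms. The function names, the simulated genotypes, the effect sizes, and the use of plain least squares are all assumptions made for illustration; the abstract does not specify the actual test statistic or estimator.

```python
import itertools
import numpy as np


def pairwise_interaction_features(genotypes):
    """Products of every distinct pair of columns (assumes at least 2 variants)."""
    n_variants = genotypes.shape[1]
    pairs = list(itertools.combinations(range(n_variants), 2))
    feats = np.column_stack([genotypes[:, i] * genotypes[:, j] for i, j in pairs])
    return feats, pairs


def residualize(y, covariates):
    """Residual of y after least-squares regression on covariates plus intercept."""
    X = np.column_stack([np.ones(len(y)), covariates])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


def variance_explained(features, y):
    """R^2 of a least-squares fit of y on features plus intercept."""
    resid = residualize(y, features)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


# Hypothetical toy data: 2000 individuals, a window of 5 variants.
rng = np.random.default_rng(0)
G = rng.binomial(2, 0.3, size=(2000, 5)).astype(float)
G = (G - G.mean(axis=0)) / G.std(axis=0)  # standardize each variant

# Simulated trait: one additive effect plus one pairwise interaction (assumed sizes).
y = 0.5 * G[:, 0] + 0.5 * G[:, 0] * G[:, 1] + rng.normal(size=2000)

Q, pairs = pairwise_interaction_features(G)
y_no_additive = residualize(y, G)          # strip the additive part first
r2_quadratic = variance_explained(Q, y_no_additive)
print(f"{Q.shape[1]} pairwise terms, quadratic R^2 after additive adjustment: {r2_quadratic:.3f}")
```

A window of m variants yields m(m-1)/2 pairwise terms, so the ≤100-variant limit mentioned in the abstract corresponds to at most 4950 such terms per set. This naive regression does not scale to Biobank-sized data and gives no calibrated p-value.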