IEEE/ACM Trans Comput Biol Bioinform. 2022 Mar-Apr;19(2):912-926. doi: 10.1109/TCBB.2020.3030312. Epub 2022 Apr 1.
Finding epistatic interactions among loci in the expression of a phenotype is a widely used strategy for understanding the genetic architecture of complex traits in genome-wide association studies (GWAS). The abundance of methods dedicated to this purpose, however, makes it increasingly difficult for scientists to decide which method is most suitable for their studies. This work compares the epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with special emphasis on high-order interactions. Results show that, in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Among non-exhaustive methods, none could consistently find epistatic interactions when marginal effects are absent. When marginal effects are present, some methods perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI and SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that no single epistasis detection method can be recommended in all scenarios. Researchers should prioritize exhaustive methods when sufficient computational resources are available for the data set size, and resort to non-exhaustive methods when the analysis time would otherwise be prohibitive.
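To illustrate why the cost of exhaustive methods can become prohibitive, the following minimal sketch counts how many SNP combinations an exhaustive search would have to examine at a given interaction order. It assumes only that an exhaustive method tests every unordered combination of loci once; the function name `exhaustive_tests` and the panel size of 1,000 SNPs are hypothetical and are not taken from the study.

```python
from math import comb


def exhaustive_tests(n_snps: int, order: int) -> int:
    """Number of distinct SNP combinations of the given interaction order.

    Assumption: an exhaustive search evaluates each unordered combination
    of `order` loci exactly once, i.e. C(n_snps, order) tests.
    """
    return comb(n_snps, order)


# Hypothetical panel of 1,000 SNPs (illustrative size, not from the study).
for order in (2, 3, 4):
    print(f"order {order}: {exhaustive_tests(1000, order):,} combinations")
# order 2: 499,500 combinations
# order 3: 166,167,000 combinations
# order 4: 41,417,124,750 combinations
```

The count grows roughly as n^k / k! with the number of SNPs n and interaction order k, which is consistent with the observation that exhaustive high-order searches are practical only when computational resources match the data set size.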