Guo Hongping, Yu Zuguo, An Jiyuan, Han Guosheng, Ma Yuanlin, Tang Runbin
Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education and Hunan Key Laboratory for Computation and Simulation in Science and Engineering, Xiangtan University, Xiangtan 411105, China.
School of Mathematics and Computer Science, Hanjiang Normal University, Shiyan 442000, China.
Entropy (Basel). 2020 Mar 13;22(3):329. doi: 10.3390/e22030329.
Genome-wide association studies (GWAS) have become an essential tool for exploring the genetic mechanisms of complex traits. To reduce computational complexity, it is common practice to remove unrelated single nucleotide polymorphisms (SNPs) before the association analysis, e.g., with the iterative sure independence screening expectation-maximization Bayesian Lasso (ISIS EM-BLASSO) method. In this work, we propose a modified version of ISIS EM-BLASSO that first reduces the number of SNPs with a screening procedure based on Pearson correlation and mutual information, then estimates the SNP effects via EM-Bayesian Lasso (EM-BLASSO), and finally detects the true quantitative trait nucleotides (QTNs) with a likelihood ratio test. We call the method two-stage mutual information based Bayesian Lasso (MBLASSO). Under three simulation scenarios, MBLASSO improves statistical power and retains higher effect estimation accuracy compared with three other algorithms. Moreover, MBLASSO gives the best model fit and the highest accuracy of detected associations, and 21 genes are detected only by MBLASSO in the datasets analyzed.
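The screening stage described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the quantile binning of the phenotype, the plug-in mutual information estimator, and the rule of keeping the union of the top-ranked SNPs under each criterion are all assumptions made for illustration. The EM-BLASSO estimation and the likelihood ratio test are not shown.

```python
import numpy as np

def pearson_scores(genotypes, phenotype):
    """Absolute Pearson correlation between each SNP column and the phenotype."""
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    # Monomorphic SNPs have zero variance; score them as 0 instead of dividing by 0.
    safe = np.where(denom > 0, denom, 1.0)
    return np.where(denom > 0, np.abs(g.T @ y) / safe, 0.0)

def mutual_information(x, y_binned):
    """Plug-in mutual information (in nats) between two discrete 1-D vectors."""
    _, xi = np.unique(x, return_inverse=True)
    _, yi = np.unique(y_binned, return_inverse=True)
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)          # contingency table of co-occurrences
    joint /= joint.sum()                      # joint probabilities
    px = joint.sum(axis=1, keepdims=True)     # marginal of x
    py = joint.sum(axis=0, keepdims=True)     # marginal of y
    nz = joint > 0                            # skip empty cells (0 * log 0 = 0)
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def screen_snps(genotypes, phenotype, keep=50, n_bins=5):
    """Hypothetical screening rule: union of the top `keep` SNPs by |r| and by MI."""
    r = pearson_scores(genotypes, phenotype)
    # Assumption: discretize the quantitative trait into quantile bins for MI.
    edges = np.quantile(phenotype, np.linspace(0, 1, n_bins + 1)[1:-1])
    y_binned = np.digitize(phenotype, edges)
    mi = np.array([mutual_information(genotypes[:, j], y_binned)
                   for j in range(genotypes.shape[1])])
    top_r = np.argsort(r)[::-1][:keep]
    top_mi = np.argsort(mi)[::-1][:keep]
    return np.union1d(top_r, top_mi)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.integers(0, 3, size=(300, 200))           # simulated 0/1/2 genotype codes
    y = 2.0 * G[:, 7] + rng.normal(size=300)          # one simulated causal SNP
    print(screen_snps(G, y, keep=10))
```

Ranking by both criteria is meant to retain SNPs with a linear association (Pearson correlation) as well as those whose dependence on the trait is not linear (mutual information); how the two scores are actually combined in MBLASSO is not stated in this abstract.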