Zhang Jin, Chen Min, Wen Yangjun, Zhang Yin, Lu Yunan, Wang Shengmeng, Chen Juncong
College of Science, Nanjing Agricultural University, Nanjing, China.
Postdoctoral Research Station of Crop Science, Nanjing Agricultural University, Nanjing, China.
Front Genet. 2021 Mar 29;12:649196. doi: 10.3389/fgene.2021.649196. eCollection 2021.
The mixed linear model (MLM) has been widely used in genome-wide association studies (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies treat all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework, an approach that fails to detect the joint minor effect of multiple genetic markers on a trait. As a result, polygenes with minor effects remain largely unexplored in today's big data era. In this study, we developed a new algorithm under the MLM framework, called the fast multi-locus ridge regression (FastRR) algorithm. The FastRR algorithm first whitens the covariance matrix of the polygenic matrix K and environmental noise, then selects, from among the large-scale markers, potentially related SNPs that are highly correlated with the target trait, and finally analyzes this subset of variables using a multi-locus deshrinking ridge regression to detect true quantitative trait nucleotides (QTNs). Analyses of both simulated and real data show that the FastRR algorithm is more powerful for detecting both large and small QTNs, more accurate in estimating QTN effects, and more stable under various polygenic backgrounds. Moreover, compared with existing methods, the FastRR algorithm has the advantage of high computing speed. In conclusion, the FastRR algorithm provides an alternative for multi-locus GWAS in high-dimensional genomic datasets.
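The three stages named in the abstract (whitening, correlation screening, ridge regression on the retained subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `whiten`, `screen`, and `ridge`, the variance ratio `lam`, the subset size `k`, and the penalty `alpha` are all assumptions made for illustration. The sketch treats `lam` as known, uses plain ridge regression, and does not reproduce the deshrinking step or any significance testing, since the abstract does not specify them.

```python
import numpy as np


def whiten(y, X, K, lam):
    """Rotate and scale y and X so the residual covariance becomes identity.

    Assumes cov(y) is proportional to lam * K + I, where lam is the
    (hypothetical, here user-supplied) polygenic-to-residual variance ratio.
    """
    # Eigendecompose the symmetric matrix K = U diag(d) U^T.
    d, U = np.linalg.eigh(K)
    d = np.clip(d, 0.0, None)  # guard against tiny negative eigenvalues
    # Scaling the rotated data by (lam * d + 1)^(-1/2) whitens lam * K + I.
    w = 1.0 / np.sqrt(lam * d + 1.0)
    return w * (U.T @ y), w[:, None] * (U.T @ X)


def screen(y, X, k):
    """Return sorted column indices of the k markers most correlated with y."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    denom[denom == 0] = np.inf  # constant columns get correlation 0
    r = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(r)[-k:])


def ridge(y, X, alpha):
    """Plain ridge estimate (X^T X + alpha I)^(-1) X^T y, without deshrinking."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(p), X.T @ y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 200, 500
    X = rng.integers(0, 3, size=(n, p)).astype(float)  # simulated 0/1/2 genotypes
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    K = Z @ Z.T / p                                    # marker-based K matrix
    u = Z @ rng.normal(size=p) / np.sqrt(p)            # polygenic background, cov = K
    y = 1.0 * X[:, 10] - 0.8 * X[:, 250] + u + rng.normal(size=n)

    yw, Xw = whiten(y, X, K, lam=1.0)       # stage 1: whitening
    keep = screen(yw, Xw, k=20)             # stage 2: correlation screening
    beta = ridge(yw, Xw[:, keep], alpha=1.0)  # stage 3: multi-locus ridge
    print(keep)
    print(np.round(beta, 2))
```

Screening before the multi-locus fit keeps the ridge system small (k by k rather than p by p), which is one plausible reading of where the speed advantage claimed in the abstract comes from. Centering the whitened data in `screen` is a simplification, because whitening also transforms the intercept column.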