A Fast Multi-Locus Ridge Regression Algorithm for High-Dimensional Genome-Wide Association Studies.

Author information

Zhang Jin, Chen Min, Wen Yangjun, Zhang Yin, Lu Yunan, Wang Shengmeng, Chen Juncong

Affiliations

College of Science, Nanjing Agricultural University, Nanjing, China.

Postdoctoral Research Station of Crop Science, Nanjing Agricultural University, Nanjing, China.

Publication information

Front Genet. 2021 Mar 29;12:649196. doi: 10.3389/fgene.2021.649196. eCollection 2021.

Abstract

The mixed linear model (MLM) has been widely used in genome-wide association studies (GWAS) to dissect quantitative traits in human, animal, and plant genetics. Most methodologies treat all single nucleotide polymorphism (SNP) effects as random effects under the MLM framework and therefore fail to detect the joint minor effects of multiple genetic markers on a trait. As a result, polygenes with minor effects remain largely unexplored in today's big data era. In this study, we developed a new algorithm under the MLM framework, called the fast multi-locus ridge regression (FastRR) algorithm. The FastRR algorithm first whitens the covariance matrix of the polygenic matrix K and environmental noise, then selects from the large-scale marker set the potentially related SNPs that are highly correlated with the target trait, and finally analyzes this subset of variables using a multi-locus deshrinking ridge regression to detect true quantitative trait nucleotides (QTNs). Analyses of both simulated and real data show that the FastRR algorithm is more powerful in detecting both large and small QTNs, more accurate in QTN effect estimation, and more stable under various polygenic backgrounds. Moreover, the FastRR algorithm computes faster than existing methods. In conclusion, the FastRR algorithm provides an alternative for multi-locus GWAS in high-dimensional genomic datasets.
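
The three stages named in the abstract (whitening, correlation screening, ridge regression on the retained subset) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes the phenotypic covariance is proportional to h2*K + (1 - h2)*I with a known h2 (in practice the variance components would be estimated), it uses plain ridge regression in place of the deshrinking ridge regression, and all function names, parameter names, and default values are hypothetical.

```python
import numpy as np

def whiten(y, X, K, h2):
    """Rotate y and X so the residual covariance becomes the identity.

    Assumption: Var(y) is proportional to V = h2*K + (1-h2)*I, with h2 known.
    """
    n = len(y)
    V = h2 * K + (1.0 - h2) * np.eye(n)
    w, U = np.linalg.eigh(V)          # V = U diag(w) U^T
    W = (U / np.sqrt(w)).T            # W = diag(w^-1/2) U^T, so W V W^T = I
    return W @ y, W @ X

def screen_markers(y, X, k):
    """Return indices of the k markers with the largest |Pearson r| with y."""
    yc = y - y.mean()
    Xc = X - X.mean(axis=0)
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    # Constant columns get r = 0 instead of a division by zero.
    r = np.abs(Xc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.argsort(-r)[:k]

def ridge_fit(y, X, lam):
    """Plain ridge estimate (X^T X + lam*I)^-1 X^T y (no deshrinking)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def fastrr_sketch(y, X, K, h2=0.5, k=10, lam=1.0):
    """Whiten, screen, then fit ridge on the retained markers."""
    y_w, X_w = whiten(y, X, K, h2)
    keep = screen_markers(y_w, X_w, k)
    beta = ridge_fit(y_w, X_w[:, keep], lam)
    return keep, beta
```

Calling `fastrr_sketch(y, X, K)` returns the indices of the retained markers and their ridge effect estimates. Screening before the multi-locus fit keeps the linear system at size k rather than the full marker count, which is the source of the speed-up in this sketch.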

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a7f6/8041068/134d7dc610f7/fgene-12-649196-g001.jpg
