Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies.

Authors

Tamba Cox Lwaka, Ni Yuan-Li, Zhang Yuan-Ming

Affiliations

State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China.

Department of Mathematics, Egerton University, Egerton, Kenya.

Publication information

PLoS Comput Biol. 2017 Jan 31;13(1):e1005357. doi: 10.1371/journal.pcbi.1005357. eCollection 2017 Jan.

Abstract

A genome-wide association study (GWAS) examines a large number of single nucleotide polymorphisms (SNPs) in a limited sample of hundreds of individuals, which poses a variable selection problem in a high-dimensional dataset. Although many single-locus GWAS approaches that control for polygenic background and population structure have been widely used, some significant loci fail to be detected. In this study, we used an iterative modified sure independence screening (ISIS) approach to reduce the number of SNPs to a moderate size. The Expectation-Maximization (EM)-Bayesian least absolute shrinkage and selection operator (BLASSO) was then used to estimate the effects of all selected SNPs for true quantitative trait nucleotide (QTN) detection. This method is referred to as the ISIS EM-BLASSO algorithm. Monte Carlo simulation studies validated the new method: compared with efficient mixed-model association (EMMA), smoothly clipped absolute deviation (SCAD), fixed and random model circulating probability unification (FarmCPU), and multi-locus random-SNP-effect mixed linear model (mrMLM), it had the highest empirical power in QTN detection and the highest accuracy in QTN effect estimation, and it was the fastest. To further demonstrate the new method, six flowering time traits in Arabidopsis thaliana were re-analyzed with four methods (the new method, EMMA, FarmCPU, and mrMLM). The new method identified most previously reported genes. Therefore, the new method is a good alternative for multi-locus GWAS.
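The screening idea in the abstract (reduce a very large SNP set to a moderate one before estimating effects) can be illustrated with a minimal Python sketch. This is not the authors' ISIS EM-BLASSO implementation: the paper's modified screening rule and the EM-Bayesian LASSO step are not reproduced here. Plain Pearson correlation and single-predictor least-squares residuals are used as stand-ins, and every name (`pearson`, `residualize`, `iterative_screen`, `simulate`) and every parameter value is a hypothetical choice made for illustration.

```python
import random
from statistics import fmean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def residualize(y, x):
    """Residuals of y after a simple regression (with intercept) on one predictor x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    slope = 0.0 if sxx == 0 else sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]


def iterative_screen(snps, y, per_round=2, rounds=2):
    """Iteratively keep the SNPs most correlated with the current residual.

    snps: list of SNP columns, each a list of genotype codes.
    y:    list of phenotype values.
    Returns the indices of the retained SNPs in order of selection.
    """
    selected = []
    resid = list(y)
    for _ in range(rounds):
        # Rank the not-yet-selected SNPs by absolute marginal correlation.
        candidates = [j for j in range(len(snps)) if j not in selected]
        ranked = sorted(candidates,
                        key=lambda j: abs(pearson(snps[j], resid)),
                        reverse=True)
        # Keep the top few, then remove their fitted signal from the residual.
        for j in ranked[:per_round]:
            selected.append(j)
            resid = residualize(resid, snps[j])
    return selected


def simulate(n=200, p=50, seed=0):
    """Toy data (assumed for illustration): SNPs 3 and 17 carry the only true effects."""
    rng = random.Random(seed)
    snps = [[rng.choice((0, 1, 2)) for _ in range(n)] for _ in range(p)]
    y = [3.0 * snps[3][i] - 2.0 * snps[17][i] + rng.gauss(0.0, 0.5)
         for i in range(n)]
    return snps, y


if __name__ == "__main__":
    snps, y = simulate()
    print(iterative_screen(snps, y))
```

In a full multi-locus analysis the retained SNPs would then be passed to a joint shrinkage estimator (EM-BLASSO in the paper); the sketch stops at the screening stage.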

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53af/5308866/e70bd0bd7dd0/pcbi.1005357.g001.jpg