Suppr超能文献

比较大豆和玉米关联定位中的不同统计模型及多重检验校正

Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize.

作者信息

Kaler Avjinder S, Gillman Jason D, Beissinger Timothy, Purcell Larry C

机构信息

Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States.

Plant Genetic Research Unit, USDA-ARS, Columbia, MO, United States.

出版信息

Front Plant Sci. 2020 Feb 25;10:1794. doi: 10.3389/fpls.2019.01794. eCollection 2019.

Abstract

Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single locus models and can introduce false negatives due to over fitting of the model. In this study, eight different statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean ( L.) and maize ( L.). Soybean and maize were chosen, in part, due to their highly differentiated rate of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) performed better than other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that the FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and results indicated that the FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, ANOVA, GLM, and SUPER models found an excessive number of markers, far more than 20 QTLs. The FarmCPU model, using less conservative methods (false discovery rate, and positive false discovery rate) identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by other models.

摘要

关联分析(AM)是一种强大的工具,可通过利用历史重组事件将复杂性状变异精细定位到核苷酸序列。AM中的一个主要问题是控制由群体结构和家族相关性引起的假阳性。通常通过在混合线性模型(MLM)中纳入结构和亲属关系的协变量来控制假阳性。这些基于MLM的方法是单基因座模型,由于模型过度拟合可能会引入假阴性。在本研究中,针对两种作物(大豆(Glycine max (L.) Merr.)和玉米(Zea mays L.))中遗传力不同的三个性状,比较了从单基因座到多基因座的八种不同统计模型用于AM。选择大豆和玉米部分原因是它们的连锁不平衡(LD)衰减速率高度不同,这会影响假阳性和假阴性率。基于Q-Q图分析以及在模拟数据集中对已知数量的数量性状基因座(QTL)的识别,固定和随机模型循环概率统一法(FarmCPU)比其他模型表现更好。这些结果表明FarmCPU能同时控制假阳性和假阴性。还使用了大豆中六个已知已发表基因组位置的质量性状来比较这些模型,结果表明FarmCPU始终能识别出最接近这些已知已发表基因的单个高度显著的SNP。使用具有60%遗传力和20个QTL的模拟性状,对这些模型的多重比较调整(邦费罗尼校正、错误发现率和阳性错误发现率)进行了比较。对于MLM、CMLM、ECMLM和MLMM,多重比较调整过于保守,未发现任何显著标记;相比之下,方差分析(ANOVA)、广义线性模型(GLM)和SUPER模型发现了过多的标记,远远超过20个QTL。使用不太保守方法(错误发现率和阳性错误发现率)的FarmCPU模型识别出10个QTL,比其他模型发现的数量更接近模拟的QTL数量。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0750/7052329/c3921fc587b5/fpls-10-01794-g001.jpg

文献检索

告别复杂PubMed语法,用中文像聊天一样搜索,搜遍4000万医学文献。AI智能推荐,让科研检索更轻松。

立即免费搜索

文件翻译

保留排版,准确专业,支持PDF/Word/PPT等文件格式,支持 12+语言互译。

免费翻译文档

深度研究

AI帮你快速写综述,25分钟生成高质量综述,智能提取关键信息,辅助科研写作。

立即免费体验