Comparing Different Statistical Models and Multiple Testing Corrections for Association Mapping in Soybean and Maize

Author information

Kaler Avjinder S, Gillman Jason D, Beissinger Timothy, Purcell Larry C

Affiliations

Department of Crop, Soil, and Environmental Sciences, University of Arkansas, Fayetteville, AR, United States.

Plant Genetic Research Unit, USDA-ARS, Columbia, MO, United States.

Publication information

Front Plant Sci. 2020 Feb 25;10:1794. doi: 10.3389/fpls.2019.01794. eCollection 2019.

DOI:10.3389/fpls.2019.01794

PMID:32158452

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7052329/

Abstract

Association mapping (AM) is a powerful tool for fine mapping complex trait variation down to nucleotide sequences by exploiting historical recombination events. A major problem in AM is controlling false positives that can arise from population structure and family relatedness. False positives are often controlled by incorporating covariates for structure and kinship in mixed linear models (MLM). These MLM-based methods are single-locus models and can introduce false negatives due to overfitting of the model. In this study, eight statistical models, ranging from single-locus to multilocus, were compared for AM for three traits differing in heritability in two crop species: soybean (Glycine max (L.) Merr.) and maize (Zea mays L.). Soybean and maize were chosen, in part, because of their markedly different rates of linkage disequilibrium (LD) decay, which can influence false positive and false negative rates. The fixed and random model circulating probability unification (FarmCPU) model performed better than the other models based on an analysis of Q-Q plots and on the identification of the known number of quantitative trait loci (QTLs) in a simulated data set. These results indicate that FarmCPU controls both false positives and false negatives. Six qualitative traits in soybean with known published genomic positions were also used to compare these models, and the results indicated that FarmCPU consistently identified a single highly significant SNP closest to these known published genes. Multiple comparison adjustments (Bonferroni, false discovery rate, and positive false discovery rate) were compared for these models using a simulated trait having 60% heritability and 20 QTLs. Multiple comparison adjustments were overly conservative for MLM, CMLM, ECMLM, and MLMM and did not find any significant markers; in contrast, the ANOVA, GLM, and SUPER models found an excessive number of markers, far more than the 20 simulated QTLs. The FarmCPU model, using the less conservative methods (false discovery rate and positive false discovery rate), identified 10 QTLs, which was closer to the simulated number of QTLs than the number found by the other models.
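To make the contrast between the adjustments concrete, the following is a minimal sketch of two of the corrections named in the abstract: the Bonferroni threshold and the false discovery rate, here taken to be the Benjamini-Hochberg step-up procedure (an assumption, since the abstract does not say which FDR procedure was used). The function names and the p-values are hypothetical and invented for illustration; they are not data or code from the study, and the positive false discovery rate is not implemented here.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return indices of tests with p <= alpha / m (Bonferroni)."""
    m = len(p_values)
    threshold = alpha / m
    return [i for i, p in enumerate(p_values) if p <= threshold]


def benjamini_hochberg_significant(p_values, alpha=0.05):
    """Return indices rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank
    # Reject every hypothesis ranked at or below that k.
    return sorted(order[:largest_k])


# Hypothetical marker p-values (illustration only, not from the paper).
p_values = [0.001, 0.008, 0.012, 0.041, 0.20, 0.35, 0.50, 0.62, 0.77, 0.90]

bonf = bonferroni_significant(p_values)
bh = benjamini_hochberg_significant(p_values)
print("Bonferroni:", bonf)
print("Benjamini-Hochberg:", bh)
```

With these invented values, Bonferroni keeps only the first marker, while the Benjamini-Hochberg procedure keeps the first three. This illustrates why a false discovery rate adjustment is generally less conservative than Bonferroni when applied to the same set of p-values.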

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0750/7052329/c3921fc587b5/fpls-10-01794-g001.jpg
