Holland James B, Piepho Hans-Peter
USDA-ARS Plant Science Research Unit and Department of Crop and Soil Sciences and NC Plant Science Initiative, North Carolina State University, Raleigh, NC 27606, USA.
Biostatistics Unit, Institute of Crop Science, University of Hohenheim, Stuttgart 70599, Germany.
G3 (Bethesda). 2024 Nov 19;14(12). doi: 10.1093/g3journal/jkae250.
Large, complex data sets can be difficult to model in a single comprehensive genome-wide association study (GWAS). The best practice for two-stage analyses is to consider lines as fixed effects in the first-stage statistical model. Best linear unbiased estimates of lines can then be used as input phenotypes to the second-stage analysis. In the second stage, lines can be modeled as random effects with genomic relationships to adjust for population structure when estimating individual SNP effects in GWAS.
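The two-stage workflow described in the abstract can be illustrated with a minimal sketch on simulated data. Everything here is an assumption made for illustration and is not taken from the paper: the simulated design, the function names `stage1_blues` and `stage2_gwas`, the VanRaden-style relationship matrix, and the grid search over a variance ratio `h2`. For simplicity the second stage also ignores the precision (standard errors) of the first-stage estimates.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lines, n_envs, n_snps = 60, 3, 200

# Simulated markers coded 0/1/2 and a VanRaden-style genomic relationship matrix.
M = rng.binomial(2, 0.5, size=(n_lines, n_snps)).astype(float)
p = M.mean(axis=0) / 2.0
Z = M - 2.0 * p
K = Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))

# Simulated plot-level phenotypes: one causal SNP (index 0), environment effects, noise.
line_id = np.repeat(np.arange(n_lines), n_envs)
env_id = np.tile(np.arange(n_envs), n_lines)
true_effect = 2.0
y = (10.0 + true_effect * M[line_id, 0]
     + np.array([0.0, 1.5, -1.0])[env_id]
     + rng.normal(0.0, 1.0, size=line_id.size))

def stage1_blues(y, line_id, env_id, n_lines, n_envs):
    """Stage 1: lines as FIXED effects; returns adjusted line means (BLUEs)."""
    X_line = np.eye(n_lines)[line_id]          # one dummy column per line
    X_env = np.eye(n_envs)[env_id][:, 1:]      # first environment is the reference
    X = np.hstack([X_line, X_env])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    env_eff = np.concatenate([[0.0], coef[n_lines:]])
    # Line coefficient plus the average environment effect.
    return coef[:n_lines] + env_eff.mean()

def stage2_gwas(blues, M, K, h2_grid=np.linspace(0.01, 0.99, 50)):
    """Stage 2: lines as RANDOM effects with covariance proportional to K."""
    n = blues.size
    evals, U = np.linalg.eigh(K)
    evals = np.clip(evals, 0.0, None)
    yt, ones_t = U.T @ blues, U.T @ np.ones(n)  # rotate so the covariance is diagonal

    def gls(Xt, h2):
        w = 1.0 / (h2 * evals + (1.0 - h2))    # inverse variances up to a scale factor
        XtW = Xt.T * w
        A = XtW @ Xt
        beta = np.linalg.solve(A, XtW @ yt)
        resid = yt - Xt @ beta
        sigma2 = np.sum(w * resid**2) / n
        loglik = -0.5 * (n * np.log(sigma2) - np.sum(np.log(w)))
        return beta, sigma2, A, loglik

    # Choose h2 once under the null (intercept-only) model by profile likelihood.
    h2 = max(h2_grid, key=lambda h: gls(ones_t[:, None], h)[3])

    effects, z = np.empty(M.shape[1]), np.empty(M.shape[1])
    for j in range(M.shape[1]):
        Xt = np.column_stack([ones_t, U.T @ M[:, j]])
        beta, sigma2, A, _ = gls(Xt, h2)
        se = np.sqrt(sigma2 * np.linalg.inv(A)[1, 1])
        effects[j], z[j] = beta[1], beta[1] / se
    return effects, z, h2

blues = stage1_blues(y, line_id, env_id, n_lines, n_envs)
effects, z, h2 = stage2_gwas(blues, M, K)
print("top SNP:", int(np.argmax(np.abs(z))), "estimated effect:", effects[0])
```

Because the simulated design is balanced, the first-stage estimates coincide with simple line means. With unbalanced or multi-environment data they would generally differ, which is where fitting an explicit first-stage model matters.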