Key Laboratory of Qinghai-Tibetan Plateau Animal Genetic Resource Reservation and Utilization, Sichuan Province and Ministry of Education, Southwest Minzu University, Chengdu, Sichuan, China.
Jilin Agricultural Science and Technology University, Jilin, Jilin, China.
Methods Mol Biol. 2022;2481:199-217. doi: 10.1007/978-1-0716-2237-7_13.
Genome-wide association studies (GWAS) are based on the linkage disequilibrium (LD) between phenotypes and genetic markers covering the whole genome. Besides genetic linkage between the markers and the causal mutations, many other factors contribute to LD, including selection and nonrandom mating, both of which shape population structure. Many methods have been developed, each accompanied by corresponding software, such as the multiple loci mixed model (MLMM). Some software packages implement multiple methods to shorten the learning curve. One of them is the Genomic Association and Prediction Integrated Tool (GAPIT), which implements eight models including the General Linear Model (GLM), Mixed Linear Model (MLM), Compressed MLM, MLMM, SUPER (Settlement of mixed linear models Under Progressively Exclusive Relationship), FarmCPU (Fixed and random model Circulating Probability Unification), and BLINK (Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway). Besides offering multiple models, GAPIT provides comprehensive functions for data quality control, data visualization, and publication-quality graphical output, such as Manhattan plots in rectangular and circular formats, quantile-quantile (QQ) plots, principal component plots, scatter plots of minor allele frequency against GWAS signals, and plots of LD between associated markers and adjacent markers. GAPIT developers and users have established a community of over 600 members through the GAPIT forum ( https://groups.google.com/g/gapit-forum ) for asking questions, making comments, and sharing experiences. In this chapter, we detail the GAPIT functions, input data frames, output files, and example code for each GWAS model. We also interpret the parameters, functional algorithms, and modules of the GAPIT implementation.
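To make concrete why population structure matters for a GLM-style scan, the following is a minimal sketch in Python with NumPy. It is not GAPIT code (GAPIT is an R package) and does not reproduce any GAPIT algorithm. The function name `glm_marker_scan`, the simulated two-subpopulation data, and all parameter values are assumptions chosen for illustration. Each marker is tested by ordinary least squares, first without and then with a subpopulation covariate.

```python
import numpy as np


def glm_marker_scan(y, G, covariates=None):
    """Return one t statistic per marker from the OLS fit y ~ 1 + covariates + marker.

    Hypothetical helper for illustration only; not part of GAPIT.
    """
    n, m = G.shape
    intercept = np.ones((n, 1))
    base = intercept if covariates is None else np.column_stack([intercept, covariates])
    t_stats = np.empty(m)
    for j in range(m):
        X = np.column_stack([base, G[:, j]])          # design matrix, marker in last column
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        dof = n - X.shape[1]
        sigma2 = resid @ resid / dof                  # residual variance estimate
        cov_beta = sigma2 * np.linalg.inv(X.T @ X)    # covariance of the estimates
        t_stats[j] = beta[-1] / np.sqrt(cov_beta[-1, -1])
    return t_stats


# Simulated data (all values are assumptions): two subpopulations whose
# allele frequencies differ at every marker, and a phenotype that depends
# on subpopulation membership plus a single causal marker (index 0).
rng = np.random.default_rng(0)
n, m = 400, 50
pop = np.repeat([0.0, 1.0], n // 2)
freq = np.where(pop == 0.0, 0.2, 0.8)[:, None]       # per-individual allele frequency
G = rng.binomial(2, freq, size=(n, m)).astype(float)  # genotypes coded 0/1/2
y = 2.0 * pop + 1.0 * G[:, 0] + rng.normal(size=n)

t_naive = glm_marker_scan(y, G)                       # ignores population structure
t_adj = glm_marker_scan(y, G, covariates=pop)         # adjusts for subpopulation

print("median |t| at non-causal markers, unadjusted:", np.median(np.abs(t_naive[1:])))
print("median |t| at non-causal markers, adjusted:  ", np.median(np.abs(t_adj[1:])))
print("marker with largest |t| after adjustment:    ", int(np.argmax(np.abs(t_adj))))
```

In this simulation every marker is correlated with subpopulation membership, so the unadjusted scan assigns large test statistics to non-causal markers. Adding the covariate removes that confounding, which is the role principal components or a kinship matrix play in real analyses.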