Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, United States of America.
Department of Biostatistics, Yale School of Public Health, Yale University, New Haven, Connecticut, United States of America.
PLoS Genet. 2022 Oct 17;18(10):e1010437. doi: 10.1371/journal.pgen.1010437. eCollection 2022 Oct.
Genome-wide association studies (GWAS) can play an essential role in understanding the genetic basis of complex traits in plants and animals. Conventional SNP-based linear mixed models (LMMs), which test single nucleotide polymorphisms (SNPs) marginally, have successfully identified many loci with major and minor effects. In plants, the relatively small population sizes used in GWAS and the high genetic diversity of many species can impede mapping of complex traits. Here we present a novel haplotype-based trait fine-mapping framework, HapFM, to supplement current GWAS methods. HapFM uses genotype data to partition the genome into haplotype blocks, identifies haplotype clusters within each block, and then performs genome-wide haplotype fine-mapping to prioritize the candidate causal haplotype blocks for a trait. We benchmarked HapFM, GEMMA, BSLMM, GMMAT, and BLINK on both simulated and real plant GWAS datasets. HapFM consistently achieved higher mapping power than the other GWAS methods in the high-polygenicity simulation setting. Moreover, it produced smaller mapping intervals, especially in regions of high linkage disequilibrium (LD), by prioritizing small candidate causal blocks within larger haplotype blocks. In the Arabidopsis flowering time (FT10) dataset, HapFM identified four novel loci relative to GEMMA's results, and the average mapping interval of HapFM was 9.6 times smaller than that of GEMMA. In conclusion, HapFM is tailored for plant GWAS, offering high mapping power for complex traits and improved mapping resolution to facilitate crop improvement.
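The first two steps described above (partitioning the genome into haplotype blocks, then identifying haplotype clusters within each block) can be illustrated with a minimal sketch. This is not HapFM's actual algorithm: the abstract does not specify how blocks or clusters are defined, so the adjacent-SNP r² rule, the `r2_threshold` parameter, the function names, and the toy genotype matrix below are all assumptions made for illustration. The fine-mapping step is not shown.

```python
import numpy as np

def partition_blocks(genotypes, r2_threshold=0.5):
    """Split SNP columns into contiguous blocks (hypothetical rule).

    A new block starts whenever the squared correlation (r^2) between
    two adjacent SNPs drops below r2_threshold.
    """
    blocks = [[0]]
    for j in range(1, genotypes.shape[1]):
        a, b = genotypes[:, j - 1], genotypes[:, j]
        if a.std() == 0 or b.std() == 0:
            r2 = 0.0  # monomorphic SNP: correlation is undefined
        else:
            r2 = np.corrcoef(a, b)[0, 1] ** 2
        if r2 >= r2_threshold:
            blocks[-1].append(j)
        else:
            blocks.append([j])
    return blocks

def haplotype_clusters(genotypes, block):
    """Label each sample by its distinct haplotype within one block.

    Here a "cluster" is simply a set of identical haplotypes; a real
    method would likely also merge near-identical haplotypes.
    """
    _, labels = np.unique(genotypes[:, block], axis=0, return_inverse=True)
    return labels.ravel()

# Toy data (assumed): 6 inbred lines x 4 biallelic SNPs coded 0/1.
G = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
])

blocks = partition_blocks(G)
cluster_labels = [haplotype_clusters(G, block) for block in blocks]
print(blocks)
print(cluster_labels)
```

Under this sketch, each block's cluster labels could be one-hot encoded and used as predictors in place of individual SNPs, which is the general idea behind testing haplotypes rather than single markers.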