Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.
BMC Genet. 2019 Jan 29;20(1):15. doi: 10.1186/s12863-019-0717-0.
Genome-wide association studies (GWAS) have been successfully implemented in cattle research and breeding. However, moving from associations to identifying the causal variants and revealing the underlying mechanisms has proven complicated. In dairy cattle populations, a particular challenge is long-range linkage disequilibrium (LD), which arises from close familial relationships among the studied individuals. Long-range LD makes it difficult to distinguish whether one or multiple quantitative trait loci (QTL) are segregating in a genomic region that shows association with a phenotype. We had two objectives in this study: 1) to distinguish between multiple QTL segregating in a genomic region, and 2) to use external information to prioritize candidate genes for a QTL, along with candidate variants.
We observed that including the lead SNP as a fixed covariate can help to distinguish additional association signal(s) located close by. Using the mammalian phenotype database, we then identified candidate genes that were in concordance with previous studies, demonstrating the power of this strategy. We also used variant annotation information to search for causative variants in our candidate genes. The variant annotation recovered known causal mutations and showed the potential to pinpoint causative mutation(s) located in coding regions.
Our approach can distinguish multiple QTL segregating on the same chromosome in a single analysis without manual input. Moreover, using information from the mammalian phenotype database and the variant effect predictor in post-GWAS analysis could aid the identification of candidate genes and causative mutations in cattle. Our study not only identified additional candidate genes for milk traits; the approach can also serve as a routine method for GWAS in dairy cattle.
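The conditional step described in the abstract (adding the lead SNP as a covariate and rescanning the region until no further signal remains) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes ordinary least squares in place of whatever association model the study used, works on simulated genotypes, and uses an arbitrary significance threshold. The function names `simulate_genotypes`, `single_snp_scan` and `conditional_scan`, and all parameter values, are hypothetical.

```python
import numpy as np


def simulate_genotypes(rng, n, m, copy_prob=0.8):
    """Simulate 0/1/2 genotypes with LD between neighbouring SNPs (toy model)."""
    hap = np.zeros((2 * n, m), dtype=int)
    hap[:, 0] = rng.integers(0, 2, size=2 * n)
    for j in range(1, m):
        fresh = rng.integers(0, 2, size=2 * n)
        keep = rng.random(2 * n) < copy_prob
        # Each allele copies its left neighbour with probability copy_prob.
        hap[:, j] = np.where(keep, hap[:, j - 1], fresh)
    return (hap[:n] + hap[n:]).astype(float)


def single_snp_scan(y, genotypes, covariates=None):
    """Return one OLS t-statistic per SNP, adjusting for optional covariates."""
    n, m = genotypes.shape
    base = np.ones((n, 1))
    if covariates is not None:
        base = np.hstack([base, covariates])
    stats = np.zeros(m)
    for j in range(m):
        X = np.hstack([base, genotypes[:, [j]]])
        # Skip SNPs collinear with the covariates (e.g. the lead SNP itself).
        if np.linalg.matrix_rank(X) < X.shape[1]:
            continue
        beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[-1, -1])
        stats[j] = beta[-1] / se
    return stats


def conditional_scan(y, genotypes, threshold=5.0, max_rounds=5):
    """Repeatedly add the current lead SNP as a covariate and rescan."""
    lead_snps = []
    for _ in range(max_rounds):
        cov = genotypes[:, lead_snps] if lead_snps else None
        t = np.abs(single_snp_scan(y, genotypes, cov))
        best = int(np.argmax(t))
        if t[best] < threshold:  # assumed, arbitrary cut-off
            break
        lead_snps.append(best)
    return lead_snps


# Toy example: two simulated QTL (SNP indices 10 and 25) in one region.
rng = np.random.default_rng(0)
G = simulate_genotypes(rng, n=2000, m=40)
y = 0.8 * G[:, 10] + 0.5 * G[:, 25] + rng.normal(size=2000)
print(conditional_scan(y, G))
```

In this toy setting the loop stops on its own once no SNP passes the threshold, which mirrors the idea of separating several signals in one run without manual intervention. A real analysis would need a model that accounts for relatedness among animals and a properly justified significance threshold.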