Sanchez Marie-Pierre, Govignon-Gion Armelle, Croiseau Pascal, Fritz Sébastien, Hozé Chris, Miranda Guy, Martin Patrice, Barbat-Leterrier Anne, Letaïef Rabia, Rocha Dominique, Brochard Mickaël, Boussaha Mekki, Boichard Didier
GABI, INRA, AgroParisTech, Université Paris Saclay, 78350, Jouy-en-Josas, France.
Institut de l'Elevage, 75012, Paris, France.
Genet Sel Evol. 2017 Sep 18;49(1):68. doi: 10.1186/s12711-017-0344-z.
Genome-wide association studies (GWAS) were performed at the sequence level to identify candidate mutations that affect the expression of six major milk proteins in Montbéliarde (MON), Normande (NOR), and Holstein (HOL) dairy cattle. Whey protein (α-lactalbumin and β-lactoglobulin) and casein (αs1, αs2, β, and κ) contents were estimated by mid-infrared (MIR) spectrometry, with medium to high accuracy (0.59 ≤ R ≤ 0.92), for 848,068 test-day milk samples from 156,660 cows in their first three lactations. Milk composition was evaluated as the average of test-day measurements adjusted for environmental effects. Next, we genotyped a subset of 8080 cows (2967 MON, 2737 NOR, and 2306 HOL) with the BovineSNP50 Beadchip. For each breed, genotypes were first imputed to high density (HD) using the HD single nucleotide polymorphism (SNP) genotypes of 522 MON, 546 NOR, and 776 HOL bulls. The resulting HD SNP genotypes were subsequently imputed to the sequence level using 27 million high-quality sequence variants selected from Run4 of the 1000 Bull Genomes consortium (1147 bulls). Within-breed, multi-breed, and conditional GWAS were then performed.
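To make the difference between a marginal and a conditional association test concrete, the following is a minimal sketch and not the authors' pipeline. The abstract does not state the statistical model, so ordinary least squares on allele dosages is assumed here, relatedness among animals is ignored, and the data are simulated. All function names (`single_snp_test`, `residualize`, `conditional_test`, `simulate`) are hypothetical.

```python
import math
import random


def single_snp_test(dosages, phenotypes):
    """Regress phenotype on allele dosage (0 to 2); return (effect, t_statistic)."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    beta = sxy / sxx
    residual_ss = sum(
        ((y - mean_y) - beta * (x - mean_x)) ** 2
        for x, y in zip(dosages, phenotypes)
    )
    standard_error = math.sqrt(residual_ss / (n - 2) / sxx)
    return beta, beta / standard_error


def residualize(values, covariate):
    """Remove the mean and the linear effect of `covariate` from `values`."""
    n = len(values)
    mean_v = sum(values) / n
    mean_c = sum(covariate) / n
    scc = sum((c - mean_c) ** 2 for c in covariate)
    slope = sum(
        (c - mean_c) * (v - mean_v) for c, v in zip(covariate, values)
    ) / scc
    return [(v - mean_v) - slope * (c - mean_c) for c, v in zip(covariate, values)]


def conditional_test(candidate, lead, phenotypes):
    """Test `candidate` after adjusting both it and the phenotype for `lead`.

    The effect estimate equals the one from a joint two-SNP regression;
    the t statistic is approximate (n - 2 instead of n - 3 degrees of freedom).
    """
    return single_snp_test(
        residualize(candidate, lead), residualize(phenotypes, lead)
    )


def simulate(n=2000, seed=1):
    """Simulated (assumed) data: one causal SNP and one neighbour in strong LD."""
    rng = random.Random(seed)
    causal = [rng.choice([0, 1, 2]) for _ in range(n)]
    # The linked SNP copies the causal genotype about 90% of the time.
    linked = [g if rng.random() > 0.1 else rng.choice([0, 1, 2]) for g in causal]
    phenotypes = [0.5 * g + rng.gauss(0.0, 1.0) for g in causal]
    return causal, linked, phenotypes


if __name__ == "__main__":
    causal, linked, phenotypes = simulate()
    print("marginal, linked SNP:", single_snp_test(linked, phenotypes))
    print("conditional on causal SNP:", conditional_test(linked, causal, phenotypes))
```

In this toy setting the linked SNP shows a strong marginal signal that largely disappears once the causal SNP is conditioned on, which is the logic a conditional GWAS uses to separate a putative causative variant from its neighbours in linkage disequilibrium.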
Thirty-four distinct genomic regions were identified. Three regions on chromosomes 6, 11, and 20 had highly significant effects on milk composition and were shared across the three breeds. Other significant effects, which partially overlapped across breeds, were found on almost all autosomes. Multi-breed analyses yielded a larger number of significant genomic regions, with smaller confidence intervals, than within-breed analyses. Combining within-breed, multi-breed, and conditional analyses led to the identification of putative causative variants in several candidate genes that showed significant enrichment for protein-protein interactions, including genes with previously described effects on milk composition (SLC37A1, MGST1, ABCG2, CSN1S1, CSN2, CSN1S2, CSN3, PAEP, DGAT1, AGPAT6) and genes whose effects are reported here for the first time (ALPL, ANKH, PICALM).
GWAS applied to fine-scale phenotypes, multiple breeds, and whole-genome sequences appears to be effective for identifying candidate gene variants. However, although we identified functional links between some candidate genes and milk phenotypes, the causality between candidate variants and milk protein composition remains to be demonstrated. Nevertheless, the identification of potential causative mutations that underlie milk protein composition may have immediate applications for improving cheese-making.