Animal Breeding and Genetics Program, IRTA, Torre Marimon, Caldes de Montbui (08140), Spain.
GABI, Université Paris-Saclay, INRAE, AgroParisTech, Jouy-en-Josas (78350), France.
Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad042. Epub 2023 Jun 24.
In humans and livestock species, genome-wide association studies (GWAS) have been applied to study the association between variants distributed across the genome and a phenotype of interest. To discover genetic polymorphisms affecting the duodenum, liver, and muscle transcriptomes of 300 pigs from 3 different breeds (Duroc, Landrace, and Large White), we performed expression GWAS between 25,315,878 polymorphisms and the expression of 13,891 genes in duodenum, 12,748 genes in liver, and 11,617 genes in muscle.
More than 9.68 × 10¹¹ association tests were performed, yielding 14,096,080 significantly associated variants, which were grouped in 26,414 expression quantitative trait locus (eQTL) regions. Over 56% of the variants were within 1 Mb of their associated gene. In addition to the 100-kb region upstream of the transcription start site, we identified the importance of the 100-kb region downstream of the 3'UTR for gene regulation, as most of the cis-regulatory variants were located within these 2 regions. We also observed 39,874 hotspot regulatory polymorphisms associated with the expression of 10 or more genes that could modify the protein structure or the expression of a regulator gene. In addition, 2 motifs (5'-GATCCNGYGTTGCYG-3' and a poly(A) sequence) were enriched across the 3 tissues within the neighboring sequences of the most significant single-nucleotide polymorphisms in each cis-eQTL region.
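The reported number of association tests can be checked against the variant and gene counts given above. The following is a minimal sketch, assuming that every polymorphism was tested against every expressed gene in each tissue. The abstract does not state this design explicitly, and the function and variable names are illustrative only.

```python
# Counts quoted in the abstract.
N_VARIANTS = 25_315_878
GENES_PER_TISSUE = {"duodenum": 13_891, "liver": 12_748, "muscle": 11_617}


def total_association_tests(n_variants: int, genes_per_tissue: dict) -> int:
    """Count tests assuming one test per (variant, gene) pair in each tissue.

    This all-against-all design is an assumption made for illustration.
    """
    return sum(n_variants * n_genes for n_genes in genes_per_tissue.values())


total = total_association_tests(N_VARIANTS, GENES_PER_TISSUE)
print(f"{total:.2e} association tests")
```

Under this assumption the total falls between 9.68 × 10¹¹ and 9.69 × 10¹¹, which agrees with the "more than 9.68 × 10¹¹" figure.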
The 14 million significant associations obtained in this study are publicly available and have enabled the identification of expression-associated cis-, trans-, and hotspot regulatory variants within and across tissues, thus shedding light on the molecular mechanisms of regulatory variations that shape end-trait phenotypes.