Animal Genomics, ETH Zurich, Universitaetstrasse 2, 8092, Zurich, Switzerland.
Genetics. 2023 Nov 1;225(3). doi: 10.1093/genetics/iyad161.
Structural variants (SVs) and short tandem repeats (STRs) are significant sources of genetic variation. However, the impacts of these variants on gene regulation have not been investigated in cattle. Here, we genotyped and characterized 19,408 SVs and 374,821 STRs in 183 bovine genomes and investigated their impact on molecular phenotypes derived from testis transcriptomes. We found that 71% of STRs were multiallelic. The vast majority (95%) of STRs and SVs were in intergenic and intronic regions. Only 37% of SVs and 40% of STRs were in high linkage disequilibrium (LD) (R² > 0.8) with surrounding SNPs/insertions and deletions (Indels), indicating that SNP-based association testing and genomic prediction are blind to a nonnegligible portion of genetic variation. We showed that both SVs and STRs were more than 2-fold enriched among expression and splicing QTL (e/sQTL) relative to SNPs/Indels and were often associated with differential expression and splicing of multiple genes. Deletions and duplications had larger impacts on splicing and expression than any other type of SV. Exonic duplications predominantly increased gene expression either through alternative splicing or other mechanisms, whereas expression- and splicing-associated STRs primarily resided in intronic regions and exhibited bimodal effects on the molecular phenotypes investigated. Most e/sQTL resided within 100 kb of the affected genes or splicing junctions. We pinpoint candidate causal STRs and SVs associated with the expression of SLC13A4 and TTC7B and alternative splicing of a lncRNA and CAPP1. We provide a catalog of STRs and SVs for taurine cattle and show that these variants contribute substantially to gene expression and splicing variation.
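The LD criterion above (R² > 0.8 with surrounding SNPs/Indels) can be illustrated with a minimal sketch. It assumes each variant is coded as a per-sample numeric dosage (for a multiallelic STR, some numeric summary such as summed repeat length) and that the surrounding SNPs are already selected. The function names, the dosage coding, and the use of squared Pearson correlation are assumptions made for illustration; the abstract does not state how LD was actually computed.

```python
import numpy as np

def max_r2_with_neighbors(variant_dosage, snp_dosages):
    """Largest squared Pearson correlation between one variant and nearby SNPs.

    variant_dosage: per-sample dosages of the SV/STR, shape (n_samples,)
    snp_dosages:    dosages of surrounding SNPs, shape (n_snps, n_samples);
                    at least one SNP is assumed to be present.
    """
    v = np.asarray(variant_dosage, dtype=float)
    snps = np.asarray(snp_dosages, dtype=float)
    v_centered = v - v.mean()
    s_centered = snps - snps.mean(axis=1, keepdims=True)
    denom = np.sqrt((v_centered ** 2).sum() * (s_centered ** 2).sum(axis=1))
    # Monomorphic sites have zero variance; their correlation is set to 0.
    r = np.divide(s_centered @ v_centered, denom,
                  out=np.zeros_like(denom), where=denom > 0)
    return float((r ** 2).max())

def is_tagged(variant_dosage, snp_dosages, threshold=0.8):
    """True if at least one nearby SNP exceeds the LD threshold."""
    return max_r2_with_neighbors(variant_dosage, snp_dosages) > threshold

# Toy data (hypothetical): six samples, one SV, three nearby SNPs.
sv = [0, 1, 2, 1, 0, 2]
nearby = [[0, 1, 2, 1, 0, 2],   # identical to the SV
          [2, 1, 0, 1, 2, 0],   # perfectly anti-correlated
          [0, 0, 1, 2, 1, 0]]   # uncorrelated with the SV
tagged = is_tagged(sv, nearby)
untagged = is_tagged(sv, nearby[2:])
```

A variant for which `is_tagged` returns `False` is one that association tests restricted to SNPs/Indels would largely miss, which is the sense in which such testing is described as blind to part of the variation.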