AgriBio, Centre for AgriBioscience, Agriculture Victoria, Melbourne, VIC, Australia.
School of Applied Systems Biology, La Trobe University, Melbourne, VIC, Australia.
BMC Genomics. 2018 May 24;19(1):395. doi: 10.1186/s12864-018-4800-0.
Topological association domains (TADs) are chromosomal domains characterised by frequent internal DNA-DNA interactions. The transcription factor CTCF binds to conserved DNA sequence patterns called CTCF binding motifs to either prohibit or facilitate chromosomal interactions. TADs and CTCF binding motifs control gene expression, but they are not yet well defined in the bovine genome. In this paper, we sought to improve the annotation of bovine TADs and CTCF binding motifs, and assess whether the new annotation can reduce the search space for cis-regulatory variants.
We used genomic synteny to map TADs and CTCF binding motifs from humans, mice, dogs and macaques to the bovine genome. We found that our mapped TADs exhibited the same hallmark properties as those sourced from experimental data, such as housekeeping genes, transfer RNA genes, CTCF binding motifs, short interspersed elements, H3K4me3 and H3K27ac. We showed that runs of genes with the same pattern of allele-specific expression (ASE), favouring either the paternal or the maternal allele, were often located in the same TAD or between the same conserved CTCF binding motifs. Analyses of variance showed that, when averaged across all bovine tissues tested, TADs explained 14% of ASE variation (standard deviation, SD: 0.056), while CTCF explained 27% (SD: 0.078). Furthermore, we showed that the quantitative trait loci (QTLs) associated with gene expression variation (eQTLs) or ASE variation (aseQTLs), which were identified from mRNA transcripts from the white blood cells and milk cells of 141 lactating cows, were highly enriched at putative bovine CTCF binding motifs. For each genic target, the linearly furthest aseQTL and eQTL, and the most significant aseQTL and eQTL, were located within the same TAD as the gene more often than expected (Chi-squared test P-value < 0.001).
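As a rough illustration of what "TADs explained 14% of ASE variation" can mean, the sketch below computes the share of total variance attributable to a grouping factor (between-group sum of squares over total sum of squares, a one-way eta-squared). This is an assumption about the general form of such a statistic, not the model used in the study, which the abstract does not specify. The function name `variance_explained` and the toy ASE values and TAD labels are hypothetical.

```python
from collections import defaultdict


def variance_explained(values, groups):
    """Fraction of total variance explained by group membership.

    Computes SS_between / SS_total for a one-way grouping
    (for example, ASE values of genes grouped by TAD).
    """
    if len(values) != len(groups) or not values:
        raise ValueError("values and groups must be non-empty and equal in length")

    grand_mean = sum(values) / len(values)

    # Collect the values belonging to each group label.
    by_group = defaultdict(list)
    for value, group in zip(values, groups):
        by_group[group].append(value)

    ss_total = sum((v - grand_mean) ** 2 for v in values)
    if ss_total == 0:
        return 0.0  # no variation at all, so nothing to explain

    # Each group contributes n_g * (group mean - grand mean)^2.
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total


# Hypothetical toy data: four genes, two TADs (not values from the study).
ase_values = [0.0, 2.0, 2.0, 4.0]
tad_labels = ["TAD_A", "TAD_A", "TAD_B", "TAD_B"]
print(variance_explained(ase_values, tad_labels))
```

In this toy case half of the variation lies between the two TADs and half within them. A per-tissue analysis would repeat the calculation for each tissue and then average the results, which is one way an average and a standard deviation across tissues could arise.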
Our results suggest that genomic synteny can be used to functionally annotate conserved transcriptional components, and that it provides a tool to reduce the search space for causative regulatory variants in the bovine genome.