The Davies Research Centre, School of Animal and Veterinary Sciences, The University of Adelaide, Roseworthy SA 5371, Australia.
Robinson Research Institute, The University of Adelaide, North Adelaide SA 5006, Australia.
Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae061.
Most DNA methylation studies have used a single reference genome, with little attention paid to the bias introduced by the choice of reference. Reference genome artifacts and genetic variation, including single nucleotide polymorphisms (SNPs) and structural variants (SVs), can lead to differences in methylation sites (CpGs) between individuals of the same species. We analyzed whole-genome bisulfite sequencing data from the fetal liver of Angus (Bos taurus taurus), Brahman (Bos taurus indicus), and reciprocally crossed samples. Using reference genomes for each breed from the Bovine Pangenome Consortium, we investigated the influence of reference genome choice on breed and parent-of-origin effects in methylome analyses.
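To illustrate how a single SNP can create or remove a CpG site between two references, the following minimal sketch compares CpG positions in two toy sequences. The sequences, the function name `cpg_positions`, and the variable names are hypothetical and are not taken from the study. The sketch also assumes the two sequences differ only by a substitution, so positions are directly comparable; real genomes would first need to be aligned.

```python
def cpg_positions(seq: str) -> set[int]:
    """Return the 0-based start position of every CG dinucleotide in seq."""
    seq = seq.upper()
    return {i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"}


# Toy, made-up sequences standing in for the same locus in two references.
# They differ by one substitution (G>A at position 7), which removes a CpG.
ref_a = "ATCGGACGTTA"
ref_b = "ATCGGACATTA"

sites_a = cpg_positions(ref_a)
sites_b = cpg_positions(ref_b)

shared = sites_a & sites_b   # CpGs present in both references
only_a = sites_a - sites_b   # CpGs specific to reference A
only_b = sites_b - sites_a   # CpGs specific to reference B

print("shared:", sorted(shared))
print("only in A:", sorted(only_a))
print("only in B:", sorted(only_b))
```

Reads carrying the CpG at position 6 would have no matching CpG to be counted against if mapped to the second reference, which is one way the choice of reference could bias methylation calls.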
Our findings revealed that ∼75% of CpG sites were shared between Angus and Brahman, ∼5% were breed specific, and ∼20% were unresolved. We demonstrated up to ∼2% quantification bias in global methylation when an incorrect reference genome was used. Furthermore, we found that SNPs impacted CpGs 13 times more than other autosomal sites ($P < 5 \times 10^{-324}$) and that SVs contained 1.18 times more CpGs than non-SV regions ($P < 5 \times 10^{-324}$). We found poor overlap between differentially methylated regions (DMRs) and differentially expressed genes (DEGs) and suggest that DMRs may be impacting enhancers that target these DEGs. DMRs overlapped with imprinted genes, one of which, DGAT1, which is important for fat metabolism and weight gain, was found in both the breed-specific and sire-of-origin comparisons.
This work demonstrates the need to consider reference genome effects to explore genetic and epigenetic differences accurately and identify DMRs involved in controlling certain genes.