College of Animal Science and Technology, Yangzhou University, Yangzhou 225009, China.
Department of Animal Production, Faculty of Agriculture, Kafrelsheikh University, Kafrelsheikh 33516, Egypt.
Viruses. 2024 Mar 4;16(3):398. doi: 10.3390/v16030398.
Interest in endogenous retroviruses (ERVs) has been fueled by their impact on the evolution of the host genome. In this study, we used multiple pipelines to detect and annotate ERVs de novo in 13 species of the Caprinae subfamily. Through analyses of sequence identity, structural organization, and phylogeny, we defined 28 ERV groups within Caprinae, comprising 19 gamma retrovirus groups and 9 beta retrovirus groups. Notably, we identified four recent and potentially active groups that are prevalent in the Caprinae genomes. We also found that most long noncoding RNA (lncRNA) genes and protein-coding (PC) genes contain ERV-derived sequences. In sheep, ERV-derived sequences were present in approximately 75% of protein-coding genes and 81% of lncRNA genes; in goats, they were present in approximately 74% of protein-coding genes and 75% of lncRNA genes. Our findings indicate that the majority of ERVs in the Caprinae genomes can be categorized as fossils: remnants of past retroviral infections that have become permanently integrated into the genomes. Nevertheless, the identification of the Cap_ERV_20, Cap_ERV_21, Cap_ERV_24, and Cap_ERV_25 groups indicates that relatively recent and potentially active ERVs are present in these genomes. These groups may contribute to the ongoing evolution of the Caprinae genome, and the identification of putatively active ERVs raises the possibility of harnessing them for future genetic marker development.