Zhou Meng, Smith Andrew D
Molecular and Computational Biology Section, Division of Biological Sciences, University of Southern California, Los Angeles, USA.
Mob DNA. 2019 Apr 8;10:14. doi: 10.1186/s13100-019-0156-5. eCollection 2019.
L1Md retrotransposons are the most abundant and active transposable elements in the mouse genome. The promoters of many L1Md retrotransposons are composed of tandem repeats called monomers. The number of monomers varies between retrotransposon copies, which makes L1Md promoters difficult to annotate. Duplication of monomers contributes to the maintenance of L1Md promoters during truncation-prone retrotransposition, but the associated mechanism remains unclear. Because the current classification of monomers is based on limited data, a comprehensive monomer annotation is needed to support genome-wide functional studies of L1Md promoters.
We developed a pipeline for monomer detection and classification. Identified monomers are further classified into subtypes based on their sequence profiles. We applied this pipeline to genome assemblies of various rodent species. A major monomer subtype of the lab mouse was also found in other species, implying that this subtype emerged in the common ancestor of the species involved. We also characterized the positioning pattern of monomer subtypes within individual promoters. Our analyses indicate that the subtype composition of an L1Md promoter can be used to infer its transcriptional activity during male germ cell development.
We identified subtypes for all monomer types using comprehensive data, greatly expanding the spectrum of monomer variants. The analysis of monomer subtype positioning provides evidence supporting both previously proposed models of L1Md promoter expansion. Transcriptional silencing of L1Md promoters differs between promoter types, which supports a model involving distinct suppressive pathways rather than a universal mechanism for retrotransposon repression in gametogenesis.