Hassan Nozhat T, Van Treeck Briana, Rodríguez-Vargas Anthony, Sheppard Anna E, Collins Kathleen, Adelson David L
School of Biological Sciences, University of Adelaide, Adelaide, Australia.
Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA, USA.
bioRxiv. 2025 May 9:2025.05.05.652312. doi: 10.1101/2025.05.05.652312.
Retrotransposons play outsized roles in the evolution of gene regulation, genome function, and disease pathogenesis and, more recently, have sparked interest as workhorses for new gene therapy approaches. R2 retrotransposons insert site-specifically into the multicopy genes encoding 28S ribosomal RNA at a target sequence conserved broadly across eukaryotic evolution. R2 retrotransposons have been detected in many animals, but previous surveys have been limited in scope and methodology. Here, we substantially expand the known distribution of R2 retrotransposons to include previously unrepresented or underrepresented taxonomic groups, ranging from ctenophores to amphibians and reptiles. We discover diverse R2 domain architectures and motifs and identify many new avian R2 candidates for genome engineering development. Overall, phylogenetic analyses reveal two highly successful R2 lineages. We observe properties of each lineage in several features of the domains that mediate DNA recognition and in co-evolving signatures within the reverse transcriptase domain. Within each of the two lineages, R2 protein sequences do not necessarily preserve the unifying configuration of N-terminal DNA-binding domains implied by the current clade classification scheme. We show that recombinant R2 proteins with distinctive domain architectures and distribution across major animal classes support target-primed reverse transcription with conserved site specificity. Our analysis of the surprisingly varied domain architectures that support target-site specificity informs new R2 classification criteria and provides a greatly expanded foundation for additional structure/function insights into DNA-binding selectivity. This expanded perspective on R2 evolution informs approaches for engineering therapeutic gene insertion technologies and offers an opportunity to investigate the conservation and diversification of retrotransposons.