Laboratory of Evolutionary Genetics, Institute of Biology, University of Neuchâtel, Neuchâtel, Neuchâtel 2000, Switzerland.
Department of Plant Pathology, University of Wisconsin-Madison, Madison, WI 53706, USA.
Nucleic Acids Res. 2024 Jun 10;52(10):5496-5513. doi: 10.1093/nar/gkae327.
Cargo-mobilizing mobile elements (CMEs) are genetic entities that faithfully transpose diverse protein-coding sequences. Although CMEs are common in bacteria, we know little about eukaryotic CMEs because no appropriate tools exist for their annotation. For example, Starships are giant fungal CMEs whose functions are largely unknown because they require time-intensive manual curation. To address this knowledge gap, we developed starfish, a computational workflow for high-throughput eukaryotic CME annotation. We applied starfish to 2,899 genomes of 1,649 fungal species and found that starfish recovers known Starships with 95% combined precision and recall while expanding the number of annotated elements ten-fold. Extant Starship diversity is partitioned into 11 families that differ in their enrichment patterns across fungal classes. Starship cargo changes rapidly such that elements from the same family differ substantially in their functional repertoires, which are predicted to contribute to diverse biological processes such as metabolism. Many elements have convergently evolved to insert into 5S rDNA and AT-rich sequences while others integrate into random locations, revealing both specialist and generalist strategies for persistence. Our work establishes a framework for advancing mobile element biology and provides the means to investigate an emerging dimension of eukaryotic genetic diversity, that of genomes within genomes.