Construction of a comprehensive library of repeated sequences for the annotation of Citrus genomes.

Author information

Giraud Delphine, Choisne Nathalie, Summo Marilyne, Sidibe-Bocs Stéphanie, Vassilieff Héléna, Costantino Gilles, Droc Gaetan, Teycheney Pierre-Yves, Maumus Florian, Ollitrault Patrick, Luro François

Affiliations

UR AGAP Corse, INRAE, Institut Agro, CIRAD, University of Montpellier, San Giuliano, F-20230, France.

URGI, INRAE, Université Paris-Saclay, Versailles, F-78026, France.

Publication information

BMC Genom Data. 2025 Apr 18;26(1):30. doi: 10.1186/s12863-025-01321-6.

Abstract

BACKGROUND

The comprehensive annotation of repeated sequences in genomes is an essential prerequisite for studying the dynamics of these sequences over time and their involvement in gene regulation. Currently, the diversity of repeated sequences in Citrus genomes is only partially characterized because annotations have been performed with heterogeneous bioinformatics tools, each with its own specificities and each dedicated solely to the annotation of transposable elements.

RESULTS

We combined complementary repeat-finding programs, including REPET, CAULIFINDER, and TAREAN, to identify all types of repetitive sequences found in plant genomes, including transposable elements, endogenous caulimovirids, and satellite DNAs. A fine-grained annotation method was first developed to create a consensus sequence library of repeated sequences identified in the genome assemblies of C. medica, C. micrantha, C. reticulata, and C. maxima, the four ancestral parental species involved in the formation of economically valuable cultivated Citrus varieties. A second, faster annotation method was then developed to enrich the dataset with new repeated sequences retrieved from genome assemblies of other Citrus species and of closely related species belonging to the Aurantioideae subfamily. The final reference library contains 3,091 consensus sequences, of which 94.5% are transposable elements. The diversity of endogenous caulimovirids was characterized for the first time within the genus Citrus, contributing 160 consensus sequences to the final reference library. Finally, 10 satellite DNAs were identified.
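As a quick consistency check on the figures above, the following minimal sketch recomputes the transposable-element share from the stated counts. It assumes that the three repeat categories (transposable elements, endogenous caulimovirids, satellite DNAs) are disjoint and together make up the whole library, which the abstract implies but does not state outright; the variable and function names are illustrative only.

```python
# Counts stated in the abstract.
total_consensus = 3091
caulimovirid_consensus = 160
satellite_consensus = 10

# Assumption (not stated explicitly in the abstract): the three categories
# are disjoint and exhaustive, so the remainder is transposable elements.
te_consensus = total_consensus - caulimovirid_consensus - satellite_consensus


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


te_share = share(te_consensus, total_consensus)
caulimovirid_share = share(caulimovirid_consensus, total_consensus)
satellite_share = share(satellite_consensus, total_consensus)

print(f"Transposable elements: {te_consensus} ({te_share}%)")
print(f"Endogenous caulimovirids: {caulimovirid_consensus} ({caulimovirid_share}%)")
print(f"Satellite DNAs: {satellite_consensus} ({satellite_share}%)")
```

Under that assumption, the remainder works out to 2,921 transposable-element consensus sequences, which matches the reported 94.5%.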

CONCLUSION

Combining multiple repeat detection methods enables the comprehensive annotation of all repeated sequences in Citrus genomes. Using the final reference library reported in this work will improve our understanding of the dynamics of repeated sequences during Citrus speciation, particularly following the genome duplication and hybridization events that led to modern cultivars. Exploring the positions of repeat insertions along chromosomes with the web interface developed for this purpose, RepeatLoc Citrus, will also make it possible to further investigate the role of transposable elements and endogenous caulimovirids in genome structure and gene regulation in Citrus species.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bb8c/12007355/ecd29d8c1902/12863_2025_1321_Fig1_HTML.jpg
