The nature and genomic landscape of repetitive DNA classes in Chrysanthemum nankingense shows recent genomic changes.

Affiliations

Key Laboratory of Landscaping, Ministry of Agriculture and Rural Affairs, Key Laboratory of Biology of Ornamental Plants in East China, National Forestry and Grassland Administration, College of Horticulture, Nanjing Agricultural University, Nanjing 210095, China.

Institute of Botany, Jiangsu Province and Chinese Academy of Sciences (Nanjing Botanical Garden Mem. Sun Yat-Sen), Nanjing, 210014, China.

Publication information

Ann Bot. 2023 Feb 7;131(1):215-228. doi: 10.1093/aob/mcac066.

Abstract

BACKGROUND AND AIMS

Tandemly repeated DNA and transposable elements represent most of the DNA in higher plant genomes. High-throughput sequencing allows a survey of the DNA in a genome, but whole-genome assembly can miss a substantial fraction of highly repeated sequence motifs. Chrysanthemum nankingense (2n = 2x = 18; genome size = 3.07 Gb; Asteraceae), a diploid reference for the many auto- and allopolyploids in the genus, is considered an ancestral species and serves as an ornamental plant and high-value food. We aimed to characterize the major repetitive DNA motifs, understand their structure and identify key features that are shaped by genome and sequence evolution.

METHODS

Graph-based clustering with RepeatExplorer was used to identify and classify repetitive motifs in 2.14 million 250-bp paired-end Illumina reads from total genomic DNA of C. nankingense. Independently, the frequency of all canonical motifs k bases long was counted in the raw read data, and abundant k-mers (k = 16, 21, 32, 64 and 128) were extracted and assembled to generate longer contigs for repetitive motif identification. For comparison, long terminal repeat retrotransposons were checked in the published C. nankingense reference genome. Fluorescent in situ hybridization was performed to show the chromosomal distribution of the main types of repetitive motifs.
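The k-mer counting step described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: it assumes that "canonical" means a k-mer and its reverse complement are counted together under the lexicographically smaller of the two, and the function names and the `min_count` threshold are hypothetical choices for illustration.

```python
from collections import Counter

# Translation table for complementing A/C/G/T (assumes upper-case DNA).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_canonical_kmers(reads, k: int) -> Counter:
    """Count canonical k-mers over an iterable of read sequences.

    Windows containing anything other than A, C, G or T (for example N)
    are skipped.
    """
    counts = Counter()
    valid_bases = set("ACGT")
    for read in reads:
        seq = read.upper()
        for start in range(len(seq) - k + 1):
            kmer = seq[start:start + k]
            if set(kmer) <= valid_bases:
                counts[canonical(kmer)] += 1
    return counts


def abundant_kmers(counts: Counter, min_count: int) -> list:
    """Return canonical k-mers seen at least `min_count` times (hypothetical cutoff)."""
    return [kmer for kmer, n in counts.most_common() if n >= min_count]
```

For real read sets of millions of sequences and k up to 128, a dedicated k-mer counter would be far more memory-efficient than an in-memory `Counter`; the sketch only shows the counting logic on a toy scale, for example `count_canonical_kmers(["ACGTACGT"], 4)`.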

KEY RESULTS

Apart from rDNA (0.86 % of the total genome), a few microsatellites (0.16 %), and telomeric sequences, no highly abundant tandem repeats were identified. There were many transposable elements: 40 % of the genome had sequences with recognizable domains related to transposable elements. Long terminal repeat retrotransposons showed widespread distribution over chromosomes, although different sequence families had characteristic features such as abundance at or exclusion from centromeric or subtelomeric regions. Another group of very abundant repetitive motifs, including those mostly identified as low-complexity sequences (9.07 % of the genome), showed no similarity to known sequence motifs or tandemly repeated elements.

CONCLUSIONS

The Chrysanthemum genome has an unusual structure with a very low proportion of tandemly repeated sequences (~1.02 %) and a high proportion of low-complexity sequences, most likely degenerated remains of transposable elements. Identifying the presence, nature and genomic organization of major genome fractions enables inference of the evolutionary history of sequences, including degeneration and loss, critical to understanding biodiversity and diversification processes in the genomes of diploid and polyploid Chrysanthemum, Asteraceae and plants more widely.
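The ~1.02 % tandem-repeat figure matches the sum of the rDNA and microsatellite fractions reported in the results (telomeric sequences are not quantified in the abstract). The short Python sketch below reproduces that sum and converts the reported percentages to approximate megabases. It assumes that all percentages refer to the 3.07 Gb genome size and that 1 Gb = 1000 Mb; the variable and function names are illustrative only.

```python
# Genome size from the abstract: 3.07 Gb, taken here as 3070 Mb (assumption: 1 Gb = 1000 Mb).
GENOME_SIZE_MB = 3070.0

# Genome fractions (percent) as reported in the abstract.
FRACTIONS_PERCENT = {
    "rDNA": 0.86,
    "microsatellites": 0.16,
    "transposable-element-related": 40.0,
    "low-complexity": 9.07,
}


def percent_to_mb(percent: float, genome_mb: float = GENOME_SIZE_MB) -> float:
    """Convert a genome percentage to megabases for the assumed genome size."""
    return genome_mb * percent / 100.0


# Quantified tandem repeats: rDNA plus microsatellites.
tandem_percent = FRACTIONS_PERCENT["rDNA"] + FRACTIONS_PERCENT["microsatellites"]

if __name__ == "__main__":
    print(f"tandem repeats: {tandem_percent:.2f} % "
          f"(about {percent_to_mb(tandem_percent):.1f} Mb)")
    for name, percent in FRACTIONS_PERCENT.items():
        print(f"{name}: {percent:.2f} % (about {percent_to_mb(percent):.1f} Mb)")
```

These conversions are back-of-the-envelope estimates derived from the abstract's rounded percentages, not values reported by the authors.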

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1bc3/9904347/621ad5e197c8/mcac066_fig1.jpg
