Paulownia Research and Development Center of State Administration of Forestry and Grassland, Zhengzhou, 450003, China.
Non-Timber Forestry Research and Development Center, Chinese Academy of Forestry, Zhengzhou, 450003, China.
Sci Rep. 2021 Apr 22;11(1):8734. doi: 10.1038/s41598-021-87538-8.
Paulownia catalpifolia is an important, fast-growing timber species known for its high density, color and texture. However, few transcriptomic and genetic studies have been conducted in P. catalpifolia. In this study, single-molecule real-time sequencing technology was applied to obtain the full-length transcriptome of P. catalpifolia leaves treated with varying degrees of drought stress. The sequencing data were then used to search for microsatellites, or simple sequence repeats (SSRs). A total of 28.83 Gb of data were generated; 25,969 high-quality (HQ) transcripts with an average length of 1624 bp were acquired after removing redundant reads, and 25,602 HQ transcripts (98.59%) were annotated using public databases. Among the HQ transcripts, 16,722 intact coding sequences, 149 long non-coding RNAs and 179 alternative splicing events were predicted. A total of 7367 SSR loci were distributed across 6293 HQ transcripts, of which 763 were complex SSRs and 6604 were complete SSRs. The SSR frequency of occurrence was 28.37%, and the average distribution distance was 5.59 kb. Among the 6604 complete SSR loci, 1-3 nucleotide repeats were dominant, accounting for 97.85% of the total, with mono-, di- and tri-nucleotide repeats making up 44.68%, 33.86% and 19.31%, respectively. We detected 112 repeat motifs, of which A/T (42.64%), AG/CT (12.22%), GA/TC (9.63%), GAA/TTC (1.57%) and CCA/TGG (1.54%) were the most common among the mono-, di- and tri-nucleotide repeats. SSR lengths ranged from 10 to 88 bp, and 4997 (75.67%) were ≤ 20 bp. This study provides a novel full-length transcriptome reference for P. catalpifolia and will facilitate the identification of germplasm resources and the breeding of new drought-resistant P. catalpifolia varieties.
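The percentages reported above can be cross-checked from the counts given in the abstract. The following is a minimal sketch rather than the authors' pipeline: the helper `pct` and all variable names are hypothetical, and it assumes that the SSR frequency is defined as SSR loci divided by HQ transcripts and that the ≤ 20 bp share is taken over the complete SSRs. The average distribution distance is left out because the total transcript length is not stated.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


# Counts taken directly from the abstract.
hq_transcripts = 25_969
annotated = 25_602
ssr_loci = 7_367
complex_ssrs = 763
complete_ssrs = 6_604
short_ssrs = 4_997  # complete SSRs with length <= 20 bp

# Assumed definitions (not spelled out in the abstract).
annotation_rate = pct(annotated, hq_transcripts)
ssr_frequency = pct(ssr_loci, hq_transcripts)
short_share = pct(short_ssrs, complete_ssrs)

# Mono-, di- and tri-nucleotide shares as reported, summed.
mono_to_tri_share = round(44.68 + 33.86 + 19.31, 2)

print(f"annotated HQ transcripts: {annotation_rate}%")
print(f"SSR frequency:            {ssr_frequency}%")
print(f"complete SSRs <= 20 bp:   {short_share}%")
print(f"1-3 nt repeats:           {mono_to_tri_share}%")
print(f"complex + complete SSRs:  {complex_ssrs + complete_ssrs}")
```

Under these assumptions the computed values agree with the reported 98.59%, 28.37%, 75.67% and 97.85%, and the complex and complete SSR counts sum to the 7367 loci stated.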