Sternberger Anne L, Bowman Megan J, Kruse Colin P S, Childs Kevin L, Ballard Harvey E, Wyatt Sarah E
Department of Environmental and Plant Biology, Ohio University, Athens, OH, United States.
Department of Plant Biology, Michigan State University, East Lansing, MI, United States.
Front Plant Sci. 2019 Feb 15;10:156. doi: 10.3389/fpls.2019.00156. eCollection 2019.
Viola is a large genus with worldwide distribution and many traits not currently exemplified in model plants, including unique breeding systems and the production of cyclotides. Here we report genome assembly and transcriptomic analyses of the non-model species Viola pubescens using short-read DNA sequencing data and RNA-Seq from eight diverse tissues. First, genome size was estimated by flow cytometry, giving an approximate haploid genome size of 455 Mbp. Next, the draft genome was sequenced and assembled: sequencing produced 264,035,065 read pairs, which were assembled into 161,038 contigs with an N50 length of 3,455 base pairs (bp). RNA-Seq data were then assembled into tissue-specific transcripts. Together, the DNA and transcript data generated 38,081 gene models, which were functionally annotated based on homology to Arabidopsis thaliana genes and Pfam domains. Gene expression was visualized for each tissue via principal component analysis and hierarchical clustering, and gene co-expression analysis identified 20 modules of tissue-specific transcriptional networks. Some of these modules highlight genetic differences between chasmogamous and cleistogamous flowers and may provide insight into the mixed breeding system of V. pubescens. Orthologous clustering with the proteomes of A. thaliana and Populus trichocarpa revealed 8,531 sequences unique to V. pubescens, including 81 novel cyclotide precursor sequences. Cyclotides are plant peptides characterized by a stable, cyclic cystine knot motif, making them strong candidates for drug scaffolding and protein engineering. Analysis of the RNA-Seq data for these cyclotide transcripts revealed diverse expression patterns across both transcripts and tissues. The diversity of these cyclotides was also highlighted in a maximum likelihood protein cladogram containing V. pubescens cyclotides and published cyclotide sequences from other Violaceae and Rubiaceae species.
Collectively, this work provides the most comprehensive sequence resource for Viola, offers valuable transcriptomic insight into Viola pubescens, and will facilitate future functional genomics research in Viola and other diverse plant groups.
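The N50 contig length quoted above is a standard assembly statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy contig lengths are hypothetical and used only for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; the N50 is the length of the contig at which the running
    total first reaches at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the study): total length is 54,
# so the N50 is the contig at which the running total first reaches 27.
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))
```

Applied to the lengths of all 161,038 contigs in an assembly like the one described, the same function would return the reported N50, provided the same definition was used.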