Evangelistella Chiara, Valentini Alessio, Ludovisi Riccardo, Firrincieli Andrea, Fabbrini Francesco, Scalabrin Simone, Cattonaro Federica, Morgante Michele, Mugnozza Giuseppe Scarascia, Keurentjes Joost J B, Harfouche Antoine
Department for Innovation in Biological, Agro-food and Forest Systems, University of Tuscia, Via S. Camillo de Lellis snc, 01100 Viterbo, Italy.
Alasia Franco Vivai s.s., Strada Solerette, 5/A, 12038 Savigliano, Italy.
Biotechnol Biofuels. 2017 May 30;10:138. doi: 10.1186/s13068-017-0828-7. eCollection 2017.
Arundo donax (giant reed) has attracted renewed interest as a candidate energy crop for biomass-to-liquid fuel conversion processes and biorefineries, owing to its high productivity, adaptability to marginal land conditions, and suitability for biofuel and biomaterial production. Despite its importance, the genomic resources currently available to support the improvement of this species are still limited.
We used RNA sequencing (RNA-Seq) to de novo assemble and characterize the Arundo donax leaf transcriptome. The sequencing generated 1249 million clean reads, which were assembled using single-k-mer and multi-k-mer approaches into 62,596 unique sequences (unitranscripts) with an N50 of 1134 bp. The TransDecoder and Trinotate software suites were used to obtain putative coding sequences and to annotate them by mapping to the UniProtKB/Swiss-Prot and UniRef90 databases and by searching for known transcripts, proteins, protein domains, and signal peptides. The unitranscripts were further annotated by mapping them to the NCBI non-redundant, GO, and KEGG pathway databases using Blast2GO. The transcriptome was also characterized by BLAST searches for homologous transcripts of key genes involved in important metabolic pathways, such as lignin, cellulose, purine, and thiamine biosynthesis and carbon fixation. In addition, homologous transcripts of key genes involved in stomatal development and of genes coding for stress-associated proteins (SAPs) were identified. A total of 8364 simple sequence repeat (SSR) markers were identified and surveyed. SSRs were more abundant in non-coding regions (63.18%) than in coding regions (36.82%). This SSR dataset represents the first marker catalogue of A. donax. Fifty-three SSRs (PolySSRs) were then predicted to be polymorphic between ecotype-specific assemblies, suggesting genetic variability in the studied ecotypes.
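For readers unfamiliar with the two summary statistics reported above, the following is a minimal Python sketch of how an assembly N50 and a simple SSR scan are commonly computed. It is illustrative only and is not the pipeline used in this study: the function names `n50` and `find_ssrs` and the repeat thresholds in `MIN_REPEATS` are hypothetical choices made for this example.

```python
import re


def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Hypothetical minimum repeat counts per motif length (1-6 bp).
# These are assumptions for illustration, not the study's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "ATAT" is reported as the dinucleotide "AT" instead.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)
```

Applied to a list of unitranscript lengths, `n50` returns a single length in base pairs. Applied to a transcript sequence, `find_ssrs` returns zero-based start positions, motifs, and repeat counts, which could then be compared against predicted coding-sequence coordinates to classify each SSR as coding or non-coding.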
This study provides the first publicly available leaf transcriptome for the bioenergy crop Arundo donax. The functional annotation and characterization of the transcriptome will help provide insight into the molecular mechanisms underlying its extreme adaptability. The identification of homologous transcripts involved in key metabolic pathways offers a platform for directing future efforts in the genetic improvement of this species. Finally, the identified SSRs will facilitate the harnessing of untapped genetic diversity. This transcriptome should be of value to ongoing functional genomics and genetic studies in this crop of paramount economic importance.