Forestry and Forest Products Research Institute, Forest Research and Management Organization, Ibaraki, Japan.
Forest Research Institute, Toyama Prefectural Agricultural Forestry and Fisheries Research Center, Toyama, Japan.
PLoS One. 2021 Feb 25;16(2):e0247180. doi: 10.1371/journal.pone.0247180. eCollection 2021.
Sugi (Cryptomeria japonica D. Don) is an important conifer used for afforestation in Japan. As the genome of this species is 11 Gbp, it is too large to assemble within a short timeframe. Transcriptomics is one approach that can address this deficiency. Here we designed a three-stage workflow to assemble the transcriptome de novo using Oases and Trinity. The three stages were independent assembly, automatic and semi-manual integration, and refinement by filtering out potential contamination. We identified a set of 49,795 cDNAs and an equal number of translated proteins. According to the benchmark set by BUSCO, 87.01% of the cDNAs identified were complete genes, and 78.47% were complete and single-copy genes. Compared to other full-length cDNA resources collected by Sanger and PacBio sequencers, the extent of coverage in our dataset was the highest, indicating that these data can be safely used for further studies. When two tissue-specific libraries were compared, there were significant expression differences between the male strobili set and the leaf and bark set. Moreover, subtle expression differences between male-fertile and male-sterile libraries were detected. Orthologous genes from other model plants and conifer species were identified. We demonstrated that our transcriptome assembly output (CJ3006NRE) can serve as a reference transcriptome for future functional genomics and evolutionary biology studies.
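The two BUSCO figures quoted in the abstract imply the remaining categories, because BUSCO reports complete genes as the sum of single-copy and duplicated genes. A minimal sketch of that arithmetic follows; the function name `busco_breakdown` and the dictionary keys are illustrative only, not part of BUSCO or of the authors' workflow, and the split between fragmented and missing genes cannot be recovered from the abstract.

```python
def busco_breakdown(complete_pct, single_copy_pct):
    """Derive the implied BUSCO categories from two reported percentages.

    Assumes the standard BUSCO relation: complete = single-copy + duplicated.
    """
    if not (0 <= single_copy_pct <= complete_pct <= 100):
        raise ValueError("expected 0 <= single-copy <= complete <= 100")
    return {
        "complete": complete_pct,
        "single_copy": single_copy_pct,
        # Complete genes found more than once in the assembly.
        "duplicated": round(complete_pct - single_copy_pct, 2),
        # Fragmented and missing genes combined; the abstract does not
        # report them separately.
        "fragmented_or_missing": round(100 - complete_pct, 2),
    }


# Values taken from the abstract: 87.01% complete, 78.47% complete and single-copy.
breakdown = busco_breakdown(87.01, 78.47)
for category, pct in breakdown.items():
    print(f"{category}: {pct}%")
```

Under that relation, 8.54% of the BUSCO genes are complete but duplicated, and 12.99% are either fragmented or missing.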