Ma Xuelian, Zhao Hansheng, Yan Hengyu, Sheng Minghao, Cao Yaxin, Yang Kebin, Xu Hao, Xu Wenying, Gao Zhimin, Su Zhen
State Key Laboratory of Plant Physiology and Biochemistry, College of Biological Sciences, China Agricultural University, Beijing 100193, China.
Key Laboratory of National Forestry and Grassland Administration/Beijing for Bamboo & Rattan Science and Technology, Institute of Gene Science and Industrialization for Bamboo and Rattan Resources, International Center for Bamboo and Rattan, Beijing 100102, China.
Comput Struct Biotechnol J. 2021 Apr 30;19:2708-2718. doi: 10.1016/j.csbj.2021.04.068. eCollection 2021.
Bamboo, one of the most important non-timber forest resources worldwide, has the capacity for rapid growth. In recent years, the genome of moso bamboo (Phyllostachys edulis) has been decoded, and a large amount of transcriptome data has been published. In this study, we generated genome-wide profiles of the histone modification H3K4me3 in leaf, stem, and root tissues of bamboo. The H3K4me3 distribution patterns followed trends similar to those in rice. Using the H3K4me3 ChIP-seq data and 29 RNA-seq datasets, we developed a processing pipeline that predicts novel transcripts in order to refine the structural annotation of the genome. As a result, 12,460 novel transcripts were predicted in the bamboo genome. Compared with the transcripts in the newly released version 2.0 of the bamboo genome, these novel transcripts are tissue-specific and shorter, and most have a single exon. Some representative novel transcripts were validated by semiquantitative RT-PCR and qRT-PCR analyses. Furthermore, when we fed these novel transcripts back into the ChIP-seq analysis pipeline, the percentages of H3K4me3 in genic elements increased. Overall, this work integrated transcriptomic and epigenomic data to refine the genome annotation, with the aim of discovering more functional genes and studying bamboo growth and development. Applying this prediction pipeline may also help refine the structural annotation of genomes in other species.