The Key Laboratory of Biology and Genetic Improvement of Oil Crops, the Ministry of Agriculture and Rural Affairs of the PRC, Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Wuhan, China.
Nextomics Biosciences, Wuhan, 430000, Hubei, China.
Plant J. 2020 Jul;103(2):843-857. doi: 10.1111/tpj.14754. Epub 2020 May 2.
Brassica napus is a recent allopolyploid derived from the hybridization of Brassica rapa (AA) and Brassica oleracea (CC). Because of the high sequence similarity between the A and C subgenomes, it is difficult to provide an accurate landscape of the whole transcriptome of B. napus. To overcome this problem, we applied a single-molecule long-read isoform sequencing (Iso-Seq) technique that can produce long reads to explore the complex transcriptome of B. napus at the isoform level. From the Iso-Seq data, we obtained 147 698 non-redundant isoforms, capturing 37 403 annotated genes. A total of 18.1% (14 934/82 367) of the multi-exonic genes showed alternative splicing (AS). In addition, we identified 549 long non-coding RNAs, the majority of which displayed tissue-specific expression profiles, and detected 7742 annotated genes that possessed isoforms containing alternative polyadenylation sites. Moreover, 31 591 AS events located in open reading frames (ORFs) lead to potential protein isoforms by in-frame or frameshift changes in the ORF. Illumina RNA sequencing of five tissues that were pooled for Iso-Seq was also performed and showed that 69% of the AS events were tissue-specific. Our data provide abundant transcriptome resources for a transcript isoform catalog of B. napus, which will facilitate genome reannotation, strengthen our understanding of the B. napus transcriptome and be applied for further functional genomic research.