Ezoe Akihiro, Iuchi Satoshi, Sakurai Tetsuya, Aso Yukie, Tokunaga Hiroki, Vu Anh Thu, Utsumi Yoshinori, Takahashi Satoshi, Tanaka Maho, Ishida Junko, Ishitani Manabu, Seki Motoaki
Plant Genomic Network Research Team, RIKEN Center for Sustainable Resource Science, Yokohama, Kanagawa, 230-0045, Japan.
Experimental Plant Division, RIKEN BioResource Research Center, Tsukuba, Ibaraki, 305-0074, Japan.
Plant Mol Biol. 2023 May;112(1-2):33-45. doi: 10.1007/s11103-023-01346-4. Epub 2023 Apr 4.
The primary transcript structure provides critical insights into protein diversity, transcriptional modification, and functions. Cassava transcript structures are highly diverse because of alternative splicing (AS) events and high heterozygosity. To precisely determine and characterize transcript structures, fully sequencing cloned transcripts is the most reliable method. However, cassava annotations were mainly determined according to fragmentation-based sequencing analyses (e.g., EST and short-read RNA-seq). In this study, we sequenced the cassava full-length cDNA (FLcDNA) library, which included rare transcripts. We obtained 8,628 non-redundant fully sequenced transcripts and detected 615 unannotated AS events and 421 unannotated loci. The different protein sequences resulting from the unannotated AS events tended to have diverse functional domains, implying that unannotated AS contributes to the truncation of functional domains. The unannotated loci tended to be derived from orphan genes, implying that the loci may be associated with cassava-specific traits. Unexpectedly, individual cassava transcripts were more likely to have multiple AS events than Arabidopsis transcripts, suggesting regulated interactions between cassava splicing-related complexes. We also observed that the unannotated loci and/or AS events were commonly in regions with abundant single nucleotide variations, insertions-deletions, and heterozygous sequences. These findings reflect the utility of completely sequenced FLcDNA clones for overcoming cassava-specific annotation-related problems to elucidate transcript structures. Our work provides researchers with transcript structural details that are useful for annotating highly diverse and unique transcripts and AS events.