Hao Da-Cheng, Chen Hao, Xiao Pei-Gen, Jiang Tao
Biotechnology Institute, School of Environment and Chemical Engineering, Dalian Jiaotong University, Dalian 116028, China.
Institute of Molecular Plant Sciences, University of Edinburgh, Edinburgh EH9 3BF, UK.
Curr Genomics. 2022 Jul 5;23(3):207-216. doi: 10.2174/1389202923666220527112929.
In plants, multiple isoforms are often generated from a single gene through alternative splicing (AS), which significantly increases the functional diversity of the plant genome. Although gene functions are well studied, the specific functions of individual isoforms are largely unknown, so accurate prediction of isoform functions is urgently needed. Here we perform the first global analysis of AS in a medicinal genus of Ranunculales, using full-length transcriptome datasets of five taxa endemic to China. Multiple software tools were used to identify AS events, gene function was annotated against seven databases, and the protein-coding sequence of each AS isoform was translated into an amino acid sequence. The self-developed software DIFFUSE was used to predict the functions of AS isoforms. Among 8,485 genes with AS events, those with two isoforms were the most numerous (6,038), followed by those with three and four isoforms. Retained intron (RI, 551) was the predominant type among 1,037 AS events, followed by alternative 3' splice sites and alternative 5' splice sites. DIFFUSE was effective in predicting the functions of isoforms that had not previously been characterized. Compared with sequence alignment-based database annotations, DIFFUSE performed better in differentiating isoform functions. The DIFFUSE predictions for the terms GO:0003677 (DNA binding) and GO:0010333 (terpene synthase activity) agreed with the biological features of the transcript isoforms. Numerous AS events were identified for the first time from full-length transcriptome datasets of five taxa, and the functions of AS isoforms were successfully predicted by DIFFUSE. Global analysis of AS events and prediction of isoform functions can help in understanding the metabolic regulation of medicinal taxa and in their pharmaceutical exploration.