Gu Lianfeng, Guo Rongfa
College of Agriculture, Guangdong Ocean University, Zhanjiang 524088, China.
J Genet Genomics. 2007 Mar;34(3):247-57. doi: 10.1016/S1673-8527(07)60026-5.
Alternative splicing is a major contributor to genomic complexity and proteome diversity, yet alternative splicing of sequences containing nucleotide-binding site and leucine-rich repeat (NBS-LRR) domains has not been explored in rice (Oryza sativa L.). Hidden Markov model (HMM) searches were performed for the NBS-LRR domain, and 875 NBS-LRR-encoding sequences were obtained from the Institute for Genomic Research (TIGR). All of them were used as BLAST queries against the Knowledge-based Oryza Molecular Biological Encyclopaedia (KOME), the TIGR rice gene index (TGI), and the Universal Protein Resource (UniProt) to obtain homologous full-length cDNAs (FL-cDNAs), tentative consensus sequences, and protein sequences. Alternative splicing events were detected from genomic alignments of these FL-cDNAs, tentative consensus sequences, and protein sequences, which provide valuable information on splice variants of genes. The FL-cDNAs and tentative consensus sequences were aligned to the corresponding BAC sequences using the Spidey and Sim4 programs, and each protein was aligned using tBLASTn. Of the 875 NBS-LRR sequences, 119 (13.6%) showed alternative splicing, in which multiple FL-cDNAs, TGI sequences, and proteins corresponded to the same gene. The analysis identified 71 intron retention events, 20 exon skipping events, 16 alternative termination events, 25 alternative initiation events, 12 alternative 5' splicing events, and 16 alternative 3' splicing events. Most of these alternative splicing events were supported by two or more transcripts. The data sets are available at http://www.bioinfor.org. Furthermore, bioinformatic analysis of splice boundaries showed that the splice sites involved in exon skipping and intron retention did not exhibit a strong consensus. This suggests that a different regulatory mechanism guides the expression of these splice isoforms. This article also analyzes the effects of intron retention on the encoded proteins. The C-terminal regions of the alternative proteins were more variable than the N-terminal regions.
Finally, the tissue distribution and protein localization of the alternatively spliced forms were explored. Shoot and callus were the largest tissue categories for alternative splicing. More than one-third of the splice-form proteins localized to the plasma membrane and cytoplasm. All the NBS-LRR proteins encoded by these splice forms may have important functions in disease resistance and may activate downstream signaling pathways.
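To make the event categories concrete, the following is a minimal sketch of how intron retention and exon skipping could be called once two transcripts of the same gene have been aligned to the genome. It is not the authors' pipeline. The function names, the exon coordinates, and the simple containment rules are illustrative assumptions. It covers only two of the six event types reported above and assumes both transcripts lie on the plus strand.

```python
from typing import List, Tuple

# An exon as (start, end) in 1-based inclusive genomic coordinates (assumed convention).
Exon = Tuple[int, int]


def introns(exons: List[Exon]) -> List[Exon]:
    """Return the introns implied by a position-sorted list of exons."""
    return [(a_end + 1, b_start - 1)
            for (_, a_end), (b_start, _) in zip(exons, exons[1:])]


def classify_events(ref: List[Exon], alt: List[Exon]) -> List[Tuple[str, Exon]]:
    """Compare an alternative transcript with a reference transcript.

    Hypothetical rules used here:
    - intron retention: a reference intron lies strictly inside one alt exon;
    - exon skipping: an internal reference exon lies inside one alt intron.
    """
    events = []
    for i_start, i_end in introns(ref):
        if any(e_start < i_start and i_end < e_end for e_start, e_end in alt):
            events.append(("intron_retention", (i_start, i_end)))
    alt_introns = introns(alt)
    for e_start, e_end in ref[1:-1]:  # first and last exons cannot be skipped
        if any(i_start <= e_start and e_end <= i_end
               for i_start, i_end in alt_introns):
            events.append(("exon_skipping", (e_start, e_end)))
    return events


# Made-up coordinates for one gene with three exons.
reference = [(100, 200), (300, 400), (500, 600)]
retained = [(100, 400), (500, 600)]   # first intron kept in the transcript
skipped = [(100, 200), (500, 600)]    # middle exon missing

print(classify_events(reference, retained))
print(classify_events(reference, skipped))
```

In practice the exon coordinates would come from the output of a cDNA-to-genome aligner, and alternative initiation, termination, and 5'/3' splice-site events would need additional rules that compare exon boundaries rather than containment.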