Loraine Ann E, Helt Gregg A, Cline Melissa S, Siani-Rose Michael A
Affymetrix, Inc., Emeryville, CA 94608, USA.
Proc IEEE Comput Soc Bioinform Conf. 2002;1:118-24.
Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein structure and function. To test this, a data mining technique ("DiffHit") was developed to identify and catalog genes producing protein isoforms which exhibit distinct profiles of conserved protein motifs. We found that out of a test set of over 1,300 alternatively spliced genes with solved genomic structure, over 30% exhibited a differential profile of conserved InterPro and/or Blocks protein motifs across distinct isoforms. These results suggest that motif databases such as Blocks and InterPro are potentially useful tools for investigating how alternative transcript structure affects gene function.
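The core comparison described in the abstract, flagging genes whose isoforms carry different sets of conserved motifs, can be illustrated with a short sketch. This is not the authors' DiffHit implementation, which the abstract does not detail. The function name `differential_motif_genes`, the input layout (gene to isoform to set of motif identifiers), and all gene, isoform, and motif names below are hypothetical placeholders. The sketch assumes motif hits per isoform have already been obtained from a search against a database such as InterPro or Blocks.

```python
from typing import Dict, Set


def differential_motif_genes(
    gene_isoform_motifs: Dict[str, Dict[str, Set[str]]],
) -> Dict[str, Set[str]]:
    """Return genes whose isoforms differ in motif content.

    For each gene with at least two isoforms, the result maps the gene to
    the motifs present in some isoforms but not in all of them.
    """
    hits: Dict[str, Set[str]] = {}
    for gene, isoforms in gene_isoform_motifs.items():
        motif_sets = [set(m) for m in isoforms.values()]
        if len(motif_sets) < 2:
            continue  # a single isoform cannot show a differential profile
        shared = motif_sets[0].intersection(*motif_sets[1:])
        present = motif_sets[0].union(*motif_sets[1:])
        differential = present - shared
        if differential:
            hits[gene] = differential
    return hits


# Hypothetical input: placeholder identifiers, not data from the study.
example = {
    "geneA": {
        "isoform1": {"motif_kinase", "motif_sh2"},
        "isoform2": {"motif_sh2"},  # lacks the kinase motif
    },
    "geneB": {
        "isoform1": {"motif_zinc_finger"},
        "isoform2": {"motif_zinc_finger"},  # identical profile
    },
}

hits = differential_motif_genes(example)
fraction = len(hits) / len(example)
print(hits, fraction)
```

Comparing set membership only detects presence or absence of a motif; it would not capture differences in motif copy number or position, which a fuller analysis might also consider.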