Efficient prediction of alternative splice forms using protein domain homology.

Authors

Hiller Michael, Backofen Rolf, Heymann Stephan, Busch Anke, Glaesser Timo Mika, Freytag Johann-Christoph

Affiliation

Friedrich-Schiller-Universitaet Jena, Institute of Computer Science, Chair for Bioinformatics, Ernst-Abbe-Platz 1-4, D-07743 Jena, Germany.

Publication

In Silico Biol. 2004;4(2):195-208.

Abstract

Alternative splicing can yield many different mature mRNAs from a single precursor. Recent findings indicate that alternative splicing occurs much more often than previously assumed. A major goal of functional genomics is to elucidate and characterize the entire spectrum of alternative splice forms. Existing approaches such as EST alignments consider only the mRNA sequence when detecting alternative splice forms; they do not take into account the function and characteristics of the resulting proteins. One important example of such a functional characterization is homology to a known protein domain family. Profile hidden Markov models (HMMs), as stored in the Pfam database, provide a powerful description of protein domains. In this paper we address the problem of identifying the splice form with the highest similarity to a protein domain family, taking all possible splice forms into consideration. As demonstrated here for a number of genes, this homology-based approach can be used successfully to predict partial gene structures. Furthermore, we present some novel splice form predictions with high-scoring protein domain homology and point out that detecting splice-form-specific protein domains helps to answer questions concerning hereditary diseases. Simple approaches based on a BLASTP search cannot be applied here, since the number of possible splice forms increases exponentially with the number of exons. We have therefore developed an efficient polynomial-time algorithm called ASFPred (Alternative Splice Form Prediction), which requires only a set of exons as input.
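
The contrast between exponential enumeration and a polynomial-time search can be illustrated with a small sketch. It is not ASFPred itself, since the abstract does not describe how splice forms are scored against Pfam profile HMMs. It assumes, for illustration only, that every non-empty ordered subset of the exons counts as a candidate splice form (ignoring reading frame and splice site constraints) and that a splice form's score is a sum of per-exon scores and per-junction scores. The names `count_splice_forms`, `best_chain`, `exon_scores` and `junction_score` are hypothetical.

```python
from itertools import combinations


def count_splice_forms(n_exons):
    """Count candidate splice forms by brute-force enumeration.

    Assumption: any non-empty subset of exons, kept in genomic order,
    is a candidate. The count is 2**n_exons - 1.
    """
    return sum(
        1
        for k in range(1, n_exons + 1)
        for _ in combinations(range(n_exons), k)
    )


def best_chain(exon_scores, junction_score):
    """Find the best-scoring ordered exon subset in O(n^2) time.

    exon_scores[j] is a hypothetical score for including exon j;
    junction_score(i, j) is a hypothetical score for joining exon i
    directly to a later exon j. Both are stand-ins, not HMM scores.
    """
    n = len(exon_scores)
    best = list(exon_scores)  # best[j]: best score of a chain ending at exon j
    prev = [None] * n         # predecessor of exon j in that best chain
    for j in range(n):
        for i in range(j):
            candidate = best[i] + junction_score(i, j) + exon_scores[j]
            if candidate > best[j]:
                best[j], prev[j] = candidate, i
    # Pick the best end point, then walk the predecessor links back.
    end = max(range(n), key=best.__getitem__)
    chain, j = [], end
    while j is not None:
        chain.append(j)
        j = prev[j]
    return best[end], chain[::-1]
```

With 10 exons the enumeration already visits 1023 candidates and the number doubles with every further exon, whereas the dynamic program examines each ordered exon pair once. The additive scoring is what makes the dynamic program valid here; a real profile HMM score would need a more involved recurrence.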
