An improved system for exon recognition and gene modeling in human DNA sequences.

Authors

Xu Y, Einstein J R, Mural R J, Shah M, Uberbacher E C

Affiliation

Engineering Physics and Mathematics Division, Oak Ridge National Laboratory, TN 37831-6364, USA.

Publication

Proc Int Conf Intell Syst Mol Biol. 1994;2:376-84.

PMID: 7584416

Abstract

A new version of the GRAIL system (Uberbacher and Mural, 1991; Mural et al., 1992; Uberbacher et al., 1993), called GRAIL II, has recently been developed (Xu et al., 1994). GRAIL II is a hybrid AI system that supports a number of DNA sequence analysis tools, including protein-coding region recognition, PolyA site and transcription promoter recognition, gene model construction, translation to protein, and DNA/protein database searching capabilities. This paper presents the core of GRAIL II: the coding exon recognition and gene model construction algorithms. The exon recognition algorithm recognizes coding exons by combining coding feature analysis and edge signal (acceptor/donor/translation-start sites) detection. Unlike the original GRAIL system (Uberbacher and Mural, 1991; Mural et al., 1992), this algorithm uses variable-length windows tailored to each potential exon candidate, making its performance almost independent of exon length. In this algorithm, the recognition process is divided into four steps. Initially, a large number of possible coding exon candidates are generated. Then a rule-based prescreening algorithm eliminates the majority of the improbable candidates. As the kernel of the recognition algorithm, three neural networks are trained to evaluate the remaining candidates. The outputs of the neural networks are then divided into clusters of candidates, corresponding to presumed exons. The algorithm makes its final prediction by picking the best candidate from each cluster. The gene construction algorithm (Xu, Mural and Uberbacher, 1994) uses a dynamic programming approach to build gene models, taking as input the clusters predicted by the exon recognition algorithm. Extensive testing has been done on these two algorithms. (ABSTRACT TRUNCATED AT 250 WORDS)
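The last stages described in the abstract (clustering scored candidates, picking the best candidate per cluster, and assembling a gene model by dynamic programming) can be illustrated with a minimal sketch. This is not the GRAIL II implementation. The `Candidate` type, the coordinates, and the scores are hypothetical. The clustering rule (simple interval overlap) and the dynamic program (classic weighted interval scheduling, which ignores reading-frame and splice-site compatibility) are simplifying assumptions chosen only to show the general shape of the approach.

```python
from bisect import bisect_right
from typing import NamedTuple


class Candidate(NamedTuple):
    """Hypothetical exon candidate; not a GRAIL II data structure."""
    start: int    # inclusive start coordinate
    end: int      # inclusive end coordinate
    score: float  # stand-in for a neural-network output


def cluster_candidates(cands):
    """Group candidates into clusters of mutually chained overlaps.

    Assumption: two candidates belong to the same cluster if their
    intervals overlap, directly or through other candidates.
    """
    clusters = []
    cluster_end = None
    for c in sorted(cands, key=lambda c: c.start):
        if clusters and c.start <= cluster_end:
            clusters[-1].append(c)
            cluster_end = max(cluster_end, c.end)
        else:
            clusters.append([c])
            cluster_end = c.end
    return clusters


def best_per_cluster(clusters):
    """Pick the highest-scoring candidate from each cluster."""
    return [max(cluster, key=lambda c: c.score) for cluster in clusters]


def best_chain(cands):
    """Dynamic programming over candidates sorted by end coordinate.

    Returns (total_score, chain), where chain is a set of pairwise
    non-overlapping candidates with maximal total score.
    """
    cands = sorted(cands, key=lambda c: c.end)
    ends = [c.end for c in cands]
    best = [(0.0, [])]  # best[i]: optimum over the first i candidates
    for i, c in enumerate(cands):
        # j = number of earlier candidates ending strictly before c.start
        j = bisect_right(ends, c.start - 1, 0, i)
        take_score = best[j][0] + c.score
        if take_score > best[i][0]:
            best.append((take_score, best[j][1] + [c]))
        else:
            best.append(best[i])
    return best[-1]


if __name__ == "__main__":
    demo = [
        Candidate(100, 200, 0.9),
        Candidate(150, 220, 0.6),
        Candidate(400, 500, 0.8),
        Candidate(480, 560, 0.85),
        Candidate(700, 750, 0.4),
    ]
    print(best_per_cluster(cluster_candidates(demo)))
    print(best_chain(demo))
```

In this toy setting the per-cluster picks and the dynamic-programming chain happen to agree. A gene model builder along the lines the abstract describes would presumably add constraints between adjacent exons, which is where a dynamic program over whole clusters differs from taking each cluster's top candidate independently.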
