Comparative ab initio prediction of gene structures using pair HMMs.

Author information

Meyer Irmtraud M, Durbin Richard

Affiliation

The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Publication information

Bioinformatics. 2002 Oct;18(10):1309-18. doi: 10.1093/bioinformatics/18.10.1309.

Abstract

We present a novel comparative method for the ab initio prediction of protein coding genes in eukaryotic genomes. The method simultaneously predicts the gene structures of two un-annotated input DNA sequences which are homologous to each other and retrieves the subsequences which are conserved between the two DNA sequences. It is capable of predicting partial, complete and multiple genes and can align pairs of genes which differ by events of exon-fusion or exon-splitting. The method employs a probabilistic pair hidden Markov model. We generate annotations using our model with two different algorithms: the Viterbi algorithm in its linear memory implementation and a new heuristic algorithm, called the stepping stone, for which both memory and time requirements scale linearly with the sequence length. We have implemented the model in a computer program called DOUBLESCAN. In this article, we introduce the method and confirm the validity of the approach on a test set of 80 pairs of orthologous DNA sequences from mouse and human. More information can be found at: http://www.sanger.ac.uk/Software/analysis/doublescan/
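To illustrate what Viterbi decoding on a pair hidden Markov model looks like, the sketch below aligns two DNA strings with a toy three-state model: M emits an aligned pair of bases, X emits a base from the first sequence only, and Y emits a base from the second sequence only. This is not the DOUBLESCAN model, which has dedicated gene-structure states that the abstract does not list. The function name, the state set, and the parameters `delta`, `epsilon` and `p_same` are all assumptions made for illustration. The sketch also stores the full dynamic-programming table, so its memory use grows with the product of the two sequence lengths, unlike the linear-memory Viterbi implementation and the stepping stone heuristic described in the abstract.

```python
import math

NEG_INF = float("-inf")

# How many symbols each state consumes from (x, y).
STEPS = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}


def pair_hmm_viterbi(x, y, delta=0.1, epsilon=0.3, p_same=0.2):
    """Most probable state path of a toy 3-state pair HMM (illustrative only).

    delta   : probability of opening a gap from M (assumed value)
    epsilon : probability of extending a gap (assumed value)
    p_same  : emission probability of each identical base pair (assumed value)
    """
    p_diff = (1.0 - 4 * p_same) / 12  # 4 identical + 12 differing pairs sum to 1
    q = 0.25                          # uniform emission of a single base
    log = math.log
    trans = {
        ("M", "M"): log(1 - 2 * delta),
        ("M", "X"): log(delta),
        ("M", "Y"): log(delta),
        ("X", "X"): log(epsilon),
        ("X", "M"): log(1 - epsilon),
        ("Y", "Y"): log(epsilon),
        ("Y", "M"): log(1 - epsilon),
    }
    n, m = len(x), len(y)
    # V[s][i][j]: best log probability of a path that has emitted
    # x[:i] and y[:j] and currently sits in state s.
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in STEPS}
    back = {s: [[None] * (m + 1) for _ in range(n + 1)] for s in STEPS}
    V["M"][0][0] = 0.0  # convention: the path starts in M before any emission

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            for state, (di, dj) in STEPS.items():
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0:
                    continue
                if state == "M":
                    emit = log(p_same if x[i - 1] == y[j - 1] else p_diff)
                else:
                    emit = log(q)
                score, prev = max(
                    (V[p][pi][pj] + trans[(p, state)], p)
                    for p in STEPS
                    if (p, state) in trans
                )
                if score == NEG_INF:
                    continue  # cell not reachable in this state
                V[state][i][j] = score + emit
                back[state][i][j] = prev

    # Trace back from the best final state.
    state = max(STEPS, key=lambda s: V[s][n][m])
    best = V[state][n][m]
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        path.append(state)
        prev = back[state][i][j]
        di, dj = STEPS[state]
        i, j = i - di, j - dj
        state = prev
    return "".join(reversed(path)), best


if __name__ == "__main__":
    print(pair_hmm_viterbi("ACGT", "AGT"))
```

With these assumed parameters, aligning `"ACGT"` against `"AGT"` yields the state path `MXMM`: the unmatched `C` is emitted by the X state and the remaining bases are paired by M. A gene-prediction pair HMM follows the same recursion but replaces the three generic states with states for exons, introns and intergenic regions, so the decoded path annotates both sequences at once.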
