ExonHunter: a comprehensive approach to gene finding.

Authors

Brejová Brona, Brown Daniel G, Li Ming, Vinar Tomás

Affiliation

School of Computer Science, University of Waterloo, 200 University Avenue West, Waterloo, ON, Canada N2L 3G1.

Publication

Bioinformatics. 2005 Jun;21 Suppl 1:i57-65. doi: 10.1093/bioinformatics/bti1040.

Abstract

MOTIVATION

We present ExonHunter, a new and comprehensive gene finding system that outperforms existing systems and features several new ideas and approaches. Our system combines numerous sources of information (genomic sequences, expressed sequence tags and protein databases of related species) into a gene finder based on a hidden Markov model in a novel and systematic way. In our framework, various sources of information are expressed as partial probabilistic statements about positions in the sequence and their annotation. We then combine these into the final prediction via a quadratic programming method, which we show to be an extension of existing methods. Allowing only partial statements is key to our transparent handling of missing information and coping with the heterogeneous character of individual sources of information. In addition, we give a new method for modeling the length distribution of intergenic regions in hidden Markov models.
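
As background for the last point, the sketch below shows why region lengths need special treatment in a hidden Markov model: a single state with a self-loop can only produce geometrically distributed lengths. This is a minimal illustration of that standard property, not the ExonHunter method itself; the function names and the self-loop probability of 0.9 are assumptions chosen for the example.

```python
def geometric_length_pmf(stay_prob, max_len):
    """Return P(length = k) for k = 1..max_len when a region is modeled by
    one HMM state that loops back to itself with probability stay_prob.
    (Hypothetical helper for illustration only.)"""
    return [(stay_prob ** (k - 1)) * (1.0 - stay_prob)
            for k in range(1, max_len + 1)]


def expected_length(stay_prob):
    """Mean of the geometric length distribution implied by the self-loop."""
    return 1.0 / (1.0 - stay_prob)


# Assumed self-loop probability; real intergenic regions would need a value
# much closer to 1, and their observed lengths need not be geometric at all.
pmf = geometric_length_pmf(0.9, 200)
mean_len = expected_length(0.9)
```

The probabilities in `pmf` fall monotonically with length, so the shortest region is always the most likely one. A model that wants any other shape for the length distribution has to go beyond a single self-looping state.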

RESULTS

On a commonly used test set, ExonHunter performs significantly better than the existing gene finders ROSETTA, SLAM and TWINSCAN, with more than two-thirds of genes predicted completely correctly.

AVAILABILITY

Supplementary material available at http://www.bioinformatics.uwaterloo.ca/supplements/05eh/
