Translation initiation site prediction on a genomic scale: beauty in simplicity.

Authors

Saeys Yvan, Abeel Thomas, Degroeve Sven, Van de Peer Yves

Affiliation

Department of Plant Systems Biology, VIB, Technologiepark 927, B-9052 Ghent, Belgium.

Publication

Bioinformatics. 2007 Jul 1;23(13):i418-23. doi: 10.1093/bioinformatics/btm177.

Abstract

MOTIVATION

Correctly identifying translation initiation sites (TIS) remains a challenging problem for computational methods. Furthermore, the lion's share of these techniques focus on identifying TIS in transcript data. In the gene prediction context, however, TIS must be identified at the genomic level, which is harder still: many more pseudo-TIS occur at the genome level, so models produce a higher number of false positive predictions.
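
To make the pseudo-TIS problem concrete, the following minimal sketch lists every ATG on both strands of a genomic sequence; each hit is a candidate that a genome-level predictor would have to accept or reject. It is an illustration only, not the StartScan method. The function names and the toy sequence are hypothetical, and it assumes that candidates are restricted to the canonical ATG start codon.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def candidate_tis_positions(seq):
    """List 0-based forward-strand coordinates of the 'A' of every ATG.

    Assumption: only the canonical ATG start codon is considered.
    Returns (forward_hits, reverse_hits). For a reverse-strand hit, the
    coordinate is the forward-strand index that pairs with the 'A'.
    """
    seq = seq.upper()
    n = len(seq)
    forward = [i for i in range(n - 2) if seq[i:i + 3] == "ATG"]
    rc = reverse_complement(seq)
    # Index i on the reverse complement maps to index n - 1 - i on the forward strand.
    reverse = sorted(n - 1 - i for i in range(n - 2) if rc[i:i + 3] == "ATG")
    return forward, reverse


if __name__ == "__main__":
    fwd, rev = candidate_tis_positions("CCATGGCATCC")
    print("forward-strand candidates:", fwd)
    print("reverse-strand candidates:", rev)
```

Even this 11-base toy sequence yields three candidates, and at most one of them could be a true TIS, which is the imbalance that drives up false positives at the genome level.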

RESULTS

In this article, we evaluate the performance of several 'simple' TIS recognition methods at the genomic level, and compare them to state-of-the-art models for TIS prediction in transcript data. We conclude that the simple methods largely outperform the complex ones at the genomic scale, and we propose a new model for TIS recognition at the genome level that combines the strengths of these simple models. The new model obtains a false positive rate of 0.125 at a sensitivity of 0.80 on a well-annotated human chromosome (chromosome 21). Detailed analyses show that the model is useful, both on its own and in a simple gene prediction setting.
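
The reported metric, a false positive rate at a fixed sensitivity, can be computed from any scored set of candidate sites. The sketch below assumes the usual definitions (sensitivity = TP / (TP + FN), false positive rate = FP / (FP + TN)) and a classifier that outputs a score where higher means "more likely a true TIS". The function name and the toy scores are hypothetical and are not data from the paper.

```python
import math


def fpr_at_sensitivity(scores, labels, target_sensitivity=0.80):
    """Return (threshold, false_positive_rate) at the given sensitivity.

    scores: classifier outputs, higher = more TIS-like (assumption).
    labels: True for annotated TIS, False for pseudo-TIS.
    The threshold is the highest score cutoff at which at least
    `target_sensitivity` of the true sites are predicted positive.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    # Number of true sites that must be recovered (epsilon guards float error).
    needed = max(1, math.ceil(target_sensitivity * len(positives) - 1e-9))
    threshold = positives[needed - 1]

    false_positives = sum(1 for s in negatives if s >= threshold)
    return threshold, false_positives / len(negatives)


if __name__ == "__main__":
    # Toy data: 5 true sites followed by 8 pseudo-sites.
    toy_scores = [0.9, 0.8, 0.7, 0.6, 0.2,
                  0.95, 0.5, 0.4, 0.3, 0.1, 0.65, 0.05, 0.15]
    toy_labels = [True] * 5 + [False] * 8
    print(fpr_at_sensitivity(toy_scores, toy_labels, 0.80))
```

Fixing sensitivity and reporting the false positive rate makes models comparable on genomic data, where pseudo-sites vastly outnumber true sites and a small change in false positive rate translates into many spurious predictions.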

AVAILABILITY

Datafiles and a web interface for the StartScan program are available at http://bioinformatics.psb.ugent.be/supplementary_data/.

