Evaluation of gene prediction software using a genomic data set: application to Arabidopsis thaliana sequences.

Author information

Pavy N, Rombauts S, Déhais P, Mathé C, Ramana D V, Leroy P, Rouzé P

Affiliation

Laboratoire associé de l'INRA, France.

Publication information

Bioinformatics. 1999 Nov;15(11):887-99. doi: 10.1093/bioinformatics/15.11.887.

Abstract

MOTIVATION

The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need evaluation of prediction programs with Arabidopsis sequences containing multiple genes.

RESULTS

We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software.
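The abstract does not define the "conventional metrics" used at the exon level. As an illustration only, the sketch below assumes the convention common in gene-prediction benchmarks: a predicted exon counts as correct only when both of its boundaries match an annotated exon exactly, sensitivity is the fraction of annotated exons that were predicted correctly, and specificity is the fraction of predicted exons that are correct. The function name, the `(start, end)` tuple representation and the example coordinates are hypothetical and are not taken from the AraSet Perl programs.

```python
from typing import Iterable, Tuple

# An exon is represented here as a (start, end) coordinate pair (an assumption).
Exon = Tuple[int, int]


def exon_level_accuracy(
    annotated: Iterable[Exon], predicted: Iterable[Exon]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) at the exon level.

    Assumed convention: an exon is correct only if both boundaries
    match an annotated exon exactly.
    """
    annotated_set = set(annotated)
    predicted_set = set(predicted)

    # Exons whose start and end both agree with the annotation.
    correct = len(annotated_set & predicted_set)

    # Sensitivity: correct exons / annotated exons.
    sensitivity = correct / len(annotated_set) if annotated_set else 0.0
    # Specificity (as used in this field): correct exons / predicted exons.
    specificity = correct / len(predicted_set) if predicted_set else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up coordinates, for illustration only.
    annotated_exons = [(100, 200), (300, 400), (500, 650)]
    predicted_exons = [(100, 200), (300, 410), (500, 650), (800, 900)]
    sn, sp = exon_level_accuracy(annotated_exons, predicted_exons)
    print(f"exon sensitivity={sn:.2f}, exon specificity={sp:.2f}")
```

Exact boundary matching is deliberately strict: an exon that overlaps the annotation but misses one splice site counts as both a missed annotated exon and a wrong prediction. The site-level, protein-level and gene-model measures mentioned in the abstract would need their own definitions, which are not given here.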

AVAILABILITY

The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/.

CONTACT

Pierre.Rouze@gengenp.rug.ac.be.

