Evaluation of gene-finding programs on mammalian sequences.

Authors

Rogic S, Mackworth A K, Ouellette F B

Affiliation

Computer Science Department, The University of California at Santa Cruz, Santa Cruz 95064, USA.

Publication information

Genome Res. 2001 May;11(5):817-32. doi: 10.1101/gr.147901.

Abstract

We present an independent comparative analysis of seven recently developed gene-finding programs: FGENES, GeneMark.hmm, Genie, Genscan, HMMgene, Morgan, and MZEF. For evaluation purposes we developed a new, thoroughly filtered, and biologically validated dataset of mammalian genomic sequences that does not overlap with the training sets of the programs analyzed. Our analysis shows that the new generation of programs has substantially better results than the programs analyzed in previous studies. The accuracy of the programs was also examined as a function of various sequence and prediction features, such as G + C content of the sequence, length and type of exons, signal type, and score of the exon prediction. This approach pinpoints the strengths and weaknesses of each individual program as well as those of computational gene-finding in general. The dataset used in this analysis (HMR195) as well as the tables with the complete results are available at http://www.cs.ubc.ca/~rogic/evaluation/.

Similar articles

5
Gene identification programs in bread wheat: a comparison study.
Nucleosides Nucleotides Nucleic Acids. 2013;32(10):529-54. doi: 10.1080/15257770.2013.832773.
6
An analysis of gene-finding programs for Neurospora crassa.
Bioinformatics. 2001 Oct;17(10):901-12. doi: 10.1093/bioinformatics/17.10.901.
7
Using MZEF to find internal coding exons.
Curr Protoc Bioinformatics. 2002 Aug;Chapter 4:Unit 4.2. doi: 10.1002/0471250953.bi0402s00.
10
Genie--gene finding in Drosophila melanogaster.
Genome Res. 2000 Apr;10(4):529-38. doi: 10.1101/gr.10.4.529.

Cited by

1
First Steps in the Analysis of Prokaryotic Pan-Genomes.
Bioinform Biol Insights. 2020 Aug 7;14:1177932220938064. doi: 10.1177/1177932220938064. eCollection 2020.
5
Short Exon Detection via Wavelet Transform Modulus Maxima.
PLoS One. 2016 Sep 16;11(9):e0163088. doi: 10.1371/journal.pone.0163088. eCollection 2016.
