CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction.

Authors

Gross Samuel S, Do Chuong B, Sirota Marina, Batzoglou Serafim

Affiliation

Computer Science Department, Stanford University, Stanford, CA, USA.

Publication information

Genome Biol. 2007;8(12):R269. doi: 10.1186/gb-2007-8-12-r269.

DOI:10.1186/gb-2007-8-12-r269

PMID:18096039

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2246271/

Abstract

We describe CONTRAST, a gene predictor which directly incorporates information from multiple alignments rather than employing phylogenetic models. This is accomplished through the use of discriminative machine learning techniques, including a novel training algorithm. We use a two-stage approach, in which a set of binary classifiers designed to recognize coding region boundaries is combined with a global model of gene structure. CONTRAST predicts exact coding region structures for 65% more human genes than the previous state-of-the-art method, misses 46% fewer exons and displays comparable gains in specificity.
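The two-stage idea in the abstract can be illustrated with a toy sketch. This is not CONTRAST's implementation: the feature names (`bias`, `atg`, `stop`), the weights, the two-label model, and the functions `logistic` and `decode` are all assumptions made up for illustration. Stage 1 is a pair of logistic binary classifiers that score each position as a possible coding-region start or end; stage 2 is a simple Viterbi pass that combines those scores into one globally consistent labeling.

```python
import math

def logistic(weights, features):
    """Stage 1 (toy): binary classifier score in (0, 1) for one position."""
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

def decode(start_p, stop_p):
    """Stage 2 (toy): Viterbi over labels 'N' (noncoding) and 'C' (coding).

    start_p[i]: probability that a coding region begins at position i.
    stop_p[i]:  probability that position i is the first noncoding position
                after a coding region.
    Probabilities must lie strictly between 0 and 1. Position 0 is forced to 'N'.
    """
    n = len(start_p)
    if n == 0:
        return ""
    score = {"N": 0.0, "C": float("-inf")}
    back = []  # back[i-1][label] = best previous label for position i
    for i in range(1, n):
        candidates = {
            "N": [("N", score["N"] + math.log(1.0 - start_p[i])),
                  ("C", score["C"] + math.log(stop_p[i]))],
            "C": [("N", score["N"] + math.log(start_p[i])),
                  ("C", score["C"] + math.log(1.0 - stop_p[i]))],
        }
        new_score, pointers = {}, {}
        for label, options in candidates.items():
            prev, best = max(options, key=lambda option: option[1])
            new_score[label], pointers[label] = best, prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final label to position 0.
    label = max(score, key=score.get)
    labels = [label]
    for pointers in reversed(back):
        label = pointers[label]
        labels.append(label)
    return "".join(reversed(labels))

# Hypothetical weights and per-position features, chosen only for the demo.
start_weights = {"bias": -4.0, "atg": 8.0}
stop_weights = {"bias": -4.0, "stop": 8.0}
start_features = [{"bias": 1.0, "atg": 1.0 if i == 2 else 0.0} for i in range(6)]
stop_features = [{"bias": 1.0, "stop": 1.0 if i == 4 else 0.0} for i in range(6)]

start_p = [logistic(start_weights, f) for f in start_features]
stop_p = [logistic(stop_weights, f) for f in stop_features]
print(decode(start_p, stop_p))
```

In this sketch the classifiers only look at local evidence, while the decoding step enforces that every coding region has a start before its end. A real gene predictor would use far richer features (for example, columns of a multiple alignment) and a gene-structure model with exons, introns, and reading frames.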
