Gene structure conservation aids similarity based gene prediction.

Authors

Meyer Irmtraud M, Durbin Richard

Affiliation

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK.

Publication information

Nucleic Acids Res. 2004 Feb 4;32(2):776-83. doi: 10.1093/nar/gkh211. Print 2004.

Abstract

One of the primary tasks in deciphering the functional contents of a newly sequenced genome is the identification of its protein coding genes. Existing computational methods for gene prediction include ab initio methods which use the DNA sequence itself as the only source of information, comparative methods using multiple genomic sequences, and similarity based methods which employ the cDNA or protein sequences of related genes to aid the gene prediction. We present here an algorithm implemented in a computer program called Projector which combines comparative and similarity approaches. Projector employs similarity information at the genomic DNA level by directly using known genes annotated on one DNA sequence to predict the corresponding related genes on another DNA sequence. It therefore makes explicit use of the conservation of the exon-intron structure between two related genes in addition to the similarity of their encoded amino acid sequences. We evaluate the performance of Projector by comparing it with the program Genewise on a test set of 491 pairs of independently confirmed mouse and human genes. It is more accurate than Genewise for genes whose proteins are <80% identical, and is suitable for use in a combined gene prediction system where other methods identify well conserved and non-conserved genes, and pseudogenes.
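The abstract does not describe how Projector works internally, so the following is only a toy illustration of the underlying idea: carrying a known exon-intron annotation from one genomic sequence over to a related one. It assumes a gapped pairwise alignment of the two sequences is already available, which is a simplification and not something the abstract states. The function name `project_exons`, the interval convention, and the example sequences are all hypothetical.

```python
from typing import List, Optional, Tuple

# Half-open [start, end) interval in 0-based, ungapped sequence coordinates.
Interval = Tuple[int, int]


def project_exons(
    aligned_a: str, aligned_b: str, exons_a: List[Interval]
) -> List[Optional[Interval]]:
    """Map exons annotated on sequence A onto sequence B through an alignment.

    Illustrative sketch only: this is NOT Projector's algorithm. It simply
    follows alignment columns, with "-" marking a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # a_to_b[i] is the position in B aligned to position i of A,
    # or None when that base of A faces a gap in B.
    a_to_b: List[Optional[int]] = []
    pos_b = 0
    for ch_a, ch_b in zip(aligned_a, aligned_b):
        if ch_a != "-":
            a_to_b.append(pos_b if ch_b != "-" else None)
        if ch_b != "-":
            pos_b += 1

    projected: List[Optional[Interval]] = []
    for start, end in exons_a:
        hits = [b for b in a_to_b[start:end] if b is not None]
        # An exon with no aligned base in B cannot be projected.
        projected.append((hits[0], hits[-1] + 1) if hits else None)
    return projected


# Made-up example: two exons on A separated by a GT...AG intron.
aligned_a = "ATGAAA--GTCCAGTTT"
aligned_b = "ATG-AACCGTC-AGTTT"
print(project_exons(aligned_a, aligned_b, [(0, 6), (12, 15)]))
```

A coordinate mapping like this only transfers exon boundaries. It does not check splice sites or reading frame in the second sequence, which a real gene predictor would need to do.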
