ParsEval: parallel comparison and analysis of gene structure annotations.

Affiliation

Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, Iowa 50011, USA.

Publication information

BMC Bioinformatics. 2012 Aug 1;13:187. doi: 10.1186/1471-2105-13-187.

DOI: 10.1186/1471-2105-13-187

PMID: 22852583

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3439248/

Abstract

BACKGROUND

Accurate gene structure annotation is a fundamental but somewhat elusive goal of genome projects, as witnessed by the fact that (model) genomes typically undergo several cycles of re-annotation. In many cases, it is not only different versions of annotations that need to be compared but also different sources of annotation of the same genome, derived from distinct gene prediction workflows. Such comparisons are of interest to annotation providers, prediction software developers, and end-users, who all need to assess what is common and what is different among distinct annotation sources. We developed ParsEval, a software application for pairwise comparison of sets of gene structure annotations. ParsEval calculates several statistics that highlight the similarities and differences between the two sets of annotations provided. These statistics are presented in an aggregate summary report, with additional details provided as individual reports, one for each non-overlapping, gene-model-centric genomic locus. Genome-browser-style graphics embedded in these reports help visualize the genomic context of the annotations. ParsEval output is both human-readable and easily parsed, enabling systematic identification of problematic gene models for subsequent focused analysis.

RESULTS

ParsEval is capable of analyzing annotations for large eukaryotic genomes on typical desktop or laptop hardware. In comparison to existing methods, ParsEval exhibits a considerable performance improvement, both in terms of runtime and memory consumption. Reports from ParsEval can provide relevant biological insights into the gene structure annotations being compared.

CONCLUSIONS

Implemented in C, ParsEval provides the quickest and most feature-rich solution for genome annotation comparison to date. The source code is freely available (under an ISC license) at http://parseval.sourceforge.net/.
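
The idea of non-overlapping, gene-model-centric loci described in the Background can be illustrated with a short Python sketch. This is not ParsEval's code (ParsEval is written in C), and the abstract does not specify which statistics it reports. The `Gene` record, the `group_into_loci` and `locus_overlap` helpers, the `"ref"`/`"pred"` source labels, and the Jaccard-style overlap score are all assumptions made for illustration only.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record (not ParsEval's data model)."""
    seqid: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    source: str  # "ref" or "pred": which annotation set the gene comes from


def group_into_loci(genes):
    """Merge overlapping gene models from both sets into non-overlapping loci."""
    loci = []
    for gene in sorted(genes, key=lambda g: (g.seqid, g.start, g.end)):
        last = loci[-1] if loci else None
        if last and last["seqid"] == gene.seqid and gene.start <= last["end"]:
            # Gene overlaps the current locus: extend it.
            last["end"] = max(last["end"], gene.end)
            last["genes"].append(gene)
        else:
            # No overlap: start a new locus.
            loci.append({"seqid": gene.seqid, "start": gene.start,
                         "end": gene.end, "genes": [gene]})
    return loci


def covered(genes, source):
    """Set of positions covered by genes from one annotation source."""
    positions = set()
    for g in genes:
        if g.source == source:
            positions.update(range(g.start, g.end + 1))
    return positions


def locus_overlap(locus):
    """Illustrative statistic: shared positions / positions covered by either set."""
    ref = covered(locus["genes"], "ref")
    pred = covered(locus["genes"], "pred")
    union = ref | pred
    return len(ref & pred) / len(union) if union else 0.0


genes = [
    Gene("chr1", 100, 200, "ref"),
    Gene("chr1", 150, 250, "pred"),
    Gene("chr1", 400, 500, "ref"),   # gene found in only one set
    Gene("chr2", 100, 200, "pred"),
]
loci = group_into_loci(genes)
for locus in loci:
    print(locus["seqid"], locus["start"], locus["end"],
          round(locus_overlap(locus), 3))
```

Loci that contain genes from only one set, or that score low on a similarity measure like this one, are the kind of candidates a per-locus report would flag for closer inspection. Real gene models also carry exon, CDS, and UTR structure, which this sketch ignores.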
