MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

Affiliation

Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA.

Publication information

BMC Bioinformatics. 2011 Dec 22;12:491. doi: 10.1186/1471-2105-12-491.

Abstract

BACKGROUND

Second-generation sequencing technologies are precipitating major shifts with regard to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation because, unlike in first-generation projects, there are no pre-existing 'gold-standard' gene models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies.

RESULTS

We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training data are limited, of low quality, or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality, and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review.

CONCLUSIONS

MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b3a6/3280279/517fb501c71c/1471-2105-12-491-1.jpg
