Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112.
Plant Physiol. 2014 Feb;164(2):513-24. doi: 10.1104/pp.113.230144. Epub 2013 Dec 4.
We have optimized and extended the widely used annotation engine MAKER to better support plant genome annotation efforts. New features include improved parallelization for large, repeat-rich plant genomes, noncoding RNA annotation, and pseudogene identification. We benchmarked the resulting software tool kit, MAKER-P, using the Arabidopsis (Arabidopsis thaliana) and maize (Zea mays) genomes. Here, we demonstrate that the MAKER-P tool kit can automatically update, extend, and revise the Arabidopsis annotations in light of newly available data, and can annotate pseudogenes and noncoding RNAs absent from The Arabidopsis Information Resource 10 (TAIR10) build. Our results show that MAKER-P can be used to manage and improve the annotations of even Arabidopsis, perhaps the best-annotated plant genome. We have also installed and benchmarked MAKER-P at the Texas Advanced Computing Center. We show that this public resource can de novo annotate the entire Arabidopsis and maize genomes in less than 3 h and produce annotations of comparable quality to those of the current TAIR10 and maize V2 annotation builds.