Verily Life Sciences, Mountain View, CA, USA.
Google Inc., Mountain View, CA, USA.
Bioinformatics. 2019 Nov 1;35(21):4389-4391. doi: 10.1093/bioinformatics/btz218.
Reference genomes are periodically refined to incorporate error corrections and other improvements. While this refinement benefits the generation and analysis of new data, incorporating data analyzed against an older reference genome assembly requires transforming the coordinates and representations of the data to the new assembly. Multiple tools exist to perform this transformation for coordinate-only data types, but none supports accurate transformation of genome-wide short variation. Here we present GenomeWarp, a tool for efficiently transforming variants between genome assemblies. GenomeWarp transforms regions and short variants conservatively to minimize false-positive and false-negative variants in the target genome, and converts over 99% of regions and short variants from a representative human genome.
GenomeWarp is written in Java. All source code and the user manual are freely available at https://github.com/verilylifesciences/genomewarp.
Supplementary data are available at Bioinformatics online.