Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA; Division of Biomedical Statistics and Informatics, Mayo Clinic College of Medicine, Rochester, MN 55905, USA; and Department of Biochemistry and Molecular Biology, Mayo Clinic College of Medicine, Rochester, MN 55905, USA.
Bioinformatics. 2014 Apr 1;30(7):1006-7. doi: 10.1093/bioinformatics/btt730. Epub 2013 Dec 18.
Reference genome assemblies are revised and refined over time. Researchers therefore often need to convert results that were analyzed against an older assembly to a newer version, or vice versa, to facilitate meta-analysis, direct comparison, data integration and visualization. Several useful tools can convert genome interval files in browser extensible data (BED) or general feature format (GFF), but none can convert files in sequence alignment map (SAM) or BigWig format. This is a significant gap in computational genomics tools, because these formats are the ones most widely used for representing high-throughput sequencing data such as RNA-seq, chromatin immunoprecipitation sequencing (ChIP-seq) and DNA-seq.
We developed CrossMap, a versatile and efficient tool for converting genome coordinates between assemblies. CrossMap supports most of the commonly used file formats, including BAM, sequence alignment map (SAM), Wiggle, BigWig, browser extensible data (BED), general feature format (GFF), gene transfer format (GTF) and variant call format (VCF).
CrossMap is written in Python and C. Source code and a comprehensive user's manual are freely available at: http://crossmap.sourceforge.net/.
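To make the idea of coordinate conversion concrete, the following is a minimal sketch of lifting a half-open interval from one assembly to another through a table of aligned blocks. It is not CrossMap's implementation: the `Block` type, the `TOY_MAP` table and the `lift_interval` function are hypothetical names introduced for illustration, and the block coordinates are invented toy values rather than real assembly data.

```python
from bisect import bisect_right
from typing import Dict, List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One ungapped aligned block (hypothetical structure for illustration)."""
    src_start: int         # 0-based start on the source assembly
    src_end: int           # exclusive end on the source assembly
    dst_chrom: str         # chromosome on the target assembly
    dst_start: int         # 0-based start of the block on the target assembly
    reverse: bool = False  # True if the block aligns to the opposite strand


# Toy mapping with invented coordinates; blocks are sorted by src_start
# and do not overlap.
TOY_MAP: Dict[str, List[Block]] = {
    "chr1": [
        Block(0, 1000, "chr1", 500),
        Block(1000, 2000, "chr1", 1800),
        Block(3000, 4000, "chr2", 100, reverse=True),
    ]
}


def lift_interval(
    mapping: Dict[str, List[Block]], chrom: str, start: int, end: int
) -> Optional[Tuple[str, int, int, str]]:
    """Convert the half-open interval [start, end) to target coordinates.

    Returns (chrom, start, end, strand), or None when the interval is empty,
    falls in an unaligned gap, or spans more than one block. This sketch
    does not split intervals across blocks.
    """
    if end <= start:
        return None
    blocks = mapping.get(chrom, [])
    starts = [b.src_start for b in blocks]
    i = bisect_right(starts, start) - 1  # last block starting at or before start
    if i < 0:
        return None
    b = blocks[i]
    if start >= b.src_end or end > b.src_end:
        return None  # start lies in a gap, or the interval leaves the block
    if b.reverse:
        # On the reverse strand the interval is mirrored within the block.
        block_len = b.src_end - b.src_start
        new_start = b.dst_start + block_len - (end - b.src_start)
        new_end = b.dst_start + block_len - (start - b.src_start)
        return (b.dst_chrom, new_start, new_end, "-")
    offset = start - b.src_start
    return (b.dst_chrom, b.dst_start + offset, b.dst_start + offset + (end - start), "+")
```

With the toy table, `lift_interval(TOY_MAP, "chr1", 100, 200)` falls inside the first block and is shifted by that block's offset, whereas an interval that straddles two blocks or lies in an unaligned region returns `None`. A tool that handles alignment or signal-track formats has to make further decisions for such cases (for example splitting or discarding the record), which this sketch deliberately leaves out.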