Christian Brueffer, Lao H. Saal
Division of Oncology and Pathology, Department of Clinical Sciences, Lund University Cancer Center, Lund University, Medicon Village Building 404-B2, Lund, 223 81, Sweden.
BMC Bioinformatics. 2016 May 4;17(1):199. doi: 10.1186/s12859-016-1058-x.
TopHat is a popular splice junction mapper for RNA sequencing data. It writes its output in the BAM format, the binary version of the Sequence Alignment/Map (SAM) format. BAM is the standard exchange format for aligned sequencing reads, so a correct implementation of the format is essential for software interoperability and correct analysis. However, TopHat writes its unmapped reads in a way that is incompatible with other software that implements the SAM/BAM format.
We have developed TopHat-Recondition, a post-processor for TopHat unmapped reads that restores read information in the proper format. TopHat-Recondition thus enables downstream software to process the many existing BAM files written by TopHat.
TopHat-Recondition can repair unmapped read files written by TopHat and is freely available under a 2-clause BSD license on GitHub: https://github.com/cbrueffer/tophat-recondition .
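The abstract does not list the specific incompatibilities, so the following is only a minimal sketch of the kind of consistency check a post-processor for unmapped reads might run. It works on plain-text SAM lines using the standard flag bits from the SAM specification. The function names and the three checks (read-name suffix, MAPQ value, mate-unmapped flag) are illustrative assumptions, not TopHat-Recondition's actual logic or interface.

```python
# Illustrative sketch only: the checks below are assumptions based on the
# SAM flag definitions, not the logic implemented by TopHat-Recondition.
FLAG_PAIRED = 0x1         # read is part of a pair
FLAG_UNMAPPED = 0x4       # read itself is unmapped
FLAG_MATE_UNMAPPED = 0x8  # mate of this read is unmapped


def parse_sam_line(line):
    """Split one SAM alignment line into the fields used below."""
    fields = line.rstrip("\n").split("\t")
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
    }


def suspect_features(record, mate=None):
    """Return hypothetical warnings for an unmapped read record.

    `mate` is the parsed record of the paired read, if it is known.
    """
    warnings = []
    if not record["flag"] & FLAG_UNMAPPED:
        return warnings  # only unmapped reads are inspected in this sketch
    if record["qname"].endswith(("/1", "/2")):
        warnings.append("read name carries a /1 or /2 suffix")
    if record["mapq"] != 0:
        warnings.append("MAPQ of unmapped read is not 0")
    if mate is not None and record["flag"] & FLAG_PAIRED:
        mate_is_unmapped = bool(mate["flag"] & FLAG_UNMAPPED)
        flag_says_unmapped = bool(record["flag"] & FLAG_MATE_UNMAPPED)
        if mate_is_unmapped != flag_says_unmapped:
            warnings.append("mate-unmapped flag disagrees with the mate's record")
    return warnings


# Made-up example records: both reads of a pair are unmapped, but the
# mate-unmapped bit (0x8) is missing from the flags 69 and 133.
read = parse_sam_line("r1/1\t69\t*\t0\t255\t*\t*\t0\t0\tACGT\tIIII")
mate = parse_sam_line("r1/2\t133\t*\t0\t255\t*\t*\t0\t0\tTTGA\tIIII")
for warning in suspect_features(read, mate):
    print(warning)
```

A real repair tool would rewrite the affected fields and emit a corrected BAM file instead of printing warnings. The sketch only shows how such inconsistencies can be detected from the flag bits and the mate's record.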