Kahles André, Behr Jonas, Rätsch Gunnar
Memorial Sloan Kettering Cancer Center, Computational Biology Center, 1275 York Avenue, New York, NY 10065, USA.
Bioinformatics. 2016 Mar 1;32(5):770-2. doi: 10.1093/bioinformatics/btv624. Epub 2015 Oct 30.
Mapping high-throughput sequencing data to a reference genome is an essential step in most pipelines for the computational analysis of genome and transcriptome sequencing data. Breaking ties between equally good mapping locations not only poses a severe problem during the alignment phase but also has a significant impact on the results of downstream analyses. We present the multi-mapper resolution (MMR) tool, which infers optimal mapping locations from the coverage density of other mapped reads.
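The idea of choosing among tied mapping locations by looking at the coverage of other reads can be illustrated with a small toy sketch. This is not the MMR implementation: it assumes a simple greedy rule (pick the candidate locus with the highest mean read depth), and the function names, the tuple layout `(chrom, start, end)` and the example coordinates are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def build_coverage(alignments):
    """Per-position read depth from (chrom, start, end) half-open intervals."""
    coverage = defaultdict(int)
    for chrom, start, end in alignments:
        for pos in range(start, end):
            coverage[(chrom, pos)] += 1
    return coverage


def mean_coverage(coverage, chrom, start, end):
    """Average depth over [start, end); uncovered positions count as zero."""
    total = sum(coverage.get((chrom, pos), 0) for pos in range(start, end))
    return total / (end - start)


def resolve_multimappers(unique_alignments, multi_reads):
    """Greedy toy rule (an assumption, not the published MMR objective):
    assign each multi-mapped read to the candidate with the highest mean depth.
    """
    coverage = build_coverage(unique_alignments)
    chosen = {}
    for read_id, candidates in multi_reads.items():
        best = max(candidates, key=lambda c: mean_coverage(coverage, *c))
        chosen[read_id] = best
        # Count the resolved read so that later reads see its coverage too.
        chrom, start, end = best
        for pos in range(start, end):
            coverage[(chrom, pos)] += 1
    return chosen


# Hypothetical example: two candidate loci for one read of length 51.
unique = [("chr1", 100, 151), ("chr1", 120, 171), ("chr2", 500, 551)]
multi = {"readA": [("chr2", 510, 561), ("chr1", 110, 161)]}
print(resolve_multimappers(unique, multi))
```

In this example the read is assigned to the chr1 locus because two uniquely mapped reads overlap it, against one at the chr2 locus. A real tool would read alignments from a BAM file and would likely use a more careful criterion than mean depth.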
Filtering alignments with MMR can significantly improve the performance of downstream analyses such as transcript quantitation and differential testing. We illustrate that the accuracy (Spearman correlation) of transcript quantification increases by 15% when using reads of length 51. In addition, MMR decreases alignment file sizes by more than 50%, which reduces the running time of the quantification tool. Our efficient implementation of the MMR algorithm is easily applicable as a post-processing step to existing alignment files in BAM format. Its complexity scales linearly with the number of alignments, and it requires no further inputs.
Open source code and documentation are available for download at http://github.com/ratschlab/mmr. Comprehensive testing results and further information can be found at http://bioweb.me/mmr.
Contact: andre.kahles@ratschlab.org or gunnar.ratsch@ratschlab.org
Supplementary data are available at Bioinformatics online.