Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health.
University of Arizona Cancer Center, University of Arizona, Tucson, AZ 85721, USA.
Bioinformatics. 2018 May 15;34(10):1713-1718. doi: 10.1093/bioinformatics/bty010.
Tumor genome sequencing offers great promise for guiding research and therapy, but spurious variant calls can arise from multiple sources. Mouse contamination can generate many spurious calls when sequencing patient-derived xenografts. Paralogous genome sequences can also generate spurious calls when sequencing any tumor. We developed a BLAST-based algorithm, Mouse And Paralog EXterminator (MAPEX), to identify and filter out spurious calls from both these sources.
When calling variants from xenografts, MAPEX has similar sensitivity and specificity to more complex algorithms. When applied to any tumor, MAPEX also automatically flags calls that potentially arise from paralogous sequences. Our implementation, mapexr, runs quickly and easily on a desktop computer. MAPEX is thus a useful addition to almost any pipeline for calling genetic variants in tumors.
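The filtering idea described above can be illustrated with a minimal sketch. It is not the mapexr implementation and does not use its API. It assumes that each read supporting a candidate variant has already been aligned with BLAST against human and mouse references, and that the hits are available as simple records. The field names, the function names `classify_read` and `score_variant`, and the `min_on_target` threshold of 0.5 are all hypothetical choices made for illustration.

```python
from collections import Counter


def classify_read(hits, variant_chrom, variant_pos):
    """Label one variant-supporting read from its BLAST hits.

    `hits` is a list of dicts with hypothetical keys: 'genome'
    ('human' or 'mouse'), 'chrom', 'start', 'end', 'bitscore'.
    """
    if not hits:
        return "unaligned"
    # Take the highest-scoring hit; on ties, max() keeps the first one listed.
    best = max(hits, key=lambda h: h["bitscore"])
    if best["genome"] == "mouse":
        return "mouse"  # likely xenograft contamination
    lo, hi = sorted((best["start"], best["end"]))
    if best["chrom"] == variant_chrom and lo <= variant_pos <= hi:
        return "on_target"  # best hit covers the called position
    return "off_target"  # best human hit lies elsewhere, e.g. a paralog


def score_variant(read_hits, variant_chrom, variant_pos, min_on_target=0.5):
    """Summarize read labels for one call and apply an assumed threshold."""
    labels = Counter(
        classify_read(hits, variant_chrom, variant_pos) for hits in read_hits
    )
    total = sum(labels.values())
    fraction = labels["on_target"] / total if total else 0.0
    return {
        "counts": dict(labels),
        "on_target_fraction": fraction,
        "keep": fraction >= min_on_target,
    }


# Toy data: three reads supporting a hypothetical call at chr7:150.
reads = [
    [  # best hit is human and covers the variant position
        {"genome": "human", "chrom": "chr7", "start": 100, "end": 200, "bitscore": 180.0},
        {"genome": "mouse", "chrom": "chr6", "start": 5, "end": 105, "bitscore": 120.0},
    ],
    [  # best hit is mouse
        {"genome": "mouse", "chrom": "chr6", "start": 5, "end": 105, "bitscore": 185.0},
        {"genome": "human", "chrom": "chr7", "start": 100, "end": 200, "bitscore": 150.0},
    ],
    [  # best hit is human but on another chromosome
        {"genome": "human", "chrom": "chr12", "start": 900, "end": 1000, "bitscore": 170.0},
    ],
]
result = score_variant(reads, "chr7", 150)
```

In this toy case only one of the three supporting reads maps best to the called position, so the sketch would flag the call rather than keep it. A real pipeline would parse the hits from BLAST output and tune the threshold against validated calls.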
The mapexr package for R is available at https://github.com/bmannakee/mapexr under the MIT license.
mannakee@email.arizona.edu or rgutenk@email.arizona.edu or eknudsen@email.arizona.edu.
Supplementary data are available at Bioinformatics online.