Tso Kai-Yuen, Lee Sau Dan, Lo Kwok-Wai, Yip Kevin Y
Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong.
BMC Genomics. 2014 Dec 23;15(1):1172. doi: 10.1186/1471-2164-15-1172.
Patient-derived tumor xenografts in mice are widely used in cancer research and have become important in developing personalized therapies. When these xenografts are subjected to DNA sequencing, the samples can contain varying amounts of mouse DNA. It has been unclear how the mouse reads affect data analyses. We conducted comprehensive simulations to compare three alignment strategies at different mutation rates, read lengths, sequencing error rates, human-mouse mixing ratios and sequenced regions. We also sequenced a nasopharyngeal carcinoma xenograft and a cell line to test how the strategies work on real data.
We found that the "filtering" and "combined reference" strategies performed better than aligning reads directly to the human reference in terms of alignment and variant calling accuracy. The combined reference strategy was particularly good at reducing false-negative variant calls without significantly increasing the false-positive rate. In some scenarios the performance gain of these two strategies was too small for special handling to be cost-effective, but special handling proved crucial when false non-synonymous SNVs must be minimized, especially in exome sequencing.
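The abstract names the three strategies without spelling out their procedures. The following minimal Python sketch shows one plausible reading of them, applied to toy per-read alignment scores. Everything in it is an assumption made for illustration: the `ReadScores` record, the function names, the score values, and the tie-handling rule are hypothetical and are not taken from the study's pipeline.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ReadScores:
    """Hypothetical record: best alignment score of one read per genome.

    None means the read did not align to that genome at all.
    """
    name: str
    human: Optional[int]
    mouse: Optional[int]


def direct(read: ReadScores) -> bool:
    # Assumed "direct" strategy: align to the human reference only and
    # keep every read that aligns, regardless of its true origin.
    return read.human is not None


def filtering(read: ReadScores) -> bool:
    # Assumed "filtering" strategy: discard any read that aligns to the
    # mouse reference, then align the remaining reads to human.
    return read.mouse is None and read.human is not None


def combined(read: ReadScores) -> bool:
    # Assumed "combined reference" strategy: align to a human+mouse
    # reference and keep the read only if its best hit is human.
    # Ties are treated as ambiguous and dropped (an assumption).
    if read.human is None:
        return False
    return read.mouse is None or read.human > read.mouse


def kept_reads(reads: List[ReadScores],
               strategy: Callable[[ReadScores], bool]) -> List[str]:
    """Return the names of reads retained for human variant calling."""
    return [r.name for r in reads if strategy(r)]


# Toy reads with made-up scores.
reads = [
    ReadScores("human_unique", human=60, mouse=None),
    ReadScores("human_conserved", human=60, mouse=50),
    ReadScores("mouse_conserved", human=40, mouse=60),
    ReadScores("mouse_unique", human=None, mouse=60),
]

if __name__ == "__main__":
    for strategy in (direct, filtering, combined):
        print(strategy.__name__, kept_reads(reads, strategy))
```

Under these assumptions, the direct strategy keeps the mouse read from a conserved region (a potential source of false variants), the filtering strategy also loses the human read from a conserved region (a potential source of missed variants), and the combined reference strategy keeps the human read while dropping the mouse read.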
Our study systematically analyzes the effects of mouse contamination in the sequencing data of human-in-mouse xenografts. Our findings provide information for designing data analysis pipelines for these data.