Chen Xi, Shi Xu, Shajahan Ayesha N, Hilakivi-Clarke Leena, Clarke Robert, Xuan Jianhua
Annu Int Conf IEEE Eng Med Biol Soc. 2014;2014:3937-40. doi: 10.1109/EMBC.2014.6944485.
High-coverage whole-genome DNA sequencing enables clearer identification of somatic structural variations (SSVs) in paired tumor and normal samples. Recent studies show that analyzing paired samples simultaneously gives better resolution for SSV detection than calling structural variations (SVs) in each sample separately and subtracting the shared ones. However, available tools can neither identify all types of SSVs nor rank the calls by how strongly somatic they appear. In this paper, we develop a Bayesian framework, called BSSV, that integrates read alignment information from both tumor and normal samples to calculate the significance of each SSV. On simulated data, the precision of BSSV is comparable to that of available tools, and its false negative rate is significantly lower. We also applied this approach to The Cancer Genome Atlas breast cancer data for SSV detection. Many genes known to be mutated in breast cancer, such as RAD51, BRIP1, ER, PGR, and PTPRD, were successfully identified.
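The abstract does not spell out the BSSV model, so the following is only a minimal sketch of the general idea of scoring a candidate SV jointly from tumor and normal read evidence. It is not the authors' method. The binomial likelihood, the three hypotheses, the per-read support rates in `RATES`, the values in `PRIORS`, and the candidate read counts are all assumptions chosen for illustration.

```python
from math import comb, exp, log


def log_binom_pmf(k, n, p):
    """Log of the binomial probability of k successes in n trials."""
    return log(comb(n, k)) + k * log(p) + (n - k) * log(1.0 - p)


# Hypothetical probability that a read spanning the breakpoint supports the
# variant, given as (tumor rate, normal rate). Illustrative values only.
RATES = {
    "none": (0.01, 0.01),     # alignment errors only, in both samples
    "germline": (0.5, 0.5),   # heterozygous variant present in both samples
    "somatic": (0.3, 0.01),   # variant in tumor only, diluted by normal cells
}

# Hypothetical prior probability of each hypothesis (assumed, not from the paper).
PRIORS = {"none": 0.98, "germline": 0.01, "somatic": 0.01}


def posterior(k_tumor, n_tumor, k_normal, n_normal):
    """Posterior over hypotheses given supporting/total read counts per sample."""
    log_joint = {
        h: log(PRIORS[h])
        + log_binom_pmf(k_tumor, n_tumor, p_tumor)
        + log_binom_pmf(k_normal, n_normal, p_normal)
        for h, (p_tumor, p_normal) in RATES.items()
    }
    # Normalize in log space to avoid underflow.
    top = max(log_joint.values())
    weights = {h: exp(v - top) for h, v in log_joint.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


# Made-up candidates: (supporting tumor reads, tumor depth,
#                      supporting normal reads, normal depth)
candidates = {
    "sv_a": (12, 40, 0, 35),
    "sv_b": (20, 40, 18, 36),
    "sv_c": (0, 40, 0, 35),
}

# Rank candidates by the posterior probability of being somatic.
ranked = sorted(
    candidates,
    key=lambda name: posterior(*candidates[name])["somatic"],
    reverse=True,
)
```

Scoring both samples in a single model, rather than calling each sample separately and subtracting, lets weak normal-sample evidence lower the somatic score gradually instead of removing the call outright, and the posterior gives a natural ranking of candidates.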