Department of Electrical Engineering and Computer Science, Case Western Reserve University, 10900 Euclid Ave, Cleveland, OH, USA.
BMC Bioinformatics. 2013;14 Suppl 5(Suppl 5):S6. doi: 10.1186/1471-2105-14-S5-S6. Epub 2013 Apr 10.
Somatically acquired translocations may serve as important markers for assessing the cause and nature of diseases like cancer. Algorithms to locate translocations may use next-generation sequencing (NGS) platform data. However, paired-end strategies do not predict translocation breakpoints precisely, and "split-read" methods may lose sensitivity if a translocation boundary is not captured by many sequenced reads. To address these challenges, we have developed "Bellerophon", a method that uses discordant read pairs to identify potential translocations, and subsequently uses "soft-clipped" reads to predict the precise location of the breakpoints. Furthermore, our method attempts to classify each chimeric breakpoint as a participant in an unbalanced translocation, a balanced translocation, or an interchromosomal insertion.
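The two-stage idea described above (discordant pairs first, soft-clipped reads second) can be illustrated with a minimal Python sketch. This is not Bellerophon's implementation: the `Read` record, the function names, and the `window` and `min_pairs` thresholds are all hypothetical and chosen only for illustration. The sketch assumes SAM conventions, where the alignment position is the 1-based coordinate of the first aligned base and soft clips appear as `S` operations in the CIGAR string.

```python
import re
from collections import Counter, namedtuple

# Hypothetical minimal read record; real tools would parse SAM/BAM records.
Read = namedtuple("Read", "name chrom pos cigar mate_chrom mate_pos")

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_breakpoints(pos, cigar):
    """Return 1-based reference coordinates adjacent to soft-clipped ends."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    core = [o for o in ops if o[1] != "H"]  # ignore hard clips
    points = []
    if len(core) > 1 and core[0][1] == "S":
        points.append(pos)  # clip lies left of the first aligned base
    # Reference bases consumed by the alignment (M, D, N, =, X).
    ref_len = sum(n for n, op in core if op in "MDN=X")
    if len(core) > 1 and core[-1][1] == "S":
        points.append(pos + ref_len - 1)  # clip lies right of the last aligned base
    return points


def candidate_breakpoints(reads, window=500, min_pairs=2):
    """Count soft-clip positions that fall near supported discordant pairs.

    Stage 1: keep interchromosomal pairs whose (chrom, mate_chrom)
    combination is seen at least `min_pairs` times (assumed threshold).
    Stage 2: count soft-clip breakpoints within `window` bases of such a
    pair on the same chromosome (assumed distance).
    """
    discordant = [r for r in reads if r.chrom != r.mate_chrom]
    support = Counter((r.chrom, r.mate_chrom) for r in discordant)
    anchors = [
        (r.chrom, r.pos)
        for r in discordant
        if support[(r.chrom, r.mate_chrom)] >= min_pairs
    ]
    hits = Counter()
    for r in reads:
        for bp in soft_clip_breakpoints(r.pos, r.cigar):
            if any(c == r.chrom and abs(p - bp) <= window for c, p in anchors):
                hits[(r.chrom, bp)] += 1
    return hits


# Toy data: two chr1/chr5 discordant pairs and two reads soft-clipped
# at the same chr1 coordinate; the chr2 read has no discordant support.
example_reads = [
    Read("a", "chr1", 1000, "50M", "chr5", 2000),
    Read("b", "chr1", 1020, "50M", "chr5", 2050),
    Read("c", "chr1", 1040, "40M10S", "chr1", 1300),
    Read("d", "chr1", 1050, "30M20S", "chr1", 1310),
    Read("e", "chr2", 500, "10S40M", "chr2", 800),
]
example_hits = candidate_breakpoints(example_reads)
```

In this toy input, both soft-clipped chr1 reads end at coordinate 1079, so `example_hits` holds a single candidate, `("chr1", 1079)`, with a count of 2. The sketch stops at locating candidates; it does not attempt the classification into balanced translocations, unbalanced translocations, and interchromosomal insertions.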
We compared Bellerophon to four previously published algorithms for detecting structural variation (SV). Using two simulated datasets and two prostate cancer datasets, Bellerophon had overall better performance than the other methods. Furthermore, our method accurately predicted the presence of the interchromosomal insertions placed in our simulated dataset, which is an ability that the other SV prediction programs lack.
The combined use of paired reads and soft-clipped reads allows Bellerophon to detect interchromosomal breakpoints with high sensitivity, while also mitigating losses in specificity. This trend is seen across all datasets examined. Because it does not perform assembly on soft-clipped subreads, Bellerophon may be limited in experiments where sequence read lengths are short.
The program can be downloaded from http://cbc.case.edu/Bellerophon.