Okonechnikov Konstantin, Imai-Matsushima Aki, Paul Lukas, Seitz Alexander, Meyer Thomas F, Garcia-Alcalde Fernando
Department of Molecular Biology, Max Planck Institute for Infection Biology, Berlin, Germany.
Lexogen GmbH, Campus Vienna Biocenter 5, Vienna, Austria.
PLoS One. 2016 Dec 1;11(12):e0167417. doi: 10.1371/journal.pone.0167417. eCollection 2016.
Analysis of fusion transcripts has become increasingly important due to their link with cancer development. Since high-throughput sequencing approaches survey fusion events exhaustively, several computational methods for the detection of gene fusions from RNA-seq data have been developed. This kind of analysis, however, is complicated by native trans-splicing events, the splicing-induced complexity of the transcriptome, and biases and artefacts introduced in experiments and data analysis. A number of tools are available for the detection of fusions from RNA-seq data; however, differences in specificity and sensitivity between commonly used approaches have been found. The ability to detect gene fusions of different types, including isoform fusions and fusions involving non-coding regions, has not yet been thoroughly studied. Here, we propose a novel computational toolkit called InFusion for fusion gene detection from RNA-seq data. InFusion introduces several unique features, such as discovery of fusions involving intergenic regions and detection of anti-sense transcription in chimeric RNAs based on strand-specificity. Our approach demonstrates superior detection accuracy on simulated data and several public RNA-seq datasets. This improved performance was also evident when evaluating data from RNA deep-sequencing of two well-established prostate cancer cell lines. InFusion identified 26 novel fusion events that were validated in vitro, including alternatively spliced gene fusion isoforms and chimeric transcripts that include intergenic regions. The toolkit is freely available to download from http://bitbucket.org/kokonech/infusion.