Genomatix Software GmbH, Bayerstr. 85a, 80335 München, Germany.
Methods. 2013 Jan;59(1):S24-8. doi: 10.1016/j.ymeth.2012.09.013. Epub 2012 Oct 2.
In recent years, gene fusions have gained significant recognition as biomarkers. They can assist treatment decisions, are seldom found in normal tissue, and are detectable through next-generation sequencing (NGS) of the transcriptome (RNA-seq). To transform the data provided by the sequencer into robust gene fusion detection, several analysis steps are needed. Usually, the first step is to map the sequenced transcript fragments (RNA-seq reads) to a reference genome. One standard application of this approach is to estimate expression and to detect variants within known genes, e.g. SNPs and indels. In the case of gene fusions, however, completely novel gene structures have to be detected. Here, we describe the detection of such gene fusion events based on our comprehensive transcript annotation (ElDorado). To demonstrate the utility of our approach, we extract gene fusion candidates from eight breast cancer cell lines and compare them to experimentally verified gene fusions. We discuss several gene fusion events, such as BCAS3-BCAS4, which was detected only in the breast cancer cell line MCF7. As supporting evidence, we show that gene fusions occur more frequently in copy-number-enriched regions (CNV analysis). In addition, we present the Transcriptome Viewer (TViewer), a tool for interactive visualization of gene fusions. Finally, we support detected gene fusions through literature-mining-based annotations and network analyses. In conclusion, we present a platform for detecting gene fusions and supporting them with literature knowledge and rich visualization capabilities. This enables scientists to better understand molecular processes, biological functions and disease associations, which will ultimately lead to better biomedical knowledge for the development of biomarkers for diagnostics and therapies.
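As an illustration of the general idea of deriving fusion candidates from mapped RNA-seq reads, the following is a minimal sketch and not the method used in the paper. It assumes paired-end reads whose mates have already been mapped, a simple gene annotation given as intervals, and a hypothetical support threshold (`min_support`). The names `Gene`, `gene_at` and `fusion_candidates`, as well as all coordinates, are invented for this example; the coordinates are toy values, not real genome positions.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class Gene(NamedTuple):
    """Hypothetical minimal annotation record (not the ElDorado format)."""
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def gene_at(genes: Iterable[Gene], chrom: str, pos: int) -> Optional[str]:
    """Return the name of the first annotated gene covering (chrom, pos), if any."""
    for g in genes:
        if g.chrom == chrom and g.start <= pos <= g.end:
            return g.name
    return None


def fusion_candidates(genes, read_pairs, min_support=2):
    """Count read pairs whose two mates map to two different annotated genes.

    read_pairs: iterable of ((chrom1, pos1), (chrom2, pos2)) mate alignments.
    Returns {(gene_a, gene_b): count} for gene pairs supported by at least
    min_support read pairs. Gene names are sorted within each key, so A-B
    and B-A evidence is pooled.
    """
    genes = list(genes)
    support = Counter()
    for (c1, p1), (c2, p2) in read_pairs:
        g1, g2 = gene_at(genes, c1, p1), gene_at(genes, c2, p2)
        if g1 and g2 and g1 != g2:
            support[tuple(sorted((g1, g2)))] += 1
    return {pair: n for pair, n in support.items() if n >= min_support}


# Toy annotation and alignments (assumed values for illustration only).
genes = [
    Gene("BCAS3", "chr17", 1000, 2000),
    Gene("BCAS4", "chr20", 500, 900),
    Gene("GENE_X", "chr1", 100, 300),
]
pairs = [
    (("chr17", 1500), ("chr20", 600)),   # BCAS3 / BCAS4
    (("chr20", 700), ("chr17", 1900)),   # BCAS4 / BCAS3
    (("chr17", 1100), ("chr17", 1200)),  # both mates in BCAS3: not a candidate
    (("chr1", 150), ("chr20", 550)),     # GENE_X / BCAS4, only one pair
]

print(fusion_candidates(genes, pairs))
```

Requiring several independent read pairs per gene pair is a common way to suppress mapping artifacts. A real pipeline would additionally use split reads spanning the junction, strand information and filters for paralogous or overlapping genes, none of which are modelled here.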