Faculty of Medicine, Department of Clinical Sciences Lund, Oncology, Lund University Cancer Centre, Lund, Sweden.
BMC Bioinformatics. 2023 Sep 23;24(1):359. doi: 10.1186/s12859-023-05489-5.
In cancer, genomic rearrangements can create fusion genes that either combine protein-coding sequences from two different partner genes or place one gene under the control of the promoter of another gene. These fusion genes can act as oncogenic drivers in tumor development, and several fusions involving kinases have been successfully exploited as drug targets. Expressed fusions can be identified in RNA sequencing (RNA-Seq) data, but fusion prediction software often yields a high fraction of false-positive fusion transcript predictions. This is problematic for both research and clinical applications.
We describe a method for validating fusion transcripts detected by RNA-Seq in matched whole-genome sequencing (WGS) data. Our pipeline uses discordant read pairs to identify supported fusion events and analyzes soft-clipped read alignments to determine genomic breakpoints. We have tested it on matched RNA-Seq and WGS data for both tumors and cancer cell lines and show that it can be used to validate both newly predicted gene fusions and experimentally validated fusion events. It was considerably faster and more sensitive than BreakDancer and Manta, which are designed to detect many different types of structural variants on a genome-wide scale rather than to validate specific fusions.
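The two kinds of evidence described above can be illustrated with a minimal Python sketch. It is not the published pipeline: the `Read` and `Region` records, the function names, the `min_clip` threshold, and the toy coordinates are all hypothetical and stand in for alignments that would normally be read from a WGS BAM file. The sketch assumes that a discordant pair is one whose two reads align to the two different partner gene regions, and that a genomic breakpoint candidate is the reference position where the aligned part of a read meets a soft-clipped part of its CIGAR string.

```python
import re
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    """Simplified stand-in for one aligned WGS read (hypothetical, not a BAM API)."""
    chrom: str
    pos: int         # 1-based leftmost aligned reference position
    cigar: str       # SAM CIGAR string, e.g. "60M40S"
    mate_chrom: str
    mate_pos: int


class Region(NamedTuple):
    """Genomic interval of one fusion partner gene (inclusive coordinates)."""
    chrom: str
    start: int
    end: int

    def contains(self, chrom: str, pos: int) -> bool:
        return chrom == self.chrom and self.start <= pos <= self.end


CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def count_discordant_pairs(reads, gene_a, gene_b):
    """Count reads aligned in gene_a whose mate aligns in gene_b.

    Only the gene_a side is inspected, so each pair is counted at most once.
    """
    return sum(
        1
        for r in reads
        if gene_a.contains(r.chrom, r.pos)
        and gene_b.contains(r.mate_chrom, r.mate_pos)
    )


def softclip_breakpoints(read, min_clip=10):
    """Return reference positions where a soft clip of >= min_clip bases abuts aligned sequence."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(read.cigar)]
    ops = [o for o in ops if o[1] != "H"]  # hard clips carry no sequence; skip them
    positions = []
    if ops and ops[0][1] == "S" and ops[0][0] >= min_clip:
        positions.append(read.pos)  # left clip: first aligned base
    ref_len = sum(n for n, op in ops if op in REF_CONSUMING)
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_clip:
        positions.append(read.pos + ref_len - 1)  # right clip: last aligned base
    return positions


def candidate_breakpoint(reads, region, min_clip=10):
    """Return (position, supporting_reads) for the most frequent clip position in region, or None."""
    counts = Counter(
        p
        for r in reads
        if region.contains(r.chrom, r.pos)
        for p in softclip_breakpoints(r, min_clip)
    )
    return counts.most_common(1)[0] if counts else None


# Toy data with made-up coordinates for two hypothetical partner genes.
gene_a = Region("chr1", 1000, 2000)
gene_b = Region("chr5", 5000, 6000)
reads = [
    Read("chr1", 1400, "100M", "chr5", 5200),    # discordant pair
    Read("chr1", 1450, "100M", "chr5", 5300),    # discordant pair
    Read("chr1", 1441, "60M40S", "chr5", 5100),  # discordant and soft-clipped
    Read("chr1", 1471, "30M70S", "chr1", 1300),  # soft-clipped at the same position
    Read("chr1", 1200, "100M", "chr1", 1500),    # ordinary concordant pair
    Read("chr5", 5250, "45S55M", "chr1", 1420),  # clipped read on the gene_b side
]

print(count_discordant_pairs(reads, gene_a, gene_b))
print(candidate_breakpoint(reads, gene_a))
print(candidate_breakpoint(reads, gene_b))
```

Restricting the search to the two partner gene regions named by the RNA-Seq prediction, rather than scanning the whole genome, is one plausible reason a targeted validation step can be faster than a general structural variant caller. A real implementation would also need to consider mapping quality, read orientation, and duplicate reads, which this sketch ignores.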
We have developed a fast and very sensitive pipeline for validation of gene fusions detected by RNA-Seq in matched WGS data. It can be used to identify high-quality gene fusions for further bioinformatic and experimental studies, including validation of genomic breakpoints and studies of the mechanisms that generate fusions. In a clinical setting, it could help find expressed gene fusions for personalized therapy.