Masuda Keigo, Sota Yoshiaki, Matsuda Hideo
Graduate School of Information Science and Technology, Osaka University, 565-0871 Suita, Osaka, Japan.
Graduate School of Medicine, Osaka University, 565-0871 Suita, Osaka, Japan.
Front Biosci (Landmark Ed). 2024 Dec 11;29(12):413. doi: 10.31083/j.fbl2912413.
Fusion genes are important biomarkers in cancer research because their expression can produce abnormal proteins with oncogenic properties. Long-read RNA sequencing (long-read RNA-seq), which can sequence full-length mRNA transcripts, facilitates the detection of such fusion genes. Several tools have been proposed for detecting fusion genes in long-read RNA-seq datasets derived from cancer cells. However, the high sequencing error rate in long-read RNA-seq makes fusion gene detection challenging.
To address this issue, we incorporated additional steps into our fusion detection tool to improve detection accuracy: anchoring breakpoints to exon boundaries, realigning unaligned regions, and clustering breakpoints. We evaluated the tool by comparing its detection accuracy with that of two representative existing tools, JAFFAL and FusionSeeker.
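As a rough illustration of two of these steps, the sketch below snaps a candidate breakpoint to the nearest annotated exon boundary and then groups nearby breakpoints into clusters. It is a minimal sketch under stated assumptions, not the tool's actual implementation: the function names, the `max_shift` and `max_gap` thresholds, and the single-chromosome integer coordinates are all hypothetical, and the realignment step is omitted because it depends on an external aligner.

```python
from bisect import bisect_left


def anchor_to_exon_boundary(pos, boundaries, max_shift=10):
    """Snap `pos` to the nearest exon boundary if it lies within `max_shift` bp.

    `boundaries` must be a sorted list of exon start/end coordinates.
    `max_shift` is an assumed tolerance, not a value from the paper.
    """
    if not boundaries:
        return pos
    i = bisect_left(boundaries, pos)
    # Only the boundaries immediately left and right of `pos` can be nearest.
    candidates = boundaries[max(i - 1, 0):i + 1]
    nearest = min(candidates, key=lambda b: abs(b - pos))
    return nearest if abs(nearest - pos) <= max_shift else pos


def cluster_breakpoints(positions, max_gap=20):
    """Group breakpoints so that consecutive sorted positions within
    `max_gap` bp of each other share a cluster (1-D single linkage).

    `max_gap` is an assumed threshold, not a value from the paper.
    """
    clusters = []
    for p in sorted(positions):
        if clusters and p - clusters[-1][-1] <= max_gap:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return clusters


if __name__ == "__main__":
    exon_boundaries = [100, 250, 400]          # hypothetical annotation
    raw_breakpoints = [103, 98, 247, 252, 300]  # hypothetical noisy calls
    anchored = [anchor_to_exon_boundary(p, exon_boundaries) for p in raw_breakpoints]
    print(anchored)
    print(cluster_breakpoints(anchored))
```

The intuition is that sequencing errors scatter the apparent breakpoint by a few bases around the true junction, so snapping to a known exon boundary and merging nearby calls lets reads that support the same fusion event be counted together.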
On long-read RNA-seq datasets, our tool detected fusion genes more accurately than both existing tools. We also identified potentially novel fusion genes that were consistently detected across multiple tools or datasets.
Applying our tool to long-read RNA-seq datasets from two different cancer cell lines demonstrated its effectiveness in detecting fusion genes.