ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data.

Authors

Rodríguez-Martín Bernardo, Palumbo Emilio, Marco-Sola Santiago, Griebel Thasso, Ribeca Paolo, Alonso Graciela, Rastrojo Alberto, Aguado Begoña, Guigó Roderic, Djebali Sarah

Affiliations

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Dr. Aiguader 88, Barcelona, 08003, Spain.

Universitat Pompeu Fabra (UPF), Barcelona, Spain.

Publication information

BMC Genomics. 2017 Jan 3;18(1):7. doi: 10.1186/s12864-016-3404-9.

DOI: 10.1186/s12864-016-3404-9
PMID: 28049418
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5209911/

Abstract

BACKGROUND

Chimeric transcripts are commonly defined as transcripts linking two or more different genes in the genome. They can be explained by various biological mechanisms such as genomic rearrangement, read-through or trans-splicing, but also by technical or biological artefacts. Several studies have shown their importance in cancer, cell pluripotency and motility. Many programs have recently been developed to identify chimeras from Illumina RNA-seq data (mostly fusion genes in cancer). However, outputs of different programs on the same dataset can be widely inconsistent and tend to include many false positives. Other issues relate to simulated datasets restricted to fusion genes, real datasets with limited numbers of validated cases, result inconsistencies between simulated and real datasets, and assessment at the gene level rather than the junction level.

RESULTS

Here we present ChimPipe, a modular and easy-to-use method to reliably identify fusion genes and transcription-induced chimeras from paired-end Illumina RNA-seq data. We have also produced realistic simulated datasets for three different read lengths, and enhanced two gold-standard cancer datasets by associating exact junction points to validated gene fusions. Benchmarking ChimPipe together with four other state-of-the-art tools on these data showed ChimPipe to be the top program at identifying exact junction coordinates for both kinds of datasets, and the one showing the best trade-off between sensitivity and precision. Applied to 106 ENCODE human RNA-seq datasets, ChimPipe identified 137 high-confidence chimeras connecting the protein coding sequence of their parent genes. In subsequent experiments, three out of four predicted chimeras could be validated, two of which were recurrently expressed in a large majority of the samples. Cloning and sequencing of the three cases revealed several new chimeric transcript structures, three of which have the potential to encode a chimeric protein, for which we hypothesized a new role. Applying ChimPipe to human and mouse ENCODE RNA-seq data led to the identification of 131 recurrent chimeras common to both species, and therefore potentially conserved.
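The distinction between gene-level and junction-level assessment raised in the abstract can be made concrete with a small sketch. The code below is not the benchmarking code from the paper; the `Junction` type, the placeholder gene names and the coordinates are all hypothetical, and a prediction is assumed to count as correct at the junction level only if both breakpoints match exactly.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Junction(NamedTuple):
    """Hypothetical record: two parent genes plus exact breakpoint positions."""
    gene_a: str
    pos_a: int
    gene_b: str
    pos_b: int


def sensitivity_precision(predicted: Set, truth: Set) -> Tuple[float, float]:
    """Return (sensitivity, precision) for two sets of hashable calls."""
    true_positives = len(predicted & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


def gene_pairs(junctions: Iterable[Junction]) -> Set[Tuple[str, str]]:
    """Collapse junctions to gene pairs, discarding the breakpoint positions."""
    return {(j.gene_a, j.gene_b) for j in junctions}


# Toy data with placeholder names (not taken from any real dataset).
truth = {
    Junction("GENE1", 1000, "GENE2", 5000),
    Junction("GENE3", 200, "GENE4", 900),
}
predicted = {
    Junction("GENE1", 1000, "GENE2", 5000),  # exact match
    Junction("GENE3", 205, "GENE4", 900),    # right genes, breakpoint off by 5 bases
    Junction("GENE5", 10, "GENE6", 20),      # false positive
}

junction_level = sensitivity_precision(predicted, truth)
gene_level = sensitivity_precision(gene_pairs(predicted), gene_pairs(truth))
print("junction level:", junction_level)
print("gene level:", gene_level)
```

In this toy case the second prediction counts as a hit at the gene level but as a miss at the junction level, which is why gene-level scores can look better than the exact-coordinate performance of a tool.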

CONCLUSIONS

ChimPipe combines discordant paired-end reads and split-reads to detect any kind of chimera, including those originating from polymerase read-through, and shows an excellent trade-off between sensitivity and precision. The chimeras found by ChimPipe can be validated in vitro with high accuracy.
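As a rough illustration of the two evidence types named in the conclusions, the sketch below counts split-read and discordant-pair support per gene pair and keeps candidates backed by both. This is a toy model and not ChimPipe's algorithm: the `ReadPair` record, the gene labels, the thresholds, and the rule that both evidence types are required are assumptions made for illustration only.

```python
from collections import Counter
from typing import List, NamedTuple, Optional, Tuple


class ReadPair(NamedTuple):
    """Hypothetical, heavily simplified alignment summary for one read pair."""
    mate1_gene: str
    mate2_gene: str
    # Set when one mate aligns in two blocks; holds the gene of each block.
    split_genes: Optional[Tuple[str, str]] = None


def collect_support(pairs: List[ReadPair]) -> Tuple[Counter, Counter]:
    """Count split-read and discordant-pair support per ordered gene pair."""
    split: Counter = Counter()
    discordant: Counter = Counter()
    for pair in pairs:
        if pair.split_genes is not None and pair.split_genes[0] != pair.split_genes[1]:
            # A single read spans the junction between two genes.
            split[pair.split_genes] += 1
        elif pair.mate1_gene != pair.mate2_gene:
            # The two mates map to different genes; the junction lies between them.
            discordant[(pair.mate1_gene, pair.mate2_gene)] += 1
    return split, discordant


def candidate_chimeras(pairs: List[ReadPair],
                       min_split: int = 1,
                       min_discordant: int = 1) -> List[Tuple[str, str]]:
    """Keep gene pairs supported by both evidence types (assumed thresholds)."""
    split, discordant = collect_support(pairs)
    return sorted(
        genes for genes in split
        if split[genes] >= min_split and discordant[genes] >= min_discordant
    )


# Toy input with placeholder gene labels.
pairs = [
    ReadPair("A", "A", split_genes=("A", "B")),  # split read across A and B
    ReadPair("A", "B"),                          # discordant pair A/B
    ReadPair("A", "A"),                          # ordinary concordant pair
    ReadPair("C", "D"),                          # discordant pair with no split read
]
candidates = candidate_chimeras(pairs)
print(candidates)
```

Here the C/D pair is dropped because no split read pins down its junction, while A/B is retained because a split read and a discordant pair agree.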


Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/73e82e8c20b2/12864_2016_3404_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/2c07f1e07434/12864_2016_3404_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/7267b5e235e8/12864_2016_3404_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/33eb7178b96f/12864_2016_3404_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/78ce65c1600b/12864_2016_3404_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9fe3/5209911/be548e195907/12864_2016_3404_Fig6_HTML.jpg

Similar articles

1
ChimPipe: accurate detection of fusion genes and transcription-induced chimeras from RNA-seq data.
BMC Genomics. 2017 Jan 3;18(1):7. doi: 10.1186/s12864-016-3404-9.
2
InFusion: Advancing Discovery of Fusion Genes and Chimeric Transcripts from Deep RNA-Sequencing Data.
PLoS One. 2016 Dec 1;11(12):e0167417. doi: 10.1371/journal.pone.0167417. eCollection 2016.
3
Read-Split-Run: an improved bioinformatics pipeline for identification of genome-wide non-canonical spliced regions using RNA-Seq data.
BMC Genomics. 2016 Aug 22;17 Suppl 7(Suppl 7):503. doi: 10.1186/s12864-016-2896-7.
4
FusionMap: detecting fusion genes from next-generation sequencing data at base-pair resolution.
Bioinformatics. 2011 Jul 15;27(14):1922-8. doi: 10.1093/bioinformatics/btr310. Epub 2011 May 18.
5
RNA-Seq Analysis to Detect Abnormal Fusion Transcripts Linked to Chromothripsis.
Methods Mol Biol. 2018;1769:133-156. doi: 10.1007/978-1-4939-7780-2_9.
6
State of art fusion-finder algorithms are suitable to detect transcription-induced chimeras in normal tissues?
BMC Bioinformatics. 2013;14 Suppl 7(Suppl 7):S2. doi: 10.1186/1471-2105-14-S7-S2. Epub 2013 Apr 22.
7
Comprehensive evaluation of fusion transcript detection algorithms and a meta-caller to combine top performing methods in paired-end RNA-seq data.
Nucleic Acids Res. 2016 Mar 18;44(5):e47. doi: 10.1093/nar/gkv1234. Epub 2015 Nov 17.
8
FuGePrior: A novel gene fusion prioritization algorithm based on accurate fusion structure analysis in cancer RNA-seq samples.
BMC Bioinformatics. 2017 Jan 23;18(1):58. doi: 10.1186/s12859-016-1450-6.
9
ChimeRScope: a novel alignment-free algorithm for fusion transcript prediction using paired-end RNA-Seq data.
Nucleic Acids Res. 2017 Jul 27;45(13):e120. doi: 10.1093/nar/gkx315.
10
Bellerophontes: an RNA-Seq data analysis framework for chimeric transcripts discovery based on accurate fusion model.
Bioinformatics. 2012 Aug 15;28(16):2114-21. doi: 10.1093/bioinformatics/bts334. Epub 2012 Jun 17.

Cited by

1
Accurate fusion transcript identification from long- and short-read isoform sequencing at bulk or single-cell resolution.
Genome Res. 2025 Apr 14;35(4):967-986. doi: 10.1101/gr.279200.124.
2
Architects and Partners: The Dual Roles of Non-coding RNAs in Gene Fusion Events.
Methods Mol Biol. 2025;2883:231-255. doi: 10.1007/978-1-0716-4290-0_10.
3
RTCpredictor: identification of read-through chimeric RNAs from RNA sequencing data.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae251.
4
CTAT-LR-fusion: accurate fusion transcript identification from long and short read isoform sequencing at bulk or single cell resolution.
bioRxiv. 2024 Feb 28:2024.02.24.581862. doi: 10.1101/2024.02.24.581862.
5
WFA-GPU: gap-affine pairwise read-alignment using GPUs.
Bioinformatics. 2023 Dec 1;39(12). doi: 10.1093/bioinformatics/btad701.
6
Integration of Genomic Sequencing Drives Therapeutic Targeting of PDGFRA in T-Cell Acute Lymphoblastic Leukemia/Lymphoblastic Lymphoma.
Clin Cancer Res. 2023 Nov 14;29(22):4613-4626. doi: 10.1158/1078-0432.CCR-22-2562.
7
Molecular profiling identifies targeted therapy opportunities in pediatric solid cancer.
Nat Med. 2022 Aug;28(8):1581-1589. doi: 10.1038/s41591-022-01856-6. Epub 2022 Jun 23.
8
From karyotypes to precision genomics in 9p deletion and duplication syndromes.
HGG Adv. 2021 Dec 24;3(1):100081. doi: 10.1016/j.xhgg.2021.100081. eCollection 2022 Jan 13.
9
The Fusion of and Arises from a -Splicing Event in Normal and Transformed Human Cells.
Int J Mol Sci. 2021 Nov 10;22(22):12178. doi: 10.3390/ijms222212178.
10
Comparative Analysis of PacBio and Oxford Nanopore Sequencing Technologies for Transcriptomic Landscape Identification of .
Life (Basel). 2021 Aug 23;11(8):862. doi: 10.3390/life11080862.

References

1
Recurrent chimeric fusion RNAs in non-cancer tissues and cells.
Nucleic Acids Res. 2016 Apr 7;44(6):2859-72. doi: 10.1093/nar/gkw032. Epub 2016 Feb 2.
2
Pervasive transcription read-through promotes aberrant expression of oncogenes and RNA chimeras in renal carcinoma.
Elife. 2015 Nov 17;4:e09214. doi: 10.7554/eLife.09214.
3
The Phyre2 web portal for protein modeling, prediction and analysis.
Nat Protoc. 2015 Jun;10(6):845-58. doi: 10.1038/nprot.2015.053. Epub 2015 May 7.
4
Functional characterization of BC039389-GATM and KLK4-KRSP1 chimeric read-through transcripts which are up-regulated in renal cell cancer.
BMC Genomics. 2015 Mar 27;16(1):247. doi: 10.1186/s12864-015-1446-z.
5
Identification of novel fusion genes in lung cancer using breakpoint assembly of transcriptome sequencing data.
Genome Biol. 2015 Jan 5;16(1):7. doi: 10.1186/s13059-014-0558-0.
6
Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression.
Nat Commun. 2015 Jan 13;6:5903. doi: 10.1038/ncomms6903.
7
A comparative encyclopedia of DNA elements in the mouse genome.
Nature. 2014 Nov 20;515(7527):355-64. doi: 10.1038/nature13992.
8
Tandem RNA chimeras contribute to transcriptome diversity in human population and are associated with intronic genetic variants.
PLoS One. 2014 Aug 18;9(8):e104567. doi: 10.1371/journal.pone.0104567. eCollection 2014.
9
PRADA: pipeline for RNA sequencing data analysis.
Bioinformatics. 2014 Aug 1;30(15):2224-6. doi: 10.1093/bioinformatics/btu169. Epub 2014 Apr 1.
10
Transcriptome characterization by RNA sequencing identifies a major molecular and clinical subdivision in chronic lymphocytic leukemia.
Genome Res. 2014 Feb;24(2):212-26. doi: 10.1101/gr.152132.112. Epub 2013 Nov 21.