Lee Boram, Park Junseok, Voshall Adam, Maury Eduardo, Kang Yeeok, Kim Yoen Jeong, Lee Jin-Young, Shim Hye-Ran, Kim Hyo-Ju, Lee Jung-Woo, Jung Min-Hyeok, Kim Si-Cho, Chu Hoang Bao Khanh, Kim Da-Won, Kim Minjeong, Choi Eun-Ji, Hwang Ok Kyung, Lee Ho Won, Ha Kyungsoo, Choi Jung Kyoon, Kim Yongjoon, Choi Yoonjoo, Park Woong-Yang, Lee Eunjung Alice
Samsung Genome Institute, Samsung Medical Center, Seoul, Republic of Korea.
Department of Pathology and Translational Genomics, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Republic of Korea.
bioRxiv. 2023 Oct 19:2023.10.16.562422. doi: 10.1101/2023.10.16.562422.
Transposon-derived transcripts are abundant in RNA sequences, yet their landscape and function, especially for fusion transcripts derived from unannotated or somatically acquired transposons, remain underexplored. Here, we developed a new bioinformatic tool to detect transposon-fusion transcripts in RNA-sequencing data and performed a pan-cancer analysis of 10,257 cancer samples across 34 cancer types as well as 3,088 normal tissue samples. We identified 52,277 cancer-specific fusions, with ~30 events per cancer, and hotspot loci within transposons that are vulnerable to fusion formation. Exonization of intronic transposons was the most prevalent type of genic fusion, while somatic L1 insertions constituted a small fraction of cancer-specific fusions. Source L1s and HERVs, but not Alus, showed decreased DNA methylation in cancer upon fusion formation. Overall, cancer-specific L1 fusions were enriched in tumor suppressors, while Alu fusions were enriched in oncogenes, including recurrent Alu fusions predictive of patient survival. We also demonstrated that transposon-derived peptides triggered CD8+ T-cell activation to an extent comparable to that triggered by EBV. Our findings reveal distinct epigenetic and tumorigenic mechanisms underlying transposon fusions across different families and highlight transposons as novel therapeutic targets and a source of potent neoantigens.