Using long-read CAGE sequencing to profile cryptic-promoter-derived transcripts and their contribution to the immunopeptidome.

Affiliations

Department of Genetics, Washington University School of Medicine, St. Louis, Missouri 63110, USA.

Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, Missouri 63110, USA.

Publication information

Genome Res. 2023 Dec 27;33(12):2143-2155. doi: 10.1101/gr.277061.122.

Abstract

Recent studies have shown that the noncoding genome can produce unannotated proteins as antigens that induce an immune response. One major source of this activity is the aberrant epigenetic reactivation of transposable elements (TEs). In tumors, TEs often provide cryptic or alternate promoters, which can generate transcripts that encode tumor-specific unannotated proteins. Thus, TE-derived transcripts (TE transcripts) have the potential to produce tumor-specific, but recurrent, antigens shared among many tumors. Identification of TE-derived tumor antigens holds the promise to improve cancer immunotherapy approaches; however, current genomics and computational tools are not optimized for their detection. Here we combined CAGE technology with full-length long-read transcriptome sequencing (long-read CAGE, or LRCAGE) and developed a suite of computational tools to significantly improve immunopeptidome detection by incorporating TE and other tumor transcripts into the proteome database. By applying our methods to human lung cancer cell line H1299 data, we show that long-read technology significantly improves mapping of promoters with low mappability scores and that LRCAGE guarantees accurate construction of uncharacterized 5' transcript structure. Augmenting a reference proteome database with newly characterized transcripts enabled us to detect noncanonical antigens from HLA-pulldown LC-MS/MS data. Lastly, we show that epigenetic treatment increased the number of noncanonical antigens, particularly those encoded by TE transcripts, which might expand the pool of targetable antigens for cancers with low mutational burden.
