Determining mutational burden and signature using RNA-seq from tumor-only samples.

Affiliation

Mayo Clinic, 200 First Street SW, Rochester, MN, 55905, USA.

Publication information

BMC Med Genomics. 2021 Mar 1;14(1):65. doi: 10.1186/s12920-021-00898-y.

Abstract

BACKGROUND

Traditionally, mutational burden and mutational signatures have been assessed by DNA sequencing of tumor-normal pairs. Obtaining both normal and tumor samples is not always feasible in clinical practice, which led us to investigate the efficacy of using RNA sequencing of only the tumor sample to determine the mutational burden and signatures, and subsequently the molecular cause of the cancer. The potential advantages include reducing the cost of testing and simultaneously providing information on the gene expression profile and gene fusions present in the tumor.

RESULTS

In this study, we devised supervised and unsupervised learning methods to determine mutational signatures from tumor RNA-seq data. We applied the methods to a training set of 587 TCGA uterine cancer (UCEC) RNA-seq samples and evaluated them on an independent testing set of 521 TCGA colorectal cancer (COAD) RNA-seq samples. Both diseases are known to be associated with high microsatellite instability (MSI-H) and driver defects in DNA polymerase ɛ (POLɛ). Among the variants called from RNA-seq, we found that the majority (> 95%) are likely germline variants, and that these C > T enriched germline variants (dbSNP) are widely applicable across tumor and normal RNA-seq samples. We found significant associations between RNA-derived mutational burdens and MSI/POLɛ status, and no significant relationship between total RNA-seq coverage and the derived mutational burdens. Additionally, we found that over 80% of variants could be explained by COSMIC mutational signatures 5, 6 and 10, which are implicated in natural aging, MSI-H, and POLɛ, respectively. For classifying tumor type within UCEC, we achieved a recall of 0.56 and 0.78 and a specificity of 0.66 and 0.99 for MSI and POLɛ, respectively. By applying the RNA signatures learnt from UCEC to COAD, we were able to improve our classification of both MSI and POLɛ.
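To make the idea of variants being "explained" by a set of reference signatures more concrete, the following is a minimal sketch of one way such a fit could be computed. It is not the authors' pipeline. The six-class toy profiles in `TOY_SIGNATURES`, the function names `reconstruct`, `fit_exposures` and `explained_fraction`, the multiplicative-update solver, and the residual-based definition of "explained" are all assumptions made for illustration. Real COSMIC signatures are defined over 96 trinucleotide contexts and would be loaded from the COSMIC catalogue rather than typed in.

```python
# Illustrative sketch only. The profiles below are invented toy values over six
# substitution classes; they are NOT the COSMIC signature-5, -6 or -10 profiles.
SUBSTITUTION_CLASSES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]

TOY_SIGNATURES = {
    "sig_a": [0.10, 0.10, 0.30, 0.10, 0.30, 0.10],
    "sig_b": [0.05, 0.05, 0.70, 0.05, 0.10, 0.05],
    "sig_c": [0.60, 0.05, 0.20, 0.05, 0.05, 0.05],
}


def reconstruct(exposures, signatures):
    """Return the per-class mutation counts implied by the given exposures."""
    n_classes = len(next(iter(signatures.values())))
    return [
        sum(exposures[name] * profile[i] for name, profile in signatures.items())
        for i in range(n_classes)
    ]


def fit_exposures(counts, signatures, n_iter=2000):
    """Estimate non-negative exposures so that the weighted sum of signature
    profiles approximates the observed per-class counts.

    Uses multiplicative updates for the squared-error objective with the
    signatures held fixed (an assumed solver, chosen for simplicity).
    """
    total = float(sum(counts))
    exposures = {name: total / len(signatures) for name in signatures}
    for _ in range(n_iter):
        recon = reconstruct(exposures, signatures)
        updated = {}
        for name, profile in signatures.items():
            num = sum(p * c for p, c in zip(profile, counts))
            den = sum(p * r for p, r in zip(profile, recon))
            updated[name] = exposures[name] * num / den if den > 0 else 0.0
        exposures = updated
    return exposures


def explained_fraction(counts, signatures, exposures):
    """One possible definition of 'explained': 1 minus the absolute residual
    between observed and reconstructed counts, relative to the total count."""
    total = float(sum(counts))
    if total == 0:
        return 0.0
    recon = reconstruct(exposures, signatures)
    residual = sum(abs(c - r) for c, r in zip(counts, recon))
    return max(0.0, 1.0 - residual / total)
```

As a sanity check under these assumptions, a sample constructed as 100, 50 and 50 mutations from the three toy signatures should be recovered with exposures close to those values and an explained fraction close to 1.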

CONCLUSIONS

Taken together, our work provides a novel method to detect RNA-seq derived mutational signatures, with effective procedures to remove likely germline variants. It can lead to accurate classification of the underlying driving mechanisms of DNA damage deficiency.
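For readers less familiar with the classification metrics quoted in the results, the sketch below shows how recall and specificity are computed from binary labels. The function name `recall_and_specificity` and the example labels are hypothetical and are not taken from the study's data.

```python
def recall_and_specificity(y_true, y_pred):
    """Compute recall (true positive rate) and specificity (true negative rate).

    y_true and y_pred are equal-length sequences of 0/1 labels, where 1 marks
    the positive class (for example, an MSI-H tumor).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    # Guard against classes that are absent from y_true.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return recall, specificity
```

A high specificity paired with a moderate recall, as in the figures reported for POLɛ, means the classifier raises few false alarms but still misses some true positives.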

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07b3/7923324/229397309481/12920_2021_898_Fig1_HTML.jpg
