Deng Wenjiang, Mou Tian, Pawitan Yudi, Vu Trung Nghia
Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, Stockholm, Sweden.
School of Biomedical Engineering, Shenzhen University, Shenzhen, China.
NAR Genom Bioinform. 2022 Jul 13;4(3):lqac052. doi: 10.1093/nargab/lqac052. eCollection 2022 Sep.
Although the role of DNA mutations in cancer is well recognized, current quantification of RNA expression, performed at either gene or isoform level, typically ignores mutation status. Standard methods for estimating allele-specific expression (ASE) consider gene-level expression, but the functional impact of a mutation is best assessed at isoform level. Hence our goal is to quantify mutant-allele expression at isoform level. We have developed and implemented a method, named MAX, for quantifying mutant-allele expression given a list of mutations. For a gene of interest, a mutant reference is constructed by incorporating all possible mutant versions of the wild-type isoforms in the transcriptome annotation. The mutant reference is then used for mapping the RNA-seq reads, a step that in principle works similarly for any quantification tool. We apply an alternating EM algorithm to the read-count data from the mapping step. In a simulation study, MAX performs well against standard isoform-quantification methods. MAX also achieves higher accuracy than conventional gene-based ASE methods such as ASEP. An analysis of a real acute myeloid leukemia dataset reveals a subgroup of NPM1-mutated patients who respond well to a kinase inhibitor. Our findings indicate that quantification of mutant-allele expression at isoform level is feasible and has potential added value for assessing the functional impact of DNA mutations in cancers.
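The construction of the mutant reference described in the abstract can be illustrated with a minimal sketch. It is not the MAX implementation: it assumes that mutations are single-base substitutions already expressed in 0-based transcript coordinates, and the names `apply_substitutions` and `build_mutant_reference`, the label format, and the toy sequences are all hypothetical. Under those assumptions, every non-empty combination of the listed mutations yields one mutant version of each wild-type isoform.

```python
from itertools import combinations


def apply_substitutions(seq, muts):
    """Apply single-base substitutions to a transcript sequence.

    muts is an iterable of (pos, ref, alt) with 0-based transcript
    positions (an assumption of this sketch, not of MAX itself).
    """
    bases = list(seq)
    for pos, ref, alt in muts:
        if bases[pos] != ref:
            raise ValueError(f"expected {ref} at position {pos}, found {bases[pos]}")
        bases[pos] = alt
    return "".join(bases)


def build_mutant_reference(isoforms, mutations):
    """Return wild-type isoforms plus all possible mutant versions.

    isoforms:  dict mapping isoform name -> wild-type sequence
    mutations: dict mapping isoform name -> list of (pos, ref, alt)
    """
    reference = {}
    for name, seq in isoforms.items():
        reference[name] = seq  # keep the wild-type isoform
        muts = mutations.get(name, [])
        # Enumerate every non-empty subset of the mutations on this isoform.
        for k in range(1, len(muts) + 1):
            for subset in combinations(muts, k):
                label = name + "_mut_" + "_".join(f"{p}{r}>{a}" for p, r, a in subset)
                reference[label] = apply_substitutions(seq, subset)
    return reference


# Toy example with hypothetical isoforms and mutations.
toy_isoforms = {"iso1": "ACGTACGT", "iso2": "ACGTAC"}
toy_mutations = {
    "iso1": [(1, "C", "T"), (6, "G", "A")],
    "iso2": [(1, "C", "T")],
}
toy_reference = build_mutant_reference(toy_isoforms, toy_mutations)
for label, sequence in toy_reference.items():
    print(label, sequence)
```

An isoform carrying n listed mutations contributes 2^n entries to the reference (one wild type and 2^n - 1 mutants), so this exhaustive enumeration is practical only when the number of mutations per gene is small.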