Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA.
McDonnell Genome Institute, Washington University School of Medicine, St. Louis, MO, USA.
Nat Commun. 2023 Mar 22;14(1):1589. doi: 10.1038/s41467-023-37266-6.
Somatic mutations within non-coding regions and even exons may have unidentified regulatory consequences that are often overlooked in analysis workflows. Here we present RegTools (www.regtools.org), a computationally efficient, free, and open-source software package designed to integrate somatic variants from genomic data with splice junctions from bulk or single-cell transcriptomic data to identify variants that may cause aberrant splicing. We apply RegTools to over 9000 tumor samples with both tumor DNA and RNA sequence data. RegTools discovers 235,778 events where a splice-associated variant significantly increases the splicing of a particular junction, across 158,200 unique variants and 131,212 unique junctions. To characterize these somatic variants and their associated splice isoforms, we annotate them with the Variant Effect Predictor, SpliceAI, and Genotype-Tissue Expression junction counts and compare our results to other tools that integrate genomic and transcriptomic data. While many events are corroborated by these tools, the flexibility of RegTools also allows us to identify splice-associated variants in known cancer drivers, such as TP53, CDKN2A, and B2M, as well as in other genes.
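The core idea of pairing variants with nearby splice junctions can be illustrated with a minimal Python sketch. This is not the RegTools implementation or its interface: the `Variant` and `Junction` types, the `pair_variants_with_junctions` function, the coordinate convention, and the symmetric `window` of 5 bases are all hypothetical, chosen only to show what a variant "associated with" a junction might look like in code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Variant:
    chrom: str
    pos: int  # 1-based genomic position (assumed convention)


@dataclass(frozen=True)
class Junction:
    chrom: str
    start: int  # coordinate of the junction's first boundary (assumed)
    end: int    # coordinate of the junction's second boundary (assumed)


def pair_variants_with_junctions(
    variants: List[Variant],
    junctions: List[Junction],
    window: int = 5,  # hypothetical distance threshold, not a RegTools default
) -> List[Tuple[Variant, Junction]]:
    """Return (variant, junction) pairs where the variant lies on the same
    chromosome and within `window` bases of either junction boundary."""
    pairs = []
    for v in variants:
        for j in junctions:
            if v.chrom != j.chrom:
                continue
            near_start = abs(v.pos - j.start) <= window
            near_end = abs(v.pos - j.end) <= window
            if near_start or near_end:
                pairs.append((v, j))
    return pairs


# Toy data: only the first variant sits near a junction on its chromosome.
variants = [Variant("chr17", 1000), Variant("chr17", 5000)]
junctions = [Junction("chr17", 998, 1500), Junction("chr9", 998, 1500)]
pairs = pair_variants_with_junctions(variants, junctions)
```

A real analysis would additionally compare junction read support between samples with and without each variant to decide whether the association is significant; that statistical step is omitted here.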