The Milner Centre for Evolution, Department of Biology and Biochemistry, University of Bath, Bath, BA2 7AY, UK.
Institute for Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, EH4 2XU, UK.
Genome Biol. 2017 Nov 7;18(1):213. doi: 10.1186/s13059-017-1337-5.
An important goal of cancer genomics is to systematically identify cancer-causing mutations. A common approach is to identify sites with high ratios of non-synonymous to synonymous mutations, which assumes that synonymous mutations are selectively neutral; however, if synonymous mutations are under purifying selection, this methodology leads to the identification of false-positive mutations. Here, using synonymous somatic mutations (SSMs) identified in over 4000 tumours across 15 different cancer types, we sought to test this assumption by focusing on coding regions required for splicing.
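The false-positive concern can be illustrated with a minimal sketch of a per-site non-synonymous to synonymous ratio. The function name and all counts below are hypothetical and chosen only for illustration; they are not data from the study, and real methods correct for mutational context in ways this sketch does not.

```python
def nonsyn_syn_ratio(n_nonsyn, n_syn, nonsyn_sites, syn_sites):
    """Per-site non-synonymous rate divided by per-site synonymous rate.

    All arguments are hypothetical inputs: mutation counts and the number
    of sites at which each class of mutation could occur.
    """
    if n_syn == 0 or syn_sites == 0 or nonsyn_sites == 0:
        raise ValueError("ratio is undefined for zero counts or zero sites")
    return (n_nonsyn / nonsyn_sites) / (n_syn / syn_sites)


# Hypothetical gene with no selection on either class of mutation.
neutral = nonsyn_syn_ratio(n_nonsyn=30, n_syn=10, nonsyn_sites=300, syn_sites=100)

# Same gene, but with some synonymous mutations removed by purifying
# selection: the denominator shrinks, so the ratio rises above 1 even
# though nothing changed for the non-synonymous mutations.
depleted = nonsyn_syn_ratio(n_nonsyn=30, n_syn=8, nonsyn_sites=300, syn_sites=100)

print(neutral, depleted)
```

In this toy case the ratio exceeds 1 only because the synonymous baseline is depleted, which is the situation in which a ratio-based screen could flag a gene that is not under positive selection.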
Exon flanks, which are enriched for sequences required for splicing fidelity, have ~17% lower SSM density than exonic cores, even after excluding canonical splice sites. While it is impossible to rule out a mutation bias of unknown cause, multiple lines of evidence support a purifying selection model over a mutational bias explanation. The flank/core difference is not explained by skewed nucleotide content, replication timing, nucleosome occupancy or deficiency in mismatch repair. The depletion is not seen in tumour suppressors, consistent with their role in positive tumour selection, but is otherwise observed in cancer-associated and non-cancer genes, both essential and non-essential. Consistent with a role in splicing modulation, exonic splice enhancers have a lower SSM density both before and after controlling for nucleotide composition; moreover, flanks at the 5' end of exons have significantly lower SSM density than those at the 3' end.
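The flank/core comparison can be sketched as a simple density calculation. This is a minimal illustration under stated assumptions, not the authors' pipeline: the flank length of 20 bp, the half-open coordinate convention, the function name and the toy positions are all assumptions, and the input is assumed to be already filtered to synonymous mutations. It does not normalise for nucleotide composition or for the number of synonymous sites.

```python
def flank_core_density(exons, mutation_positions, flank=20):
    """Return (flank_density, core_density) in mutations per base pair.

    exons: list of (start, end) half-open intervals (assumed convention).
    mutation_positions: positions of synonymous mutations (assumed pre-filtered).
    flank: number of bases at each exon end counted as flank (assumed value).
    """
    flank_hits = core_hits = 0
    flank_bp = core_bp = 0
    for start, end in exons:
        length = end - start
        # Skip exons too short to have a core between the two flanks.
        if length <= 2 * flank:
            continue
        flank_bp += 2 * flank
        core_bp += length - 2 * flank
        for pos in mutation_positions:
            if start <= pos < end:
                if pos < start + flank or pos >= end - flank:
                    flank_hits += 1
                else:
                    core_hits += 1
    if flank_bp == 0 or core_bp == 0:
        raise ValueError("no exon long enough to define flanks and a core")
    return flank_hits / flank_bp, core_hits / core_bp


# Toy data: one 100 bp exon with four mutations (hypothetical positions).
flank_density, core_density = flank_core_density([(0, 100)], [5, 30, 50, 70])

# Fractional depletion of flanks relative to cores.
relative_depletion = 1 - flank_density / core_density
print(flank_density, core_density, relative_depletion)
```

A depletion computed this way is only a starting point; the covariates named above (nucleotide content, replication timing, nucleosome occupancy, mismatch repair status) would each need to be controlled for before attributing the difference to selection.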
These results suggest that the observable mutational spectrum of cancer genomes is not simply a product of various mutational processes and positive selection, but might also be shaped by negative selection.