Du Chuance, Wu Xiaoyuan, Li Jia
Department of Urology, Ganzhou Hospital Affiliated to Nanchang University, Ganzhou, Jiangxi Province, China.
Department of Rehabilitation, Ganzhou Hospital Affiliated to Nanchang University, Nanchang, Jiangxi Province, China.
Cancer Cell Int. 2016 Feb 9;16:2. doi: 10.1186/s12935-016-0278-5. eCollection 2015.
Mutation rates vary considerably across the cancer genome and play an important role in tumorigenesis; however, little is known about their functional potential or their impact on the distribution of functional mutations. In this study, we investigated the genomic features that affect mutation patterns and the functional importance of those patterns in cancer.
Somatic mutations from clear-cell renal cell carcinoma, liver cancer, lung cancer and melanoma, together with single nucleotide polymorphisms (SNPs), were intersected with 54 distinct genomic features. Somatic mutation and SNP densities were then computed for each feature type. We divided the genome into 2856 1-Mb windows, each represented as a row containing the somatic mutation density, the SNP density and a vector of 54 feature values. Correlation analyses were conducted between the somatic mutation and SNP densities and each feature. We also built two random forest models, a somatic mutation model (CSM) and an SNP model, to predict somatic mutation and SNP densities at 1-kb resolution. The relationship between CSM and SNP scores and the distribution of functional variants was then analyzed using deleterious coding variants predicted by SIFT and Mutation Assessor, non-coding functional variants scored with FunSeq 2 and GWAVA, and disease-causing variants from the HGMD and ClinVar databases.
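The windowing and correlation steps described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the correlation statistic, so Spearman rank correlation is an assumption here, and the function names, the toy positions and the toy feature values are all hypothetical. The random forest models are not shown.

```python
from collections import Counter

WINDOW = 1_000_000  # 1-Mb window size, as stated in the text


def window_densities(positions, n_windows, window=WINDOW):
    """Count variants per window and express them as variants per Mb."""
    counts = Counter(pos // window for pos in positions)
    return [counts.get(i, 0) / (window / 1_000_000) for i in range(n_windows)]


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; raises if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman correlation (assumed statistic): Pearson on the ranks."""
    return pearson(ranks(x), ranks(y))


# Toy data (hypothetical): variant positions on one 4-Mb stretch and one
# made-up feature value per window.
positions = [100, 200_000, 1_500_000, 2_100_000, 2_200_000, 2_900_000]
feature = [0.5, 0.2, 0.9, 0.1]

densities = window_densities(positions, n_windows=4)  # [2.0, 1.0, 3.0, 0.0]
rho = spearman(densities, feature)
```

In a real analysis the same correlation would be repeated for each of the 54 feature columns, once against the somatic mutation density and once against the SNP density.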
We observed that a wide range of genomic features affect local mutation rates, including replication time, transcription levels, histone marks and regulatory elements. Repressive histone marks, replication time and promoters contributed most to the CSM model, whereas recombination rate and chromatin organization were most important for the SNP model. Compared with highly mutated regions, regions with low mutation rates had higher densities of deleterious coding mutations, higher average scores for non-coding variants, a higher fraction of functional regions and greater enrichment of disease-causing variants.
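A comparison between low and high mutation regions of the kind reported above could be sketched as follows. The abstract does not say how regions were assigned to the two groups or how densities were normalized, so the median split on the predicted score and the use of the deleterious fraction among coding variants are both assumptions, and all names and numbers are hypothetical.

```python
from statistics import median


def compare_low_high(scores, deleterious, total):
    """Split regions at the median predicted mutation score (an assumed
    cutoff) and return the deleterious fraction in (low, high) regions."""
    cut = median(scores)
    low = [i for i, s in enumerate(scores) if s <= cut]
    high = [i for i, s in enumerate(scores) if s > cut]

    def fraction(indices):
        n_total = sum(total[i] for i in indices)
        n_del = sum(deleterious[i] for i in indices)
        return n_del / n_total if n_total else float("nan")

    return fraction(low), fraction(high)


# Toy data (hypothetical): four regions with predicted scores, counts of
# deleterious coding variants, and counts of all coding variants.
low_frac, high_frac = compare_low_high(
    scores=[0.1, 0.2, 0.8, 0.9],
    deleterious=[3, 2, 1, 1],
    total=[10, 10, 10, 10],
)
```

With these toy numbers the low-score regions carry the larger deleterious fraction, which mirrors the direction of the reported result but says nothing about its size.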
Somatic mutation densities vary widely across the cancer genome. Mutation frequency is a major indicator of function and influences the distribution of functional mutations in cancer.