Doss C George Priya, Sethumadhavan Rao
Bioinformatics Division, School of Biotechnology, Chemical and Biomedical Engineering, Vellore Institute of Technology, Vellore 632014, Tamil Nadu, India.
J Biomed Sci. 2009 Apr 24;16(1):42. doi: 10.1186/1423-0127-16-42.
A central focus of cancer genetics is the study of mutations that are causally implicated in tumorigenesis. Identifying such causal mutations not only provides insight into cancer biology but also reveals anticancer therapeutic targets and diagnostic markers. Missense mutations are nucleotide substitutions that change an amino acid in a protein. Their deleterious effects are commonly attributed to their impact on the primary amino acid sequence and on protein structure.
Identifying functional SNPs in a pool that contains both functional and neutral SNPs is challenging with experimental protocols. To explore possible relationships between genetic mutation and phenotypic variation, we used several bioinformatics algorithms, namely Sorting Intolerant from Tolerant (SIFT), Polymorphism Phenotyping (PolyPhen), and PupaSuite, to predict the impact of amino acid substitutions on the protein activity of the mismatch repair (MMR) genes that cause hereditary nonpolyposis colorectal cancer (HNPCC).
SIFT classified 22 of 125 variants (18%) as "Intolerant". PolyPhen classified 40 of 125 amino acid substitutions (32%) as "Probably damaging" or "Possibly damaging". PupaSuite predicted the phenotypic effect of SNPs on the structure and function of the affected protein. Based on the PolyPhen scores and the availability of three-dimensional structures, structural analysis was carried out on the major mutations occurring in the native proteins encoded by the MSH2 and MSH6 genes. The amino acid residues in the native and mutant model proteins were further analyzed for solvent accessibility and secondary structure to assess protein stability.
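The reported proportions can be checked with a short calculation. The sketch below is illustrative only: the helper names are hypothetical, and the 0.05 cutoff is the commonly used SIFT default rather than a value stated in this abstract.

```python
# Illustrative sketch; helper names are hypothetical.
# Assumption: 0.05 is the commonly used SIFT default cutoff,
# not a threshold reported in this abstract.
SIFT_CUTOFF = 0.05


def sift_label(score: float, cutoff: float = SIFT_CUTOFF) -> str:
    """Label a substitution from its SIFT tolerance index."""
    return "Intolerant" if score <= cutoff else "Tolerant"


def percent(count: int, total: int) -> int:
    """Share of variants in a class, rounded to the nearest whole percent."""
    return round(100 * count / total)


# Counts taken from the results reported above.
sift_share = percent(22, 125)      # 17.6 rounds to 18
polyphen_share = percent(40, 125)  # 32.0
print(sift_share, polyphen_share)
```

The exact SIFT share is 17.6%, which the abstract reports as 18%.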
Using this approach, we show that four nsSNPs predicted to have functional consequences (MSH2-Y43C, MSH6-Y538S, MSH6-S580L, and MSH6-K854M) have already been found to be associated with cancer risk. Our study demonstrates the presence of other deleterious mutations and is consistent with in vivo experimental studies.