Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA.
BMC Bioinformatics. 2014 Jul 3;15:231. doi: 10.1186/1471-2105-15-231.
Current research suggests that a small set of "driver" mutations is responsible for tumorigenesis, while a larger body of "passenger" mutations occurs in the tumor but does not contribute to disease progression. Due to recent pharmacological successes in treating cancers caused by driver mutations, a variety of methodologies that attempt to identify such mutations have been developed. Because driver mutations are hypothesized to cluster in key regions of the protein, cluster identification algorithms have become critical to this effort.
We have developed a novel methodology, SpacePAC (Spatial Protein Amino acid Clustering), that identifies mutational clustering by considering the protein tertiary structure directly in 3D space. By combining the mutational data in the Catalogue of Somatic Mutations in Cancer (COSMIC) and the spatial information in the Protein Data Bank (PDB), SpacePAC is able to identify novel mutation clusters in many proteins such as FGFR3 and CHRM2. In addition, SpacePAC is better able to localize the most significant mutational hotspots as demonstrated in the cases of BRAF and ALK. The R package is available on Bioconductor at: http://www.bioconductor.org/packages/release/bioc/html/SpacePAC.html.
SpacePAC provides a valuable tool for identifying mutational clusters while accounting for protein tertiary structure.
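The idea of locating mutational clusters directly in 3D space can be illustrated with a small sketch. The Python code below is not the SpacePAC implementation (which is distributed as an R package) and does not reproduce its exact statistic. It assumes a simplified scheme: each residue has one 3D coordinate, a sphere of fixed radius is centered on every residue in turn, the sphere covering the most mutations is reported, and significance is estimated by shuffling mutation counts across residues. The function names, the toy coordinates, and the radius are all hypothetical.

```python
import math
import random


def sphere_count(coords, counts, center, radius):
    """Sum mutation counts over residues within `radius` of residue `center`."""
    origin = coords[center]
    return sum(c for xyz, c in zip(coords, counts)
               if math.dist(xyz, origin) <= radius)


def best_sphere(coords, counts, radius):
    """Return (center_index, covered_mutations) for the densest sphere.

    Ties are resolved in favor of the lowest residue index.
    """
    best = max(range(len(coords)),
               key=lambda i: sphere_count(coords, counts, i, radius))
    return best, sphere_count(coords, counts, best, radius)


def permutation_p_value(coords, counts, radius, n_perm=999, seed=0):
    """Estimate how often shuffled counts give a sphere at least as dense.

    Assumption for illustration: under the null, mutation counts are
    exchangeable across residues.
    """
    rng = random.Random(seed)
    _, observed = best_sphere(coords, counts, radius)
    shuffled = list(counts)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        _, stat = best_sphere(coords, shuffled, radius)
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): six residues, three of them close together in space.
toy_coords = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0), (20, 0, 0), (0, 1, 0)]
toy_counts = [5, 4, 0, 1, 0, 3]
toy_center, toy_covered = best_sphere(toy_coords, toy_counts, radius=2.0)
toy_p = permutation_p_value(toy_coords, toy_counts, radius=2.0, n_perm=199)
```

In this toy example, residues 0, 1, and 5 are far apart in sequence position but close in space, so a sphere captures them as a single cluster. A purely sequence-based window would not group them, which is the motivation for using tertiary structure.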