Vázquez Miguel, Valencia Alfonso, Pons Tirso
Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), 28029 Madrid, Spain.
Bioinformatics. 2015 Jul 15;31(14):2397-9. doi: 10.1093/bioinformatics/btv142. Epub 2015 Mar 11.
Interpreting cancer-related single-nucleotide variants (SNVs) in terms of the protein features they affect, such as known functional sites, protein-protein interfaces, or their relation to already annotated mutations, might complement the annotation of genetic variants in the analysis of NGS data. Current tools that annotate mutations fall short in several respects, including the use of protein structure information and the interpretation of mutations in protein complexes.
We present the Structure-PPi system for the comprehensive analysis of coding SNVs based on 3D structures of protein complexes. The 3D repository used, Interactome3D, includes experimental and modeled structures for proteins and protein-protein complexes. Structure-PPi annotates SNVs with features extracted from the UniProt, InterPro, APPRIS, dbNSFP and COSMIC databases. We illustrate the usefulness of Structure-PPi by interpreting 1 027 122 non-synonymous SNVs from COSMIC and the 1000G Project, which yields a collection of ∼172 700 SNVs mapped onto the 3D structures of 8726 human proteins (43.2% of the 20 214 SwissProt-curated proteins in UniProtKB release 2014_06) and onto protein-protein interfaces with potential functional implications.
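As a rough illustration of what mapping an SNV onto a protein-protein interface can involve, the sketch below flags residues of one chain that lie within a distance cutoff of any residue in a partner chain, then marks which SNVs fall on those residues. This is not the Structure-PPi implementation: the toy coordinates, the 5 Å cutoff, the one-point-per-residue simplification, and the function names `interface_residues` and `annotate_snvs` are all assumptions made for this example.

```python
from math import dist

# Hypothetical toy data: (chain, residue_number) -> one representative
# coordinate per residue, in angstroms. Real structures have many atoms
# per residue; a single point is a simplification for this sketch.
COORDS = {
    ("A", 10): (0.0, 0.0, 0.0),
    ("A", 11): (3.8, 0.0, 0.0),
    ("A", 50): (30.0, 0.0, 0.0),
    ("B", 7): (0.0, 4.0, 0.0),
    ("B", 8): (0.0, 7.5, 0.0),
}


def interface_residues(coords, chain, partner, cutoff=5.0):
    """Residue numbers of `chain` within `cutoff` angstroms of `partner`.

    The cutoff value is an assumption, not a documented parameter.
    """
    partner_points = [xyz for (c, _), xyz in coords.items() if c == partner]
    return sorted(
        number
        for (c, number), xyz in coords.items()
        if c == chain and any(dist(xyz, p) <= cutoff for p in partner_points)
    )


def annotate_snvs(snvs, coords, chain, partner, cutoff=5.0):
    """Map each SNV label to True if its residue position is at the interface."""
    interface = set(interface_residues(coords, chain, partner, cutoff))
    return {label: position in interface for label, position in snvs.items()}


# Hypothetical amino-acid changes on chain A, keyed by label -> residue number.
SNVS = {"R10H": 10, "G11D": 11, "L50P": 50}
ANNOTATED = annotate_snvs(SNVS, COORDS, chain="A", partner="B")
```

With the toy coordinates above, only residue 10 of chain A lies within 5 Å of chain B, so only the hypothetical variant `R10H` is flagged as an interface mutation.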
Structure-PPi, along with a user manual and examples, is available at http://structureppi.bioinfo.cnio.es/Structure; the code for local installations is available at https://github.com/Rbbt-Workflows