Pellegrini Stefano, Dove-Estrella Olivia, Muiños Ferran, Lopez-Bigas Nuria, Gonzalez-Perez Abel
Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Baldiri Reixac 10, Barcelona 08028, Spain.
Centro de Investigación Biomédica en Red en Cáncer (CIBERONC), Instituto de Salud Carlos III, Madrid 28029, Spain.
Nucleic Acids Res. 2025 Aug 11;53(15). doi: 10.1093/nar/gkaf776.
Identifying the genes capable of driving tumorigenesis in different tissues is one of the central goals of cancer genomics. Computational methods that exploit different signals of positive selection in the pattern of mutations observed in genes across tumors have been developed to this end. One such signal of positive selection is clustering of mutations in areas of the three-dimensional (3D) structure of the protein above the expectation under neutrality. Methods that exploit this signal have been hindered by the paucity of proteins with experimentally solved 3D structures covering their entire sequence. Here, we present Oncodrive3D, a computational method that, by exploiting AlphaFold 2 structural models, extends the identification of proteins with significant mutational 3D clusters to the entire human proteome. Oncodrive3D shows sensitivity and specificity on par with state-of-the-art cancer driver gene identification methods exploiting mutational clustering and clearly outperforms them in computational efficiency. We demonstrate, through several examples, how significant mutational 3D clusters identified by Oncodrive3D in different known or potential cancer driver genes can reveal details about the mechanism of tumorigenesis in different cancer types and in clonal hematopoiesis.
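The clustering signal described in the abstract (more mutations in one region of the 3D structure than expected under neutrality) can be illustrated with a small permutation test. The sketch below is not the Oncodrive3D implementation. The function names, the 10 Å sphere radius, the uniform per-residue neutral model, and the toy straight-chain coordinates are all assumptions made for illustration. A realistic analysis would take residue coordinates from a structural model and would weight residues by their neutral mutation probability.

```python
import math
import random


def neighbor_sets(coords, radius):
    """For each residue, list the residues within `radius` of it (itself included)."""
    n = len(coords)
    return [
        [j for j in range(n) if math.dist(coords[i], coords[j]) <= radius]
        for i in range(n)
    ]


def max_volume_count(mutated, neighbors):
    """Largest number of mutations found in any residue-centered sphere."""
    counts = [0] * len(neighbors)
    for pos in mutated:
        counts[pos] += 1
    return max(sum(counts[j] for j in nb) for nb in neighbors)


def clustering_pvalue(coords, mutated, radius=10.0, n_sim=1000, seed=0):
    """Empirical p-value for 3D clustering of mutations.

    Assumed neutral model: every residue is equally likely to be mutated.
    """
    rng = random.Random(seed)
    neighbors = neighbor_sets(coords, radius)
    observed = max_volume_count(mutated, neighbors)
    n_res = len(coords)
    hits = 0
    for _ in range(n_sim):
        # Redistribute the same number of mutations at random over the protein.
        simulated = [rng.randrange(n_res) for _ in mutated]
        if max_volume_count(simulated, neighbors) >= observed:
            hits += 1
    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (hits + 1) / (n_sim + 1)


# Toy protein: 100 residues on a straight line, 3.8 Å apart (hypothetical coordinates).
coords = [(3.8 * i, 0.0, 0.0) for i in range(100)]

clustered = [50, 50, 51, 49, 52, 50, 48, 51]   # all within one 10 Å sphere
dispersed = [5, 17, 29, 41, 53, 65, 77, 89]    # spread along the chain

obs_c, p_c = clustering_pvalue(coords, clustered)
obs_d, p_d = clustering_pvalue(coords, dispersed)
print(f"clustered: max count = {obs_c}, p = {p_c:.4f}")
print(f"dispersed: max count = {obs_d}, p = {p_d:.4f}")
```

In this toy setting the clustered mutations give a small empirical p-value, while the dispersed mutations give a p-value of 1, because every simulated set has a sphere containing at least one mutation.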