Carraro Marco, Minervini Giovanni, Giollo Manuel, Bromberg Yana, Capriotti Emidio, Casadio Rita, Dunbrack Roland, Elefanti Lisa, Fariselli Pietro, Ferrari Carlo, Gough Julian, Katsonis Panagiotis, Leonardi Emanuela, Lichtarge Olivier, Menin Chiara, Martelli Pier Luigi, Niroula Abhishek, Pal Lipika R, Repo Susanna, Scaini Maria Chiara, Vihinen Mauno, Wei Qiong, Xu Qifang, Yang Yuedong, Yin Yizhou, Zaucha Jan, Zhao Huiying, Zhou Yaoqi, Brenner Steven E, Moult John, Tosatto Silvio C E
Department of Biomedical Sciences, University of Padova, Padova, Italy.
Department of Information Engineering, University of Padova, Padova, Italy.
Hum Mutat. 2017 Sep;38(9):1042-1050. doi: 10.1002/humu.23235. Epub 2017 May 16.
Correct phenotypic interpretation of variants of unknown significance in cancer-associated genes is a diagnostic challenge as genetic screening gains popularity in the next-generation sequencing era. The Critical Assessment of Genome Interpretation (CAGI) experiment aims to test and define the state of the art of genotype-phenotype interpretation. Here, we present the assessment of the CAGI p16INK4a challenge. Participants were asked to predict the effect on cellular proliferation of 10 variants of the p16INK4a tumor suppressor, a cyclin-dependent kinase inhibitor encoded by the CDKN2A gene. Twenty-two pathogenicity predictors were assessed with a variety of accuracy measures for reliability in a medical context. The different assessment measures were combined into an overall ranking to provide more robust results. The R scripts used for assessment are publicly available from a GitHub repository for future use in similar assessment exercises. Despite the limited test-set size, performance varied widely across methods, with some performing significantly better than others. Methods combining different strategies frequently outperform simpler approaches. The best predictor, from the Yang&Zhou lab, uses a machine learning method combining an empirical energy function measuring protein stability with an evolutionary conservation term. The p16INK4a challenge highlights how subtle structural effects can neutralize otherwise deleterious variants.
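The abstract states that several accuracy measures were combined into one overall ranking, but it does not say which measures were used or how they were combined. The following Python sketch illustrates one plausible way to do this, under stated assumptions: the three measures (Pearson correlation, Kendall's tau, RMSE) and the average-rank combination rule are choices made here for illustration, not the published procedure, and the predictor names and values are toy data, not CAGI results. The authors' actual assessment is implemented in the R scripts mentioned in the abstract.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between observed and predicted values."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)


def rmse(x, y):
    """Root-mean-square error (lower is better)."""
    return sqrt(mean((a - b) ** 2 for a, b in zip(x, y)))


def ranks(scores, higher_is_better):
    """Map each name to its rank (1 = best); tied scores share the mean rank."""
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[ordered[k]] = shared
        i = j + 1
    return result


# Assumed set of measures: name -> (function, higher_is_better).
MEASURES = {
    "pearson": (pearson, True),
    "kendall_tau": (kendall_tau, True),
    "rmse": (rmse, False),
}


def overall_ranking(observed, predictions):
    """Rank predictors by their average rank across all measures."""
    per_measure = {}
    for name, (fn, higher) in MEASURES.items():
        scores = {p: fn(observed, pred) for p, pred in predictions.items()}
        per_measure[name] = ranks(scores, higher)
    avg = {p: mean(r[p] for r in per_measure.values()) for p in predictions}
    return sorted(avg.items(), key=lambda kv: kv[1])


# Toy data only (hypothetical proliferation rates for four variants).
toy_observed = [0.5, 0.6, 0.9, 1.0]
toy_predictions = {
    "toy_A": [0.5, 0.6, 0.9, 1.0],  # matches the observations exactly
    "toy_B": [0.6, 0.5, 0.8, 1.0],  # close, with one swapped pair
    "toy_C": [1.0, 0.9, 0.6, 0.5],  # reversed order
}
ranking = overall_ranking(toy_observed, toy_predictions)
for name, avg_rank in ranking:
    print(name, avg_rank)
```

In this toy example, the predictor that reproduces the observations is ranked first on every measure and the reversed one last. Averaging ranks rather than raw scores keeps measures on different scales (a correlation versus an error) from dominating one another, which is one reason a rank-based combination can be more robust on a small test set such as the 10 variants assessed here.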