Kumar Runjun D, Swamidass S Joshua, Bose Ron
Division of Oncology, Department of Medicine, Washington University School of Medicine, St. Louis, Missouri, USA.
Computational and Systems Biology Program, Washington University in St. Louis, St. Louis, Missouri, USA.
Nat Genet. 2016 Oct;48(10):1288-94. doi: 10.1038/ng.3658. Epub 2016 Sep 12.
Methods are needed to reliably prioritize biologically active driver mutations over inactive passengers in high-throughput sequencing cancer data sets. We present ParsSNP, an unsupervised functional impact predictor that is guided by parsimony. ParsSNP uses an expectation-maximization framework to find mutations that explain tumor incidence broadly, without using predefined training labels that can introduce biases. We compare ParsSNP to five existing tools (CanDrA, CHASM, FATHMM Cancer, TransFIC, and Condel) across five distinct benchmarks. ParsSNP outperformed the existing tools in 24 of 25 comparisons. To investigate the real-world benefit of these improvements, we applied ParsSNP to an independent data set of 30 patients with diffuse-type gastric cancer. ParsSNP identified many known and likely driver mutations that other methods did not detect, including truncation mutations in known tumor suppressors and the recurrent driver substitution RHOA p.Tyr42Cys. In conclusion, ParsSNP uses an innovative, parsimony-based approach to prioritize cancer driver mutations and provides dramatic improvements over existing methods.
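The abstract names an expectation-maximization framework guided by parsimony but gives no algorithmic detail. The Python sketch below illustrates that general idea only and is not the authors' implementation. Every name and choice in it is an assumption made for illustration: the function names, the budget of one expected driver per tumor, the logistic model, and the toy data. The E-step rescales the current scores so that the soft driver labels within each tumor sum to at most the budget (the parsimony constraint). The M-step refits a small model to predict those labels from mutation features.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(weights, features):
    """Model's current belief that a mutation is a driver."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)))

def e_step(scores, tumors, drivers_per_tumor=1.0):
    """Parsimony step (assumed form): rescale scores so that the labels in
    each tumor sum to at most drivers_per_tumor."""
    labels = [0.0] * len(scores)
    for members in tumors:
        total = sum(scores[i] for i in members)
        for i in members:
            labels[i] = min(1.0, drivers_per_tumor * scores[i] / total)
    return labels

def m_step(weights, features, labels, lr=0.5, steps=200):
    """Refit a logistic model to the soft labels by gradient descent."""
    weights = list(weights)
    n = len(features)
    for _ in range(steps):
        grad = [0.0] * len(weights)
        for x, y in zip(features, labels):
            err = score(weights, x) - y
            for k, xk in enumerate(x):
                grad[k] += err * xk / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights

def fit(features, tumors, iterations=10):
    """Alternate E and M steps, starting from an uninformative model."""
    weights = [0.0] * len(features[0])
    labels = []
    for _ in range(iterations):
        scores = [score(weights, x) for x in features]
        labels = e_step(scores, tumors)
        weights = m_step(weights, features, labels)
    return weights, labels

# Hypothetical toy data: each mutation is [bias, flag], where flag marks a
# made-up feature that is shared by one mutation in every tumor.
FEATURES = [
    [1.0, 1.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0],  # tumor A
    [1.0, 1.0], [1.0, 0.0],                          # tumor B
    [1.0, 1.0],                                      # tumor C
]
TUMORS = [[0, 1, 2, 3], [4, 5], [6]]  # mutation indices per tumor

weights, labels = fit(FEATURES, TUMORS)
print("flagged score:  ", round(score(weights, [1.0, 1.0]), 3))
print("unflagged score:", round(score(weights, [1.0, 0.0]), 3))
```

In this toy setting no mutation is ever labeled as a driver in advance. The flagged feature gains weight only because it helps explain every tumor with few assumed drivers, which is the intuition behind the "without using predefined training labels" claim in the abstract.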