Gosalia Nehal, Economides Aris N, Dewey Frederick E, Balasubramanian Suganthi
Regeneron Genetics Center, Tarrytown, NY 10591, USA.
Regeneron Pharmaceuticals, Tarrytown, NY 10591, USA.
Nucleic Acids Res. 2017 Oct 13;45(18):10393-10402. doi: 10.1093/nar/gkx730.
Nonsynonymous single nucleotide variants (nsSNVs) constitute about 50% of known disease-causing mutations, and understanding their functional impact is an area of active research. Existing algorithms predict the pathogenicity of nsSNVs; however, they cannot differentiate heterozygous, dominant disease-causing variants from heterozygous carrier variants that lead to disease only in the homozygous state. Here, we present MAPPIN (Method for Annotating, Predicting Pathogenicity, and mode of Inheritance for Nonsynonymous variants), a prediction method that uses a random forest algorithm to distinguish between nsSNVs with dominant, recessive, and benign effects. We apply MAPPIN to a set of Mendelian disease-causing mutations and accurately predict pathogenicity for all mutations. Furthermore, MAPPIN predicts mode of inheritance correctly for 70.3% of nsSNVs. MAPPIN also correctly predicts pathogenicity for 87.3% of mutations from the Deciphering Developmental Disorders Study, with 78.5% accuracy for mode of inheritance. When tested on a larger collection of mutations from the Human Gene Mutation Database, MAPPIN significantly discriminates between mutations in known dominant and recessive genes. Finally, we demonstrate that MAPPIN outperforms CADD and Eigen in predicting disease inheritance modes for all validation datasets. To our knowledge, MAPPIN is the first nsSNV pathogenicity prediction algorithm that provides mode of inheritance predictions, adding another layer of information for variant prioritization.
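The abstract reports two separate figures for each dataset: accuracy for pathogenicity and accuracy for mode of inheritance. The following is a minimal sketch of how two such figures could be derived from three-class predictions (dominant, recessive, benign). It is not MAPPIN's evaluation code. The function name `two_level_accuracy`, the label strings, and the choice to compute mode-of-inheritance accuracy over truly pathogenic variants only are assumptions made for illustration; the abstract does not state which denominator was used.

```python
from typing import Sequence, Tuple

# Hypothetical label strings; the abstract names the three classes
# but not how they are encoded.
PATHOGENIC = {"dominant", "recessive"}


def two_level_accuracy(
    truth: Sequence[str], pred: Sequence[str]
) -> Tuple[float, float]:
    """Return (pathogenicity accuracy, mode-of-inheritance accuracy).

    Pathogenicity accuracy collapses dominant/recessive into one
    "pathogenic" class and compares it against "benign".
    Mode-of-inheritance accuracy is computed only over variants whose
    true label is pathogenic (an assumption of this sketch) and
    requires the exact class to match.
    """
    if len(truth) != len(pred) or not truth:
        raise ValueError("truth and pred must be non-empty and equal in length")

    pairs = list(zip(truth, pred))

    # Level 1: pathogenic vs. benign, ignoring the inheritance mode.
    patho_hits = sum((t in PATHOGENIC) == (p in PATHOGENIC) for t, p in pairs)

    # Level 2: exact dominant/recessive match among truly pathogenic variants.
    moi_pairs = [(t, p) for t, p in pairs if t in PATHOGENIC]
    moi_hits = sum(t == p for t, p in moi_pairs)
    moi_acc = moi_hits / len(moi_pairs) if moi_pairs else float("nan")

    return patho_hits / len(pairs), moi_acc


# Toy labels, invented for illustration only (not data from the study).
truth = ["dominant", "dominant", "recessive", "recessive", "benign"]
pred = ["dominant", "recessive", "recessive", "benign", "benign"]
patho_acc, moi_acc = two_level_accuracy(truth, pred)
print(f"pathogenicity accuracy: {patho_acc:.1%}")
print(f"mode-of-inheritance accuracy: {moi_acc:.1%}")
```

In this toy case a dominant variant predicted as recessive still counts as a correct pathogenicity call but as a mode-of-inheritance error, which is why the two figures can differ on the same dataset.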