Doctorado en Genética Humana, Centro Universitario de Ciencias de la Salud, Universidad de Guadalajara, C.P. 44340, Guadalajara, Jalisco, México.
División de Genética, Centro de Investigación Biomédica de Occidente, Instituto Mexicano del Seguro Social (IMSS), C.P. 44340, Guadalajara, Jalisco, México.
BMC Bioinformatics. 2019 Jun 28;20(1):363. doi: 10.1186/s12859-019-2919-x.
BACKGROUND: Missense mutations in the first five exons of F9, which encodes factor IX (FIX), account for 40% of all mutations that cause hemophilia B. To address the ongoing debate over in silico identification of disease-causing mutations in these exons, we analyzed 215 missense mutations from www.factorix.org using six in silico prediction tools. These are the programs most commonly used to predict the impact of mutations on protein structure and function, with the further advantage that they rely on similar approaches. We developed different algorithms to integrate the predictions from these tools. For a structural analysis of FIX, we modeled five selected pathogenic mutations.
RESULTS: SIFT, PolyPhen-2 HumDiv, SNAP2, and MutationAssessor were the most successful in identifying true non-causative and causative mutations. A proposed function integrating these algorithms (wgP4) was more sensitive (90.1%), specific (22.6%), and accurate (87%) than similar functions, and identified 187 variants as deleterious. Clinical phenotype was significantly associated with predicted causative mutations at all five exons. However, PolyPhen-2 HumDiv was more successful in linking clinical severity to specific exons, while functions that integrate 4-6 predictions were more successful in linking phenotype to genotype at the light chain (exons 3-5). The most important value of integrating multiple predictions is the inclusion of scores derived from different approaches. Modeling of protein structure showed the effects of pathogenic nsSNPs on the structure and function of FIX.
CONCLUSIONS: A simple function that integrates information from different in silico programs yields the best prediction of mutated phenotypes. However, the specificity, sensitivity, and accuracy of genotype-phenotype predictions depend on specific characteristics of the protein domain and the disease of interest, as we validated by structural analysis of selected pathogenic F9 mutations. The proposed integrating function (wgP4) might be useful for analyzing the impact of nsSNPs in other genes.
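As an illustration of what integrating several predictions and scoring the result can look like, the following Python sketch combines binary calls from four tools by a weighted vote and then computes sensitivity, specificity, and accuracy. It is not the wgP4 function itself, whose formula is not given in this abstract: the equal weights, the 0.5 threshold, the tool keys, and the names `integrate` and `metrics` are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical equal weights; the abstract does not state how wgP4 weights each tool.
WEIGHTS: Dict[str, float] = {
    "SIFT": 1.0,
    "PolyPhen2_HumDiv": 1.0,
    "SNAP2": 1.0,
    "MutationAssessor": 1.0,
}


def integrate(calls: Dict[str, bool],
              weights: Dict[str, float] = WEIGHTS,
              threshold: float = 0.5) -> bool:
    """Return True (deleterious) when the weighted share of tools
    calling the variant deleterious reaches the threshold."""
    total = sum(weights.values())
    score = sum(weights[tool] for tool, call in calls.items() if call) / total
    return score >= threshold


def metrics(predicted: List[bool], actual: List[bool]) -> Dict[str, float]:
    """Sensitivity, specificity, and accuracy from paired binary labels."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    return {
        # Guard against empty classes to avoid division by zero.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(actual) if actual else 0.0,
    }


if __name__ == "__main__":
    # Invented toy labels for illustration only, not data from the study.
    predicted = [True, True, False, True]
    actual = [True, False, False, True]
    print(metrics(predicted, actual))
```

Because the three metrics are computed from the same confusion counts, a function can be highly sensitive and accurate while remaining poorly specific when causative variants dominate the data set, which is the pattern the reported wgP4 figures show.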