Machine learning in prediction of genetic risk of nonsyndromic oral clefts in the Brazilian population.

Affiliations

Department of Oral Diagnosis, School of Dentistry, University of Campinas, Piracicaba, São Paulo, CEP 13414-018, Brazil.

Post-Graduation Program in Rehabilitation Sciences, Hospital for Rehabilitation of Craniofacial Anomalies, University of São Paulo, Bauru, São Paulo, Brazil.

Publication information

Clin Oral Investig. 2021 Mar;25(3):1273-1280. doi: 10.1007/s00784-020-03433-y. Epub 2020 Jul 2.

Abstract

OBJECTIVES

Genetic variants in multiple genes and loci have been associated with the risk of nonsyndromic cleft lip with or without cleft palate (NSCL ± P). However, estimating risk remains a challenge because most of these variants are population-specific, which makes the underlying genetic risk difficult to identify. Herein, we examined the use of machine learning methods on previously reported single nucleotide polymorphisms (SNPs) to predict the risk of NSCL ± P in the Brazilian population.

MATERIALS AND METHODS

Random forest and neural network methods were applied to 72 SNPs in a case-control sample composed of 722 NSCL ± P cases and 866 controls to discriminate NSCL ± P risk. SNP-SNP interactions and the functional annotation of biological processes associated with the identified NSCL ± P risk genes were also assessed.
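Classifiers such as random forests and neural networks need genotypes in numeric form. The abstract does not state how the SNPs were coded, so the following is only a minimal sketch of one common convention, additive coding (0, 1, or 2 copies of a chosen allele). The function names, the risk alleles, and the toy genotypes are hypothetical; only the SNP IDs come from the abstract.

```python
from typing import Dict, List


def encode_additive(genotype: str, risk_allele: str) -> int:
    """Count copies of risk_allele in a two-letter genotype such as 'AG'."""
    if len(genotype) != 2:
        raise ValueError(f"expected a two-allele genotype, got {genotype!r}")
    return sum(1 for allele in genotype if allele == risk_allele)


def build_feature_matrix(
    samples: List[Dict[str, str]], risk_alleles: Dict[str, str]
) -> List[List[int]]:
    """Return one row per sample and one column per SNP (sorted by SNP ID)."""
    snp_order = sorted(risk_alleles)
    return [
        [encode_additive(sample[snp], risk_alleles[snp]) for snp in snp_order]
        for sample in samples
    ]


# Hypothetical toy data: the alleles and genotypes below are invented
# for illustration and are NOT taken from the study.
risk_alleles = {"rs642961": "A", "rs7078160": "A"}
samples = [
    {"rs642961": "AG", "rs7078160": "GG"},
    {"rs642961": "AA", "rs7078160": "AG"},
    {"rs642961": "GG", "rs7078160": "AA"},
]
labels = [1, 1, 0]  # 1 = case, 0 = control (also invented)

X = build_feature_matrix(samples, risk_alleles)
print(X)  # [[1, 0], [2, 1], [0, 2]]
```

A matrix like `X`, paired with `labels`, is the kind of input a random forest or neural network implementation would typically accept.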

RESULTS

Supervised random forest decision trees revealed high importance scores for the SNPs rs11717284 and rs1875735 in FGF12, rs41268753 in GRHL3, rs2236225 in MTHFD1, rs2274976 in MTHFR, rs2235371 and rs642961 in IRF6, rs17085106 in RHPN2, rs28372960 in TCOF1, rs7078160 in VAX1, rs10762573 and rs2131960 in VCL, and rs227731 in 17q22, with an accuracy of 99% and an error rate of approximately 3% in predicting the risk of NSCL ± P. The same 13 SNPs were considered the most important for the neural network to effectively predict NSCL ± P risk, with an overall accuracy of 94%. A multivariate regression model revealed significant interactions among all SNPs, with the exception of those in FGF12 and MTHFD1. The most significant biological processes for the selected genes were those involved in tissue and epithelium development; neural tube closure; and metabolism of methionine, folate, and homocysteine.
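The abstract reports accuracy and error rate without saying how each was computed, for instance on which subset of the sample. As a point of reference only, the sketch below shows the textbook definitions of these metrics for a binary case-control classifier. The function name and the counts are hypothetical and are not the study's data.

```python
from typing import Dict


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard metrics from binary confusion-matrix counts.

    tp/fn: cases predicted as case/control; tn/fp: controls predicted
    as control/case.
    """
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one case and one control")
    accuracy = (tp + tn) / total
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,  # misclassified fraction on the same data
        "sensitivity": tp / (tp + fn),  # proportion of cases detected
        "specificity": tn / (tn + fp),  # proportion of controls detected
    }


# Hypothetical counts, invented for illustration (100 cases, 100 controls).
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When accuracy and error rate are measured on the same samples they sum to 1, so figures that do not sum to 100% would usually come from different evaluation sets or estimation procedures.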

CONCLUSIONS

Our results provide novel clues for studies of the genetic mechanisms of NSCL ± P and point to a machine learning model composed of 13 SNPs that is capable of predicting NSCL ± P risk.

CLINICAL RELEVANCE

Although validation is necessary, this genetic panel can be useful in the near future to assist in NSCL ± P genetic counseling.
