A machine-learning approach for nonalcoholic steatohepatitis susceptibility estimation.

Author information

Department of Computer Engineering, Istanbul University Cerrahpaşa, 34320, Istanbul, Turkey.

Computer Programming, Vocational School, Nişantaşı University, 1453, Istanbul, Turkey.

Publication information

Indian J Gastroenterol. 2022 Oct;41(5):475-482. doi: 10.1007/s12664-022-01263-2. Epub 2022 Nov 11.

Abstract

BACKGROUND

Nonalcoholic steatohepatitis (NASH), a severe form of nonalcoholic fatty liver disease, can lead to advanced liver damage and has become an increasingly prominent health problem worldwide. Predictive models that identify high-risk individuals early could guide preventive and interventional measures. Traditional epidemiological models, which are based on statistical analysis, have limited predictive power. In the current study, a novel machine-learning approach was developed to predict individual NASH susceptibility from candidate single nucleotide polymorphisms (SNPs).

METHODS

A total of 245 NASH patients and 120 healthy individuals were included in the study. Models for identifying at-risk individuals were developed from gender and the genotypes of candidate SNPs: two in the cytochrome P450 family 2 subfamily E member 1 (CYP2E1) gene (rs6413432, rs3813867), two in the glucokinase regulator (GCKR) gene (rs780094, rs1260326), and rs738409 in the patatin-like phospholipase domain-containing 3 (PNPLA3) gene. To predict an individual's susceptibility to NASH, nine different machine-learning models were constructed. These models combined two feature-selection methods, Chi-square and support vector machine recursive feature elimination (SVM-RFE), with three classification algorithms: k-nearest neighbor (KNN), multi-layer perceptron (MLP), and random forest (RF). All nine models were trained on 80% of the data from both the NASH patients and the healthy controls and then tested on the remaining 20% of both groups. Model performance was compared in terms of accuracy, precision, sensitivity, and F-measure.
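As an illustration only, the study design described above can be sketched in plain Python. The abstract does not describe the authors' implementation, so everything here is an assumption: the function names `stratified_split` and `knn_predict` are hypothetical, treating "all features" as a third feature-set option is inferred from the Results, and the Hamming distance and the value of k in the toy KNN are arbitrary choices. Only the cohort sizes, feature names, and method names come from the abstract.

```python
import itertools
import random

# Feature names taken from the abstract: five candidate SNPs plus gender.
FEATURES = ["rs6413432", "rs3813867", "rs780094", "rs1260326", "rs738409", "gender"]

# Assumption: "all features" counts as a third feature-set option, which
# would give 3 feature sets x 3 classifiers = 9 models.
FEATURE_SETS = ["all features", "Chi-square", "SVM-RFE"]
CLASSIFIERS = ["KNN", "MLP", "RF"]
MODEL_GRID = list(itertools.product(FEATURE_SETS, CLASSIFIERS))


def stratified_split(labels, train_fraction=0.8, seed=0):
    """Split indices into train/test, applying the fraction to each class separately."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        cut = round(len(idx) * train_fraction)
        train_idx.extend(idx[:cut])
        test_idx.extend(idx[cut:])
    return sorted(train_idx), sorted(test_idx)


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training rows closest to x.

    Assumption: distance is the number of mismatching features (Hamming),
    a simple choice for categorical genotype codes.
    """
    def distance(a, b):
        return sum(u != v for u, v in zip(a, b))

    nearest = sorted(range(len(train_X)), key=lambda i: distance(train_X[i], x))[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


# Cohort sizes from the abstract: 1 = NASH patient, 0 = healthy control.
labels = [1] * 245 + [0] * 120
train_idx, test_idx = stratified_split(labels)
print(len(MODEL_GRID), len(train_idx), len(test_idx))
```

Under these assumptions, an exact 80/20 split within each group gives 196 + 96 = 292 training samples and 49 + 24 = 73 test samples. In practice, a machine-learning library would typically supply the feature selectors and classifiers instead of hand-written code.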

RESULTS

Among the nine machine-learning models, the KNN classifier with all features as input performed best, with an F-measure of 86% and an accuracy of 79%.
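To show how these metrics relate to one another, the following minimal sketch computes them from a confusion matrix. The counts are hypothetical and are not reported in the abstract. They were chosen only so that the arithmetic lands near the reported 86% and 79% on an assumed test set of 73 samples (20% of 245 patients and 20% of 120 controls). The function name `classification_metrics` is likewise an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity (recall), and F-measure from counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts (NOT from the paper): 49 NASH and 24 control test samples.
metrics = classification_metrics(tp=46, fp=12, fn=3, tn=12)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With these illustrative counts, accuracy is 58/73 and the F-measure is 92/107. Because the F-measure ignores true negatives, it can exceed accuracy when the positive class is the larger group, as it is in this cohort.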

CONCLUSIONS

Machine learning based on genomic variation may be applicable for estimating an individual's susceptibility to developing NASH among high-risk groups, with a high degree of accuracy, precision, and sensitivity.
