Li Zhehong, Wang Liang, Tian Chenxu, Wang Zheng, Zhao Hao, Qi Yao, Chen Weijian, Wuyun Qiqige, Amin Buhe, Lian Dongbo, Zhu Jinxia, Zhang Nengwei, Zheng Lifei, Xu Guangzhong
Surgery Centre of Diabetes Mellitus, Beijing Shijitan Hospital, Capital Medical University, Beijing, China.
Department of General Surgery, Beijing Shijitan Hospital, Capital Medical University, Beijing, China.
Int J Surg. 2025 Feb 1;111(2):1814-1824. doi: 10.1097/JS9.0000000000002179.
The global prevalence of non-alcoholic fatty liver disease (NAFLD) is approximately 30%, and the condition can progress to non-alcoholic steatohepatitis, cirrhosis, and hepatocellular carcinoma. Metabolic and bariatric surgery (MBS) has been shown to be effective in treating obesity and related disorders, including NAFLD.
This study used an integrated machine learning approach to identify biomarkers that could guide precise treatment of NAFLD in the context of MBS.
Differential expression and univariate logistic regression analyses were performed on lipid metabolism-related genes in a training dataset (GSE83452) and two validation datasets (GSE106737 and GSE48452) to identify consensus-predicted genes (CPGs). Subsequently, 13 machine learning algorithms were integrated into 99 combinations, from which the optimal combination was selected based on a total score combining the area under the curve, accuracy, F-score, and recall in the two validation datasets. Hub genes were selected based on their importance ranking in the algorithms and how frequently they occurred. Finally, a mouse model of MBS was established, and the mRNA expression of the hub genes was validated via quantitative PCR.
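The model-selection step can be illustrated with a minimal sketch. It assumes an unweighted sum of the four metrics over both validation datasets, a 0.5 decision threshold, and F1 as the F-score; none of these details are stated in the abstract, and every function name below is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: chance that a positive sample outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """AUC, accuracy, F1, and recall for one dataset (threshold is an assumption)."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "auc": auc(y_true, scores),
        "accuracy": (tp + tn) / len(y_true),
        "f_score": f1,
        "recall": recall,
    }


def rank_combinations(
    predictions: Dict[str, List[Sequence[float]]], labels: List[Sequence[int]]
) -> List[Tuple[str, float]]:
    """Rank model combinations by the unweighted sum of all metrics over all datasets."""
    totals = {}
    for name, score_sets in predictions.items():
        per_dataset = [classification_metrics(y, s) for y, s in zip(labels, score_sets)]
        totals[name] = sum(sum(m.values()) for m in per_dataset)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Here `predictions` maps each algorithm combination to one list of predicted probabilities per validation dataset, and `labels` holds the matching true classes. The highest-ranked entries correspond to what the abstract calls the optimal models.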
A total of 12 CPGs were identified from the intersection of the differential expression and logistic regression results. The four machine learning algorithms with the highest total scores were identified as the optimal models. In addition, PPARA, PLIN2, MED13, INSIG1, CPT1A, and ALOX5AP were identified as hub genes. The mRNA expression patterns of these genes in mice subjected to MBS were consistent with those observed in the three datasets.
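The intersection and hub-gene steps can be sketched as plain set and counting operations. This is an illustration only: the gene names are placeholders rather than study data, and `top_n` and `min_models` are hypothetical cut-offs, since the abstract does not report the thresholds used for importance ranking or frequency.

```python
from collections import Counter
from typing import Dict, Iterable, List, Set


def consensus_genes(gene_sets: Iterable[Set[str]]) -> Set[str]:
    """Genes present in every input set (the overlap region of a Venn diagram)."""
    sets = list(gene_sets)
    return set.intersection(*sets) if sets else set()


def hub_genes(rankings: Dict[str, List[str]], top_n: int, min_models: int) -> List[str]:
    """Genes that appear in the top_n importance ranking of at least min_models models."""
    counts = Counter(g for ranked in rankings.values() for g in ranked[:top_n])
    return sorted(g for g, c in counts.items() if c >= min_models)


# Placeholder inputs, not results from the study.
de_hits = {"GENE_A", "GENE_B", "GENE_C"}   # differentially expressed genes
lr_hits = {"GENE_B", "GENE_C", "GENE_D"}   # genes significant in logistic regression
cpgs = consensus_genes([de_hits, lr_hits])

# Each model lists genes from most to least important.
importance = {
    "model_1": ["GENE_B", "GENE_C"],
    "model_2": ["GENE_C", "GENE_B"],
    "model_3": ["GENE_C", "GENE_B"],
}
hubs = hub_genes(importance, top_n=1, min_models=2)
```

With more than two analyses or datasets, the same `consensus_genes` call accepts any number of sets.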
Altogether, the six hub genes identified in this study are important for the treatment of NAFLD via MBS and hold substantial promise in guiding personalized treatment of NAFLD in clinical settings.