Matboli Marwa, Diab Gouda I, Saad Maha, Khaled Abdelrahman, Roushdy Marian, Ali Marwa, ELsawi Hind A, Aboughaleb Ibrahim H
Department of Medical Biochemistry and Molecular Biology, Faculty of Medicine, Ain Shams University, Cairo 11566, Egypt.
Biomedical Engineering Department, Egyptian Armed Forces, Cairo, Egypt.
J Clin Exp Hepatol. 2024 Nov-Dec;14(6):101456. doi: 10.1016/j.jceh.2024.101456. Epub 2024 Jun 14.
Hepatocellular carcinoma (HCC) is the third leading cause of malignancy-related mortality worldwide. Early and accurate identification of HCC is crucial for prognosis, treatment efficacy, and patient survival. We aimed to develop a machine-learning model that combines differentially expressed RNA signatures with laboratory parameters to construct an RNA signature-based diagnostic model for HCC.
We used five classifiers (KNN, RF, SVM, LGBM, and DNN) to predict HCC. The classifiers were trained on 187 samples and then tested on 80 samples. The model included 22 features: age, sex, smoking, cirrhosis, non-cirrhosis, albumin, ALT, AST, bilirubin (total and direct), INR, AFP, HBV Ag, HCV Abs, RQmiR-1298, RQmiR-1262, RQmiR-106b-3p, RQmRNARAB11A, RQSTAT1, RQmRNAATG12, RQLnc-WRAP53, and RQLncRNA-RP11-513I15.6.
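As a quick consistency check on the numbers above, the following minimal sketch lists the features and summarizes the train/test split. It is illustrative only and not the authors' code: the identifiers `FEATURES` and `split_summary` are hypothetical, and counting total and direct bilirubin as two separate features is an assumption made so that the list matches the stated 22.

```python
# Feature names as listed in the abstract. Treating bilirubin as two features
# (total and direct) is an assumption that makes the count equal the stated 22.
FEATURES = [
    "age", "sex", "smoking", "cirrhosis", "non-cirrhosis",
    "albumin", "ALT", "AST", "bilirubin_total", "bilirubin_direct",
    "INR", "AFP", "HBV Ag", "HCV Abs",
    "RQmiR-1298", "RQmiR-1262", "RQmiR-106b-3p",
    "RQmRNARAB11A", "RQSTAT1", "RQmRNAATG12",
    "RQLnc-WRAP53", "RQLncRNA-RP11-513I15.6",
]

# Sample counts reported in the abstract.
N_TRAIN, N_TEST = 187, 80


def split_summary(n_train, n_test):
    """Return the total sample count and the fraction held out for testing."""
    total = n_train + n_test
    return {"total": total, "test_fraction": n_test / total}


summary = split_summary(N_TRAIN, N_TEST)
print(len(FEATURES), summary["total"], round(summary["test_fraction"], 3))
```

Under these assumptions the split corresponds to 267 samples in total, with roughly 30% held out for testing.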
LGBM achieved the highest accuracy in predicting HCC (98.75%), surpassing Random Forest (96.25%), DNN (91.25%), SVM (88.75%), and KNN (87.50%).
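To put these percentages in concrete terms, the sketch below converts each reported accuracy into the number of correct and incorrect predictions it implies on the 80 test samples. It assumes that accuracy was computed as correct predictions divided by the 80 test samples; the abstract does not give per-class counts or confusion matrices, and the names `REPORTED_ACCURACY` and `implied_correct` are hypothetical.

```python
# Number of test samples and accuracies (in percent) reported in the abstract.
N_TEST = 80
REPORTED_ACCURACY = {
    "LGBM": 98.75,
    "RF": 96.25,
    "DNN": 91.25,
    "SVM": 88.75,
    "KNN": 87.50,
}


def implied_correct(accuracy_pct, n_test):
    """Number of correct predictions implied by an accuracy percentage.

    Assumes accuracy = correct / n_test, rounded to the nearest whole sample.
    """
    return round(accuracy_pct / 100 * n_test)


implied = {name: implied_correct(acc, N_TEST) for name, acc in REPORTED_ACCURACY.items()}

for name, correct in implied.items():
    print(f"{name}: {correct}/{N_TEST} correct, {N_TEST - correct} misclassified")
```

On a test set of this size each sample is worth 1.25 percentage points, so the gap between LGBM and Random Forest corresponds to two test samples under this assumption.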
Our machine-learning model incorporating the expression data of RAB11A/STAT1/ATG12/miR-1262/miR-1298/miR-106b-3p/lncRNA-RP11-513I15.6/lncRNA-WRAP53 signature and clinical data represents a potential novel diagnostic model for HCC.