Ma Yujing, Duan Shaobo, Ren Shanshan, Bu Didi, Li Yahong, Cai Xiguo, Zhang Lianzhong
Henan University People's Hospital, Henan Provincial People's Hospital, Zhengzhou, China.
Department of Health Management, Henan Provincial People's Hospital, Zhengzhou, China.
Front Med (Lausanne). 2024 Nov 19;11:1483291. doi: 10.3389/fmed.2024.1483291. eCollection 2024.
To investigate the ability of ultrasomics to noninvasively predict epidermal growth factor receptor (EGFR) expression status in patients with hepatocellular carcinoma (HCC).
A total of 198 HCC patients were included in the study (n = 138 in the training dataset and n = 60 in the test dataset). EGFR expression was detected by immunohistochemistry. Ultrasomics features were extracted from gray-scale ultrasound images. Intra-class correlation coefficient (ICC) screening, variance filtering, the mutual information method, and the extreme gradient boosting (XGBoost) embedding method were applied to select the best features. Five machine learning algorithms, namely random forest (RF), XGBoost, support vector machine (SVM), decision tree (DT), and logistic regression (LR), were used to construct clinical models, ultrasomics models, and clinical-ultrasomics combined models. Area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, decision curve analysis (DCA), and calibration curves were used to assess the predictive performance of the models.
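The selection-then-classification workflow described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the feature matrix is synthetic random data, the variance threshold, the number of retained features (`k=10`), and the forest size are all assumed values, and the ICC screening and XGBoost embedding steps are omitted for brevity.

```python
from functools import partial

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, VarianceThreshold, mutual_info_classif
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

# Synthetic stand-in for an ultrasomics feature matrix (assumption: 198 cases, 50 features).
rng = np.random.default_rng(0)
n_patients, n_features = 198, 50
X = rng.normal(size=(n_patients, n_features))
# Synthetic binary label loosely driven by the first two features (purely illustrative).
y = (X[:, 0] + X[:, 1] + rng.normal(size=n_patients) > 0).astype(int)

# 138 training cases and 60 test cases, mirroring the split sizes in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=60, stratify=y, random_state=0
)

pipeline = Pipeline([
    # Variance filtering: drop constant features (threshold is an assumed value).
    ("variance", VarianceThreshold(threshold=0.0)),
    # Mutual information: keep the k features most informative about the label.
    ("mutual_info", SelectKBest(partial(mutual_info_classif, random_state=0), k=10)),
    # Random forest classifier on the selected features.
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
])
pipeline.fit(X_train, y_train)

# AUC on the held-out test set, using the predicted probability of the positive class.
test_auc = roc_auc_score(y_test, pipeline.predict_proba(X_test)[:, 1])
print(f"test AUC on synthetic data: {test_auc:.3f}")
```

Wrapping the selection steps and the classifier in one `Pipeline` means the feature selection is fitted on the training data only, which avoids leaking test-set information into the selected features.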
Among the 198 patients, 100 had high EGFR expression and 98 had low EGFR expression. The RF ultrasomics model performed well: in the training and test datasets, respectively, the AUC was 0.929 (95% CI, 0.874-0.966) and 0.807 (95% CI, 0.684-0.897), the sensitivity was 0.843 and 0.767, the specificity was 0.857 and 0.800, and the accuracy was 0.850 and 0.783. Integrating ultrasomics features with clinical baseline characteristics improved predictive performance: for the RF combined model in the training and test datasets, respectively, the AUC was 0.937 (95% CI, 0.884-0.971) and 0.822 (95% CI, 0.702-0.909), the sensitivity was 0.857 and 0.833, the specificity was 0.857 and 0.800, and the accuracy was 0.857 and 0.817.
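For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity, and accuracy follow from confusion-matrix counts. The counts are hypothetical: the abstract does not report the confusion matrix, and the 30/30 split of the 60 test cases is an assumption chosen only because it reproduces the rates reported for the RF ultrasomics model on the test dataset.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Compute sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        # Fraction of truly high-EGFR cases that the model flags as high.
        "sensitivity": tp / (tp + fn),
        # Fraction of truly low-EGFR cases that the model flags as low.
        "specificity": tn / (tn + fp),
        # Fraction of all cases classified correctly.
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts (assumption: 30 high-EGFR and 30 low-EGFR cases among 60 test cases).
metrics = classification_metrics(tp=23, fn=7, tn=24, fp=6)
rounded = {name: round(value, 3) for name, value in metrics.items()}
print(rounded)  # {'sensitivity': 0.767, 'specificity': 0.8, 'accuracy': 0.783}
```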
The ultrasomics models and combined models built with the five machine learning algorithms can serve as efficient, noninvasive tools for predicting EGFR expression status in HCC patients. Among them, the ultrasomics model and combined model built with the RF classifier showed the best predictive performance.