School of Electrical and Computer Engineering, University of Oklahoma, Norman, OK 73019, USA.
Comput Methods Programs Biomed. 2021 Mar;200:105937. doi: 10.1016/j.cmpb.2021.105937. Epub 2021 Jan 15.
Non-invasively predicting the risk of cancer metastasis before surgery can play an essential role in determining which patients can benefit from neoadjuvant chemotherapy. This study aims to investigate and test the advantages of applying a random projection algorithm to develop and optimize a radiomics-based machine learning model to predict peritoneal metastasis in gastric cancer patients using a small and imbalanced computed tomography (CT) image dataset.
A retrospective dataset of CT images acquired from 159 patients is assembled, including 121 and 38 cases with and without peritoneal metastasis, respectively. A computer-aided detection scheme is first applied to segment primary gastric tumor volumes and compute an initial set of 315 image features. Then, five gradient boosting machine (GBM) models, each embedded with one of five feature selection methods (random projection algorithm, principal component analysis, least absolute shrinkage and selection operator, maximum relevance and minimum redundancy, and recursive feature elimination) along with a synthetic minority oversampling technique, are built to predict the risk of peritoneal metastasis. All GBM models are trained and tested using a leave-one-case-out cross-validation method.
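As a rough illustration of two of the steps named above, the following minimal sketch projects a feature matrix onto a lower-dimensional space with a random matrix and enumerates leave-one-case-out folds. It is not the authors' implementation: the Gaussian form of the projection, the target size of 20, the function names, and the synthetic data are all assumptions made for illustration, and the oversampling and GBM training steps are omitted.

```python
import random


def gaussian_projection_matrix(n_features, n_components, seed=0):
    """Return an n_features x n_components matrix with N(0, 1/n_components) entries.

    Assumption: a Gaussian random projection; the abstract does not say
    which variant of random projection was used.
    """
    rng = random.Random(seed)
    scale = (1.0 / n_components) ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(n_components)]
            for _ in range(n_features)]


def project(X, R):
    """Multiply each row of X (n_cases x n_features) by the projection matrix R."""
    n_components = len(R[0])
    return [[sum(x_j * R[j][c] for j, x_j in enumerate(row))
             for c in range(n_components)]
            for row in X]


def leave_one_case_out(n_cases):
    """Yield (train_indices, test_index), holding out exactly one case per fold."""
    for i in range(n_cases):
        yield [j for j in range(n_cases) if j != i], i


# Synthetic stand-in for a 159-case x 315-feature radiomics matrix.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(315)] for _ in range(159)]

R = gaussian_projection_matrix(315, 20)  # 20 is an arbitrary, hypothetical size
Z = project(X, R)                        # 159 x 20 reduced feature matrix
folds = list(leave_one_case_out(len(Z)))
```

Because a random projection matrix is drawn independently of the data, it can be generated once and reused across folds, whereas any oversampling would normally be applied to the training cases of each fold only.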
Results show that the GBM model embedded with a random projection algorithm yields a significantly higher prediction accuracy (71.2%) than the other four GBM models (p<0.05). The precision, sensitivity, and specificity of this optimal GBM model are 65.78%, 43.10%, and 87.12%, respectively.
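For reference, the four reported metrics follow from the counts of a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts passed to it are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    tp/fn: positive cases predicted positive/negative.
    tn/fp: negative cases predicted negative/positive.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # fraction of all cases classified correctly
        "precision": tp / (tp + fp),     # fraction of positive predictions that are correct
        "sensitivity": tp / (tp + fn),   # fraction of positive cases detected
        "specificity": tn / (tn + fp),   # fraction of negative cases correctly rejected
    }


# Hypothetical counts, chosen only to exercise the formulas.
m = classification_metrics(tp=30, fp=10, tn=50, fn=10)
```

On an imbalanced dataset, accuracy alone can be dominated by the majority class, which is why sensitivity and specificity are reported alongside it.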
This study demonstrates that CT images of primary gastric tumors contain discriminatory information for predicting the risk of peritoneal metastasis, and that a random projection algorithm is a promising method for generating an optimal feature vector, improving the performance of machine-learning-based prediction models.