Department of Cardiothoracic Surgery, Affiliated Hospital of Guangdong Medical University, Xiashan District, Zhanjiang, Guangdong, China.
Guangdong Medical University, Xiashan District, Zhanjiang, Guangdong, China.
Sci Rep. 2024 Aug 19;14(1):19215. doi: 10.1038/s41598-024-69735-3.
The aim of this study was to develop a medical imaging and comprehensive stacked learning-based method for predicting high- and low-risk thymoma. A total of 126 patients with thymomas and 5 patients with thymic carcinoma treated at our institution, including 65 low-risk patients and 66 high-risk patients, were retrospectively recruited. Among them, 78 patients composed the training cohort, while the remaining 53 patients formed the validation cohort. We extracted 1702 features each from the patients' arterial-, venous-, and plain-phase images. Pairwise subtraction of these features yielded 1702 arterial-venous, arterial-plain, and venous-plain difference features each. The Mann‒Whitney U test, least absolute shrinkage and selection operator (LASSO), and SelectKBest methods were employed to select the best features from the training set. Six models were built with a stacked learning algorithm. By applying stacked ensemble learning, three machine learning algorithms (XGBoost, multilayer perceptron (MLP), and random forest) were combined by XGBoost to produce the six basic imaging models. Then, the XGBoost algorithm was applied to the six basic imaging models to construct a combined radiomic model. Finally, the radiomic model was combined with clinical information to create a nomogram that could easily be used in clinical practice to predict the thymoma risk category. The areas under the curve (AUCs) of the combined radiomic model in the training and validation cohorts were 0.999 (95% CI 0.988-1.000) and 0.967 (95% CI 0.916-1.000), respectively, while those of the nomogram were 0.999 (95% CI 0.996-1.000) and 0.983 (95% CI 0.990-1.000). This study describes the application of CT-based radiomics in thymoma patients and proposes a nomogram for predicting the risk category for this disease, which could be advantageous for clinical decision-making for affected patients.
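The first two steps described in the abstract, pairwise subtraction of phase features and a Mann-Whitney U comparison between risk groups, can be illustrated with a minimal sketch. This is not the authors' code: the function names `phase_differences` and `mann_whitney_u`, the phase labels, and the toy values are assumptions made for illustration, and the U statistic is computed directly from its pairwise definition rather than with a statistics library (which would also supply a p-value for the actual filtering step).

```python
from itertools import combinations


def phase_differences(features_by_phase):
    """Pairwise subtraction of per-phase feature vectors.

    features_by_phase: dict mapping a phase name (hypothetical labels such as
    "arterial", "venous", "plain") to a list of feature values in the same order.
    Returns a dict mapping "phaseA-phaseB" to the element-wise difference.
    """
    diffs = {}
    for a, b in combinations(features_by_phase, 2):
        fa, fb = features_by_phase[a], features_by_phase[b]
        if len(fa) != len(fb):
            raise ValueError("phases must have the same number of features")
        diffs[f"{a}-{b}"] = [x - y for x, y in zip(fa, fb)]
    return diffs


def mann_whitney_u(x, y):
    """U statistic of sample x against sample y.

    Counts pairs (xi, yj) with xi > yj, scoring ties as 0.5.
    U(x, y) + U(y, x) always equals len(x) * len(y).
    """
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


# Toy example: two features per phase for a single patient (made-up values).
patient = {
    "arterial": [3.0, 5.0],
    "venous": [1.0, 2.0],
    "plain": [0.5, 4.0],
}
differences = phase_differences(patient)

# Toy example: one feature's values in low-risk vs high-risk patients (made-up).
low_risk = [1.0, 2.0, 3.0]
high_risk = [4.0, 5.0, 6.0]
u_low = mann_whitney_u(low_risk, high_risk)
u_high = mann_whitney_u(high_risk, low_risk)
```

A U value near 0 or near `len(x) * len(y)` indicates that the two groups are well separated on that feature, which is the property a univariate filter of this kind looks for before LASSO and SelectKBest narrow the set further.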