Krajnc Denis, Papp Laszlo, Nakuz Thomas S, Magometschnigg Heinrich F, Grahovac Marko, Spielvogel Clemens P, Ecsedi Boglarka, Bago-Horvath Zsuzsanna, Haug Alexander, Karanikas Georgios, Beyer Thomas, Hacker Marcus, Helbich Thomas H, Pinker Katja
QIMP Team, Center for Medical Physics and Biomedical Engineering, Medical University of Vienna, 1090 Vienna, Austria.
Division of Nuclear Medicine, Department of Biomedical Imaging and Image-Guided Therapy, Medical University of Vienna, 1090 Vienna, Austria.
Cancers (Basel). 2021 Mar 12;13(6):1249. doi: 10.3390/cancers13061249.
Background: This study investigated the performance of ensemble learning holomic models for the detection of breast cancer, receptor status, proliferation rate, and molecular subtypes from [18F]FDG-PET/CT images, with and without data pre-processing algorithms. Additionally, machine learning (ML) models were compared with conventional data analysis using standardized uptake value (SUV) lesion classification.
Methods: A cohort of 170 patients with 173 breast cancer tumors (132 malignant, 38 benign) was examined with [18F]FDG-PET/CT. Breast tumors were segmented and radiomic features were extracted following the imaging biomarker standardization initiative (IBSI) guidelines combined with optimized feature extraction. Ensemble learning including five supervised ML algorithms was utilized in a 100-fold Monte Carlo (MC) cross-validation scheme. Data pre-processing methods were applied prior to machine learning, including outlier and borderline noisy sample detection, feature selection, and class imbalance correction. Feature importance in each model was assessed by calculating feature occurrence by the R-squared method across MC folds.
Results: Cross-validation demonstrated high performance of the cancer detection model (80% sensitivity, 78% specificity, 80% accuracy, 0.81 area under the curve (AUC)) and of the triple negative tumor identification model (85% sensitivity, 78% specificity, 82% accuracy, 0.82 AUC). The individual receptor status and luminal A/B subtype models yielded low performance (0.46-0.68 AUC). The SUV model yielded 0.76 AUC in cancer detection and 0.70 AUC in predicting the triple negative subtype.
Conclusions: Predictive models based on [18F]FDG-PET/CT images in combination with advanced data pre-processing steps aid in breast cancer diagnosis and in ML-based prediction of the aggressive triple negative breast cancer subtype.
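
The validation scheme described in the abstract (repeated Monte Carlo cross-validation, summarized by sensitivity, specificity, accuracy, and AUC) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' pipeline: the function names are hypothetical, the 20% test fraction is not stated in the abstract, the feature values are synthetic, and a single-feature threshold rule stands in for the five-algorithm ensemble. Only the class counts (132 malignant, 38 benign) and the 100 folds come from the abstract.

```python
import random
from statistics import mean


def stratified_mc_splits(labels, n_folds=100, test_fraction=0.2, seed=0):
    """Yield (train_idx, test_idx) pairs by repeated stratified random subsampling.

    test_fraction is an assumed value; the abstract does not state the split ratio.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    for _ in range(n_folds):
        train, test = [], []
        for idx in by_class.values():
            shuffled = idx[:]
            rng.shuffle(shuffled)
            # Keep at least one sample of every class in the test set.
            n_test = max(1, round(len(shuffled) * test_fraction))
            test.extend(shuffled[:n_test])
            train.extend(shuffled[n_test:])
        yield sorted(train), sorted(test)


def confusion_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def run_demo(seed=0):
    """Average per-fold metrics of a toy threshold classifier on SYNTHETIC data."""
    rng = random.Random(seed)
    # Class counts follow the abstract; the feature values are invented placeholders.
    labels = [1] * 132 + [0] * 38
    feature = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]
    fold_results = []
    for train, test in stratified_mc_splits(labels, seed=seed):
        # Toy stand-in for the ensemble: threshold at the midpoint of the
        # training class means, assuming higher values indicate malignancy.
        pos_mean = mean(feature[i] for i in train if labels[i] == 1)
        neg_mean = mean(feature[i] for i in train if labels[i] == 0)
        threshold = (pos_mean + neg_mean) / 2
        y_true = [labels[i] for i in test]
        scores = [feature[i] for i in test]
        y_pred = [int(s > threshold) for s in scores]
        metrics = confusion_metrics(y_true, y_pred)
        metrics["auc"] = auc(y_true, scores)
        fold_results.append(metrics)
    return {k: mean(r[k] for r in fold_results) for k in fold_results[0]}


if __name__ == "__main__":
    print(run_demo())
```

Stratifying each split keeps the malignant-to-benign ratio roughly constant across folds, which matters for a cohort this imbalanced. The threshold is fitted on the training indices only, so each fold's metrics are computed on samples the rule has not seen.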