Department of Radiology, Shenzhen People's Hospital, No.1017 Dongmen North Road, Luohu District, Shenzhen, Guangdong, 518020, PR China.
Department of Pathology, Shenzhen People's Hospital, No.1017 Dongmen North Road, Luohu District, Shenzhen, Guangdong, 518020, PR China.
Eur J Radiol. 2019 May;114:175-184. doi: 10.1016/j.ejrad.2019.03.015. Epub 2019 Mar 21.
To develop and validate an interpretable and repeatable machine learning approach for predicting the molecular subtypes of breast cancer from clinical metainformation combined with mammography and MRI images.
We retrospectively assessed 363 breast cancer cases (Luminal A 151, Luminal B 96, HER2 76, and basal-like breast cancer [BLBC] 40). Eighty-two features defined in the BI-RADS lexicon were visually described. A decision tree model based on the Chi-squared automatic interaction detector (CHAID) algorithm was used for feature selection and classification. Ten-fold cross-validation was performed to evaluate the performance of the decision tree model in terms of accuracy, positive predictive value, sensitivity, and F1-score.
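CHAID chooses splits by testing the association between each candidate feature and the outcome with a chi-squared test. The following is a minimal sketch of that core statistic only, not the study's implementation: it omits the category merging, p-value adjustment, and stopping rules of full CHAID. The function names, feature names, and the eight toy cases are hypothetical and are not taken from the study data.

```python
from collections import Counter


def chi_squared_statistic(feature_values, labels):
    """Pearson chi-squared statistic of the feature-by-label contingency table."""
    n = len(labels)
    if n == 0 or n != len(feature_values):
        raise ValueError("feature_values and labels must be non-empty and equal in length")
    observed = Counter(zip(feature_values, labels))  # cell counts
    row_totals = Counter(feature_values)             # counts per feature category
    col_totals = Counter(labels)                     # counts per class
    statistic = 0.0
    for category, row_n in row_totals.items():
        for label, col_n in col_totals.items():
            expected = row_n * col_n / n
            statistic += (observed[(category, label)] - expected) ** 2 / expected
    return statistic


def rank_features(features, labels):
    """Return feature names sorted from strongest to weakest association."""
    scores = {name: chi_squared_statistic(values, labels) for name, values in features.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: 8 cases, 2 classes, 2 categorical descriptors.
toy_labels = ["A", "A", "A", "A", "B", "B", "B", "B"]
toy_features = {
    "margin": ["spiculated"] * 4 + ["circumscribed"] * 4,         # perfectly associated
    "other": ["yes", "no", "yes", "no", "yes", "no", "yes", "no"],  # unrelated to class
}

if __name__ == "__main__":
    print(rank_features(toy_features, toy_labels))
```

In this toy setting the descriptor that separates the two classes perfectly ranks first, which is the behaviour a chi-squared-driven split selection relies on.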
Seven of the 82 variables were retained by the decision tree-based feature selection and used as features for classifying the molecular subtypes: mass margin and calcification on mammography; mass margin, type of kinetic curve in the delayed phase, mass internal enhancement characteristics, and non-mass enhancement distribution on MRI; and breastfeeding history. The accuracy of the decision tree model was 74.1%. By molecular subtype, the sensitivity, positive predictive value, and F1-score were 79.47%, 75.47%, and 77.42% for Luminal A; 64.58%, 55.86%, and 59.90% for Luminal B; 81.58%, 95.38%, and 87.94% for HER2; and 62.50%, 89.29%, and 73.53% for BLBC, respectively.
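The reported metrics can be cross-checked arithmetically, since the F1-score is the harmonic mean of sensitivity and positive predictive value. The sketch below recomputes each F1-score from the reported values and estimates the overall accuracy. It assumes that the number of correctly classified cases per subtype can be recovered by rounding sensitivity times group size; that reconstruction and the helper names are assumptions, not part of the study.

```python
def f1_score(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and positive predictive value (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values as reported: (cases, sensitivity %, PPV %, F1-score %)
reported = {
    "Luminal A": (151, 79.47, 75.47, 77.42),
    "Luminal B": (96, 64.58, 55.86, 59.90),
    "HER2": (76, 81.58, 95.38, 87.94),
    "BLBC": (40, 62.50, 89.29, 73.53),
}


def overall_accuracy(metrics):
    """Estimate accuracy (%) from per-class sensitivity and class sizes.

    Assumption: correct cases per class = round(cases * sensitivity / 100).
    """
    correct = sum(round(n * sens / 100) for n, sens, _, _ in metrics.values())
    total = sum(n for n, _, _, _ in metrics.values())
    return 100 * correct / total


if __name__ == "__main__":
    for subtype, (n, sens, ppv, f1) in reported.items():
        print(f"{subtype}: recomputed F1 = {f1_score(sens, ppv):.2f}%, reported = {f1:.2f}%")
    print(f"Estimated overall accuracy = {overall_accuracy(reported):.1f}%")
```

Under these assumptions, the recomputed F1-scores agree with the reported ones to within rounding, and the estimated accuracy (269 of 363 cases) matches the reported 74.1%.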
We applied a fully "white box" machine learning method to predict the molecular subtype of breast cancer from BI-RADS feature descriptions in a multimodal setting. Combining BI-RADS features from both mammography and MRI improved the accuracy and robustness of the prediction. Because BI-RADS is widely applicable and accepted, the proposed method can be readily applied regardless of variability in imaging vendors and settings.