University of Manitoba, Winnipeg, Manitoba, Canada.
University of Minnesota, Minneapolis, Minnesota, USA.
Bone. 2022 Aug;161:116427. doi: 10.1016/j.bone.2022.116427. Epub 2022 Apr 27.
Convolutional neural networks (CNNs) can identify vertebral compression fractures in GE vertebral fracture assessment (VFA) images with high balanced accuracy, but their performance on Hologic VFAs is unknown. To obtain good classification performance, supervised machine learning requires balanced and labeled training data. Active learning is an iterative data annotation process that can reduce the cost of labeling medical image data and mitigate class imbalance.
To train CNNs to identify vertebral fractures in Hologic VFAs using an active learning approach, and evaluate the ability of CNNs to generalize to both Hologic and GE VFA images.
VFAs were obtained from the OsteoLaus Study (labeled Hologic Discovery A, n = 2726), the Manitoba Bone Mineral Density Program (labeled GE Prodigy and iDXA, n = 12,742), and the Canadian Longitudinal Study on Aging (CLSA, unlabeled Hologic Discovery A, n = 17,190). Unlabeled CLSA VFAs were split into five equal-sized partitions (n = 3438) and reviewed sequentially using active learning. Based on predicted fracture probability, 17.6% (n = 3032) of the unlabeled VFAs were selected for expert review using the modified algorithm-based qualitative (mABQ) method. CNNs were simultaneously trained on Hologic, GE dual-energy and GE single-energy VFAs. Two ensemble CNNs were constructed using the maximum and mean predicted probability from six separately trained CNNs that differed due to stochastic variation. CNNs were evaluated against the OsteoLaus validation set (n = 408) during the active learning process; ensemble performance was measured against the OsteoLaus test set (n = 819).
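To make the selection and ensembling steps concrete, the short Python sketch below illustrates them under stated assumptions: the probability threshold used to flag cases for mABQ review and the array names are hypothetical placeholders, since the abstract specifies only that cases were selected by predicted fracture probability; the maximum/mean ensembling of six member CNNs follows the description above.

import numpy as np

def select_for_expert_review(unlabeled_probs, threshold=0.5):
    """Return indices of unlabeled VFAs flagged for expert (mABQ) labeling in
    the next active-learning round, based on predicted fracture probability.
    The threshold is an illustrative assumption, not the study's criterion."""
    return np.where(unlabeled_probs >= threshold)[0]

def ensemble_probabilities(member_probs):
    """member_probs has shape (6, n_images): fracture probabilities from the
    six separately trained CNNs. Returns the maximum-ensemble and
    mean-ensemble predictions described in the methods."""
    return member_probs.max(axis=0), member_probs.mean(axis=0)

# Example with random numbers standing in for real CNN outputs on one
# active-learning partition of 3438 unlabeled CLSA VFAs.
rng = np.random.default_rng(0)
member_probs = rng.random((6, 3438))
max_ensemble, mean_ensemble = ensemble_probabilities(member_probs)
review_indices = select_for_expert_review(mean_ensemble)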
The baseline CNN, prior to active learning, achieved 55.0% sensitivity, 97.9% specificity, 57.9% positive predictive value (PPV), F1-score 56.4%. Through active learning, 2942 CLSA Hologic VFAs (492 fractures) were added to the training data, increasing the proportion of Hologic VFAs with fractures from 4.2% to 12.5%. With active learning, CNN performance improved to 80.0% sensitivity, 99.7% specificity, 94.1% PPV, F1-score 86.5%. The CNN maximum ensemble achieved 91.9% sensitivity (100% for grade 3 and 95.5% for grade 2 fractures), 99.0% specificity, 81.0% PPV, F1-score 86.1%.
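As an arithmetic check on the results above, the F1-score is the harmonic mean of sensitivity (recall) and PPV (precision). The snippet below is only an illustrative consistency check, not part of the study's code; it reproduces each reported F1-score from the reported sensitivity and PPV.

def f1_score(sensitivity, ppv):
    # Harmonic mean of recall (sensitivity) and precision (PPV).
    return 2 * sensitivity * ppv / (sensitivity + ppv)

print(round(f1_score(0.550, 0.579), 3))  # baseline CNN          -> 0.564 (56.4%)
print(round(f1_score(0.800, 0.941), 3))  # after active learning -> 0.865 (86.5%)
print(round(f1_score(0.919, 0.810), 3))  # maximum ensemble      -> 0.861 (86.1%)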
Simultaneously training on a composite dataset of both Hologic and GE VFAs enabled development of a single, manufacturer-independent CNN that generalized to both scanner types with good classification performance. Active learning can reduce class imbalance and produce an effective medical image classifier while labeling only a subset of the available unlabeled image data, thereby reducing the time and cost required to train a machine learning model.