IEEE J Biomed Health Inform. 2022 Mar;26(3):1239-1250. doi: 10.1109/JBHI.2021.3102090. Epub 2022 Mar 7.
Knee osteoarthritis (OA) is a chronic disease that considerably reduces patients' quality of life. Preventive therapies require early detection and lifelong monitoring of OA progression. In clinical practice, the severity of OA is classified with the Kellgren and Lawrence (KL) grading system, which ranges from KL-0 to KL-4. Recently, deep learning methods have been applied to OA severity assessment to improve accuracy and efficiency. However, the task remains challenging because of the ambiguity between adjacent grades, especially in early-stage OA. Low-confidence samples, which are less representative than typical ones, undermine the training process. Targeting the uncertainty in the OA dataset, we propose a novel learning scheme that dynamically separates the data into two sets according to their reliability. In addition, we design a hybrid loss function that helps the CNN learn from each of the two sets accordingly. With the proposed approach, we emphasize the typical samples and control the impact of low-confidence cases. Experiments are conducted with five-fold cross-validation on a five-class task and an early-stage OA task. Our method achieves a mean accuracy of 70.13% on the five-class OA assessment task, outperforming all other state-of-the-art methods. Although early-stage OA detection still benefits from manual selection of the lesion region, our approach achieves superior performance on the KL-0 vs. KL-2 task. Moreover, we design an experiment to validate large-scale automatic data refining during training. The result verifies the method's ability to characterize low-confidence samples. The dataset used in this paper was obtained from the Osteoarthritis Initiative.
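The abstract does not state the criterion used to separate the two sets or the form of the hybrid loss, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes reliability is measured by the softmax probability the current model assigns to the labeled grade, that the split is recomputed at every step (hence "dynamic"), and that low-confidence samples simply receive a smaller loss weight. The names `hybrid_loss`, `threshold`, and `low_weight` are hypothetical.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def hybrid_loss(logits, labels, threshold=0.5, low_weight=0.3):
    """Weighted cross-entropy over a batch (illustrative assumption only).

    Samples whose labeled-class probability reaches `threshold` are treated
    as high-confidence and keep full weight; the rest are treated as
    low-confidence and are down-weighted by `low_weight`.
    Returns the scalar loss and the boolean high-confidence mask.
    """
    probs = softmax(logits)
    p_true = probs[np.arange(len(labels)), labels]
    ce = -np.log(p_true + 1e-12)              # per-sample cross-entropy
    high = p_true >= threshold                # split recomputed on every call
    weights = np.where(high, 1.0, low_weight)
    return float((weights * ce).mean()), high

# Usage: two samples over five KL grades (KL-0 .. KL-4).
logits = np.array([[4.0, 0.0, 0.0, 0.0, 0.0],   # clearly KL-0
                   [0.0, 0.2, 0.1, 0.0, 0.0]])  # ambiguous, labeled KL-1
labels = np.array([0, 1])
loss, high_mask = hybrid_loss(logits, labels)
```

Because the mask depends on the current predictions, a sample can move between the two sets as training proceeds; a real implementation would apply the same logic to the CNN's outputs inside the training loop.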