Li Matthew D, Chang Ken, Bearce Ben, Chang Connie Y, Huang Ambrose J, Campbell J Peter, Brown James M, Singh Praveer, Hoebel Katharina V, Erdoğmuş Deniz, Ioannidis Stratis, Palmer William E, Chiang Michael F, Kalpathy-Cramer Jayashree
1. Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Boston, MA, USA.
2. Division of Musculoskeletal Imaging and Intervention, Department of Radiology, Massachusetts General Hospital, Boston, MA, USA.
NPJ Digit Med. 2020 Mar 26;3:48. doi: 10.1038/s41746-020-0255-1. eCollection 2020.
Using medical images to evaluate disease severity and change over time is a routine and important task in clinical decision making. Grading systems are often used, but are unreliable as domain experts disagree on disease severity category thresholds. These discrete categories also do not reflect the underlying continuous spectrum of disease severity. To address these issues, we developed a convolutional Siamese neural network approach to evaluate disease severity at single time points and change between longitudinal patient visits on a continuous spectrum. We demonstrate this in two medical imaging domains: retinopathy of prematurity (ROP) in retinal photographs and osteoarthritis in knee radiographs. Our patient cohorts consist of 4861 images from 870 patients in the Imaging and Informatics in Retinopathy of Prematurity (i-ROP) cohort study and 10,012 images from 3021 patients in the Multicenter Osteoarthritis Study (MOST), both of which feature longitudinal imaging data. Multiple expert clinician raters ranked 100 retinal images and 100 knee radiographs from held-out test sets for severity of ROP and osteoarthritis, respectively. The Siamese neural network output for each image, computed in comparison to a pool of normal reference images, correlates with disease severity rank (ρ = 0.87 for ROP and ρ = 0.89 for osteoarthritis), both within and between the clinical grading categories. Thus, this output can represent the continuous spectrum of disease severity at any single time point. The difference in these outputs can be used to show change over time. Alternatively, paired images from the same patient at two time points can be compared directly with the Siamese neural network, yielding an additional continuous measure of change between images. Importantly, our approach does not require manual localization of the pathology of interest and requires only a binary label (same versus different) for training. The location of disease and the site of change detected by the algorithm can be visualized with an occlusion sensitivity map-based approach. For a longitudinal binary change detection task, our Siamese neural networks achieve test set receiver operating characteristic areas under the curve (AUCs) of up to 0.90 in evaluating ROP or knee osteoarthritis change, depending on the change detection strategy. The overall performance on this binary task is similar to that of a conventional convolutional deep neural network trained for multi-class classification. Our results demonstrate that convolutional Siamese neural networks can be a powerful tool for evaluating the continuous spectrum of disease severity and change in medical imaging.
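For readers unfamiliar with the architecture, the sketch below illustrates the general idea in PyTorch: two images pass through one encoder with shared weights, and the Euclidean distance between their embeddings is the network's continuous output. Scoring a single image against a pool of normal reference images then reduces to summarizing this distance over the pool, and the binary same-versus-different supervision can be implemented with a standard contrastive loss. This is a minimal illustration, not the authors' released code; the small encoder, the helper names (SiameseNet, severity_score, contrastive_loss), and the use of the median over the reference pool are assumptions made for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SiameseNet(nn.Module):
    """Two inputs share one convolutional encoder; the Euclidean distance
    between their embeddings is the continuous output."""

    def __init__(self, embed_dim: int = 128):
        super().__init__()
        # Small stand-in encoder; the published model uses a deeper backbone.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(128, embed_dim),
        )

    def embed(self, x: torch.Tensor) -> torch.Tensor:
        return self.encoder(x)

    def forward(self, x1: torch.Tensor, x2: torch.Tensor) -> torch.Tensor:
        # Shared weights: the same encoder processes both images.
        z1, z2 = self.embed(x1), self.embed(x2)
        return F.pairwise_distance(z1, z2)  # one distance per image pair


def contrastive_loss(dist: torch.Tensor, same_label: torch.Tensor,
                     margin: float = 2.0) -> torch.Tensor:
    """Binary same/different supervision: pull 'same' pairs together,
    push 'different' pairs at least `margin` apart (standard contrastive loss)."""
    return torch.mean(same_label * dist.pow(2) +
                      (1 - same_label) * F.relu(margin - dist).pow(2))


def severity_score(model: SiameseNet, image: torch.Tensor,
                   reference_pool: torch.Tensor) -> torch.Tensor:
    """Continuous severity estimate for one image of shape (1, 3, H, W):
    its distance to each image in a pool of normal references, summarized
    here by the median (an illustrative choice)."""
    with torch.no_grad():
        dists = model(image.expand(reference_pool.size(0), -1, -1, -1),
                      reference_pool)
    return dists.median()
```

In this framing, the two change-detection strategies described in the abstract correspond to (a) differencing the severity_score values of two visits and (b) feeding the two visit images directly into the network as a pair and reading off their distance.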