F. Schmaranzer, R. Helfenstein, T. D. Lerch, K. A. Siebenrock, M. Tannast, Department of Orthopaedic Surgery, Inselspital Bern, University of Bern, Bern, Switzerland; G. Zeng, G. Zheng, Institute for Surgical Technology and Biomechanics, University of Bern, Bern, Switzerland; F. Schmaranzer, E. N. Novais, J. D. Wylie, Y-J. Kim, Department of Orthopaedic Surgery, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA.
Clin Orthop Relat Res. 2019 May;477(5):1036-1052. doi: 10.1097/CORR.0000000000000755.
BACKGROUND: The time-consuming and user-dependent postprocessing of biochemical cartilage MRI has limited the use of delayed gadolinium-enhanced MRI of cartilage (dGEMRIC). An automated analysis of biochemical three-dimensional (3-D) images could deliver a more time-efficient and objective evaluation of cartilage composition, and provide comprehensive information about cartilage thickness, surface area, and volume compared with manual two-dimensional (2-D) analysis.
QUESTIONS/PURPOSES: (1) How does the 3-D analysis of cartilage thickness and dGEMRIC index using both a manual and a new automated method compare with the manual 2-D analysis (gold standard)? (2) How does the manual 3-D analysis of regional patterns of dGEMRIC index, cartilage thickness, surface area, and volume compare with a new automated method? (3) What are the interobserver reliability and intraobserver reproducibility of software-assisted manual 3-D and automated 3-D analysis of dGEMRIC indices, thickness, surface area, and volume for two readers at two time points?
METHODS: In this IRB-approved, retrospective, diagnostic study, we identified the first 25 symptomatic hips (23 patients) that underwent contrast-enhanced MRI at 3T, including a 3-D dGEMRIC sequence, for assessment of intraarticular pathology due to structural hip deformities. Of the 23 patients, 10 (43%) were male; 16 (64%) hips had a cam deformity and 16 (64%) hips had either a pincer deformity or acetabular dysplasia. The automated deep-learning-based approach for 3-D segmentation of hip cartilage models was developed in two steps. First, one reader (FS) provided a manual 3-D segmentation of hip cartilage, which served as training data for the neural network and as input data for the manual 3-D analysis. Next, we developed a deep convolutional neural network to obtain an automated 3-D cartilage segmentation, which we used as input data for the automated 3-D analysis. Dedicated software was developed to analyze the manually and automatically generated 3-D cartilage models. Manual 2-D analysis of dGEMRIC indices and cartilage thickness was performed at each "full-hour" position on radial images and served as the gold standard for comparison with the corresponding measurements of the manual and the automated 3-D analyses. We measured dGEMRIC index, cartilage thickness, surface area, and volume for each of the four joint quadrants and compared the manual and the automated 3-D analyses using mean differences. Agreement between the techniques was assessed using intraclass correlation coefficients (ICCs). The overlap between 3-D cartilage volumes was assessed using Dice coefficients, and the mean of all distances between surface points of the models was calculated as the average surface distance. The interobserver reliability and intraobserver reproducibility of the software-assisted manual 3-D and the automated 3-D analyses of dGEMRIC indices, thickness, surface area, and volume were assessed for two readers at two different time points using ICCs.
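The two segmentation-overlap metrics named above can be sketched as follows. This is a minimal illustration, not the study's software: the function names `dice_coefficient` and `average_surface_distance` are hypothetical, the masks are assumed to be binary voxel arrays of equal shape, and the surface distance is assumed to be the symmetric mean of nearest-neighbor distances between two surface point sets (the abstract does not state the exact definition used).

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) of two binary voxel masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Two empty masks are treated as identical (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def average_surface_distance(points_a, points_b):
    """Symmetric mean nearest-neighbor distance between two surface point sets.

    Assumption: each input is an (N, 3) array of surface coordinates in mm.
    Brute-force pairwise distances; fine for small point sets only.
    """
    pa = np.asarray(points_a, dtype=float)
    pb = np.asarray(points_b, dtype=float)
    # d[i, j] = Euclidean distance between point i of A and point j of B.
    d = np.linalg.norm(pa[:, None, :] - pb[None, :, :], axis=2)
    a_to_b = d.min(axis=1)  # nearest point in B for every point in A
    b_to_a = d.min(axis=0)  # nearest point in A for every point in B
    return (a_to_b.sum() + b_to_a.sum()) / (len(pa) + len(pb))
```

For large surface meshes, a spatial index such as a k-d tree would replace the brute-force distance matrix; the definition of the metric stays the same.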
RESULTS: Compared with the manual 2-D analysis, mean overall differences in dGEMRIC indices were comparable and agreement was almost perfect for both the manual 3-D analysis (8 ± 44 ms, p = 0.005; ICC = 0.980) and the automated 3-D analysis (7 ± 43 ms, p = 0.015; ICC = 0.982). Agreement for measuring overall cartilage thickness was almost perfect for both 3-D methods (ICC = 0.855 and 0.881) versus the manual 2-D analysis. A mean difference of -0.2 ± 0.5 mm (p < 0.001) was observed for overall cartilage thickness between the automated 3-D analysis and the manual 2-D analysis; no such difference was observed between the manual 3-D and the manual 2-D analysis. Regional patterns were comparable for both 3-D methods. The highest dGEMRIC indices were found posterosuperiorly (manual: 602 ± 158 ms, p = 0.013; automated: 602 ± 158 ms, p = 0.012). The thickest cartilage was found anteroinferiorly (manual: 5.3 ± 0.8 mm, p < 0.001; automated: 4.3 ± 0.6 mm, p < 0.001). The smallest surface area was found anteroinferiorly (manual: 134 ± 60 mm², p < 0.001; automated: 155 ± 60 mm², p < 0.001). The largest volume was found anterosuperiorly (manual: 2343 ± 492 mm³, p < 0.001; automated: 2294 ± 467 mm³, p < 0.001). Mean average surface distance was 0.26 ± 0.13 mm and mean Dice coefficient was 86% ± 3%. Intraobserver reproducibility and interobserver reliability were near perfect for overall analysis of dGEMRIC indices, thickness, surface area, and volume (ICC range, 0.962-1).
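For readers unfamiliar with how an agreement value such as the ICCs above is obtained, the following is a minimal sketch. It assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1) in the Shrout and Fleiss notation; the abstract does not state which ICC form the authors used, so this choice and the function name `icc_2_1` are assumptions for illustration only.

```python
import numpy as np


def icc_2_1(ratings):
    """ICC(2,1) for an (n_subjects, k_raters) table of measurements.

    Assumption: two-way random effects, absolute agreement, single measures.
    Requires at least two subjects and two raters (or methods).
    """
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between subjects
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between raters
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols                # residual

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Each row would hold one hip (or one quadrant) and each column one method or reader, for example the manual 2-D and automated 3-D dGEMRIC index of the same region. Because this form penalizes systematic offsets between columns, a constant bias between methods lowers the value even when the measurements are perfectly correlated.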
CONCLUSIONS: The presented deep learning approach for fully automatic segmentation of hip cartilage enables an accurate, reliable, and reproducible analysis of dGEMRIC indices, thickness, surface area, and volume. This time-efficient and objective analysis of biochemical cartilage composition and morphology has the potential to improve patient selection in femoroacetabular impingement (FAI) surgery and to aid surgeons in planning acetabuloplasty and periacetabular osteotomies in pincer FAI and hip dysplasia. In addition, this validation paves the way for large-scale use of the method in prospective trials that longitudinally monitor the effect of reconstructive hip surgery and the natural course of osteoarthritis.
LEVEL OF EVIDENCE: Level III, diagnostic study.