Welch Nicole, Montgomery Blake K, Ross Kirsten, Mota Frank, Mo Michelle, Grigoriou Emmanouil, Tarchala Magdalena, Roaten John, Miller Patricia, Hedequist Daniel, Birch Craig M
Department of Orthopedic Surgery, Boston Children's Hospital, Boston, USA.
Cureus. 2024 Jul 18;16(7):e64851. doi: 10.7759/cureus.64851. eCollection 2024 Jul.
Objective: This study aimed to assess the reliability and reproducibility of the AO Spine Thoracolumbar Injury Classification System when applied using virtual reality (VR). We hypothesized that VR is a highly reliable and reproducible method for classifying traumatic spine injuries.

Methods: VR 3D models were created from CT scans of 26 pediatric patients with thoracolumbar spine injuries. Seven orthopedic trainees were trained on the VR platform and the AO Spine Thoracolumbar Injury Classification System. Each rater classified every case twice, with the two reads performed two weeks apart and the image order randomized. Classifications were summarized by primary class and subclass for both reads. Intra-observer reproducibility was quantified by Fleiss' kappa (kF) for primary classifications and Krippendorff's alpha (aK) for subclassifications, with 95% confidence intervals (CIs), for each rater and across all raters. Inter-observer reliability was quantified by kF for primary classifications and aK for subclassifications, with 95% CIs, across all raters for the first read, the second read, and all reads combined. Agreement was interpreted as follows: 0-0.2, slight; 0.2-0.4, fair; 0.4-0.6, moderate; 0.6-0.8, substantial; and >0.8, almost perfect.

Results: A total of 364 classifications were submitted by seven raters. Intra-observer reproducibility for individual raters ranged from moderate (kF=0.55) to almost perfect (kF=0.94) for primary classifications and from substantial (aK=0.68) to almost perfect (aK=0.91) for subclassifications. Across all raters, reproducibility was substantial for both the primary class (kF=0.71; 95% CI=0.61-0.82) and the subclass (aK=0.79; 95% CI=0.69-0.86). For primary classifications, inter-observer reliability was substantial for the first read (kF=0.63; 95% CI=0.57-0.69), moderate for the second read (kF=0.58; 95% CI=0.52-0.64), and substantial for all reads combined (kF=0.61; 95% CI=0.56-0.65). For subclassifications, inter-observer reliability was substantial for the first read (aK=0.74; 95% CI=0.58-0.83), the second read (aK=0.70; 95% CI=0.53-0.80), and all reads combined (aK=0.72; 95% CI=0.60-0.79).

Conclusions: Based on our findings, VR is a reliable and reproducible method for classifying pediatric spine trauma, and it can also serve as an educational tool for trainees. Further research is needed to evaluate its application to other spine conditions.
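To make the agreement statistics above concrete, the following is a minimal Python sketch of how Fleiss' kappa can be computed from a table of categorical ratings and then mapped onto the interpretation bands listed in the abstract. It is not the authors' analysis code. The function names (`fleiss_kappa`, `interpret`) and the toy ratings are hypothetical and for illustration only. Because the stated bands share their boundary values, the sketch assumes each band includes its upper bound (for example, exactly 0.6 counts as moderate), and it treats negative values as outside the reported scale.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of subjects, each rated by the same number of raters.

    ratings: list of lists of category labels, one inner list per subject.
    categories: list of all possible category labels.
    """
    n_subjects = len(ratings)
    n_raters = len(ratings[0])
    # Count how many raters assigned each category to each subject.
    counts = []
    for row in ratings:
        tally = Counter(row)
        counts.append([tally[c] for c in categories])
    # Observed agreement per subject, then averaged over subjects.
    p_i = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_subjects
    # Expected agreement by chance from the overall category proportions.
    p_j = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(len(categories))
    ]
    p_e = sum(p * p for p in p_j)
    return (p_bar - p_e) / (1 - p_e)


def interpret(value):
    """Map an agreement coefficient to the bands listed in the abstract.

    Assumption: each band includes its upper bound.
    """
    if value < 0:
        return "below the reported scale"
    for upper, label in [(0.2, "slight"), (0.4, "fair"),
                         (0.6, "moderate"), (0.8, "substantial")]:
        if value <= upper:
            return label
    return "almost perfect"


# Toy data (hypothetical, not study data): three cases, three raters each.
toy = [["A", "A", "A"], ["B", "B", "B"], ["C", "C", "C"]]
toy_kappa = fleiss_kappa(toy, ["A", "B", "C"])
print(toy_kappa, interpret(toy_kappa))

# Values reported in the abstract, mapped to their bands.
for reported in (0.55, 0.58, 0.61, 0.71, 0.94):
    print(reported, interpret(reported))

# The reported total is consistent with 26 cases x 7 raters x 2 reads.
print(26 * 7 * 2)
```

The same `interpret` mapping applies to Krippendorff's alpha values, since the abstract uses a single set of bands for both coefficients.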