Department of Orthopaedic Surgery, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA.
J Bone Joint Surg Am. 2013 Sep 4;95(17):1600-4. doi: 10.2106/JBJS.L.00586.
Interobserver reliability for the classification of proximal humeral fractures is limited. The aim of this study was to test the null hypothesis that interobserver reliability of the AO classification of proximal humeral fractures, the preferred treatment, and fracture characteristics is the same for two-dimensional (2-D) and three-dimensional (3-D) computed tomography (CT).
Members of the Science of Variation Group (fully trained, practicing orthopaedic and trauma surgeons from around the world) were randomized to evaluate radiographs and either 2-D CT or 3-D CT images of fifteen proximal humeral fractures via a web-based survey and to respond to the following four questions: (1) Is the greater tuberosity displaced? (2) Is the humeral head split? (3) Is the arterial supply compromised? (4) Is the glenohumeral joint dislocated? They also classified each fracture according to the AO system and indicated their preferred treatment (operative or nonoperative). Agreement among observers was assessed with use of the multirater kappa (κ) measure.
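To make the agreement measure concrete, the following is a minimal sketch of Fleiss' kappa, one common multirater kappa. The abstract does not state which multirater variant was used, so the choice of Fleiss' formula is an assumption, and the function name `fleiss_kappa` and the example counts are hypothetical illustrations, not study data.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who assigned subject i
    (here, a fracture) to category j (here, an answer such as yes/no).
    Every subject must be rated by the same number of raters.
    """
    n_subjects = len(counts)
    if n_subjects == 0:
        raise ValueError("need at least one subject")
    n_raters = sum(counts[0])
    if n_raters < 2:
        raise ValueError("need at least two raters per subject")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every subject must have the same number of ratings")
    n_categories = len(counts[0])

    # Share of all ratings that fell into each category.
    total = n_subjects * n_raters
    p_j = [sum(row[j] for row in counts) / total for j in range(n_categories)]

    # Observed agreement per subject: fraction of rater pairs that agree.
    pairs = n_raters * (n_raters - 1)
    p_i = [(sum(c * c for c in row) - n_raters) / pairs for row in counts]

    p_observed = sum(p_i) / n_subjects
    p_expected = sum(p * p for p in p_j)  # agreement expected by chance
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical data: 4 fractures, 5 raters, columns = [yes, no] for
# "Is the greater tuberosity displaced?"
example_counts = [[5, 0], [4, 1], [2, 3], [0, 5]]
print(round(fleiss_kappa(example_counts), 3))
```

A kappa of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance.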
Interobserver reliability of the AO classification, fracture characteristics, and preferred treatment generally ranged from "slight" to "fair." A few small but statistically significant differences were found. Observers randomized to the 2-D CT group had slightly but significantly better agreement on displacement of the greater tuberosity (κ = 0.35 compared with 0.30, p < 0.001) and on the AO classification (κ = 0.18 compared with 0.17, p = 0.018). A subgroup analysis of the AO classification results revealed that shoulder and elbow surgeons, orthopaedic trauma surgeons, and surgeons in the United States had slightly greater reliability on 2-D CT, whereas surgeons in practice for ten years or less and surgeons from other subspecialties had slightly greater reliability on 3-D CT.
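The quoted labels "slight" and "fair" match the widely used Landis and Koch categories for interpreting kappa. The abstract does not name the scale, so treating it as Landis and Koch is an assumption; the sketch below maps the κ values reported above to those categories, using a hypothetical helper named `landis_koch_label`.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to its Landis and Koch agreement category.

    Assumes the Landis and Koch bands: below 0 poor, 0.00-0.20 slight,
    0.21-0.40 fair, 0.41-0.60 moderate, 0.61-0.80 substantial,
    0.81-1.00 almost perfect.
    """
    if kappa > 1.0:
        raise ValueError("kappa cannot exceed 1")
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappa values taken from the results reported in the abstract.
reported = {
    "greater tuberosity displacement, 2-D CT": 0.35,
    "greater tuberosity displacement, 3-D CT": 0.30,
    "AO classification, 2-D CT": 0.18,
    "AO classification, 3-D CT": 0.17,
}
for name, value in reported.items():
    print(f"{name}: kappa = {value:.2f} ({landis_koch_label(value)})")
```

Under this assumed scale, both imaging groups fall into the same category for each item, which is consistent with the statement that the statistically significant differences were small.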
Proximal humeral fracture classifications may be helpful conceptually, but they have poor interobserver reliability even when 3-D rather than 2-D CT is utilized. This may contribute to the similarly poor interobserver reliability that was observed for selection of the treatment for proximal humeral fractures. The lack of a reliable classification confounds efforts to compare the outcomes of treatment methods among different clinical trials and reports.