Department of Neurosurgery, Hospital 12 de Octubre, Universidad Complutense de Madrid, Spain.
J Neurosurg. 2013 Jan;118(1):84-93. doi: 10.3171/2012.8.JNS12100. Epub 2012 Sep 21.
This study had 2 main purposes: first, to assess the feasibility and reliability of 2 quantitative methods for measuring bleeding volume in patients with spontaneous subarachnoid hemorrhage (SAH), and second, to compare these methods with other qualitative and semiquantitative scales in terms of reliability and accuracy in predicting delayed cerebral ischemia (DCI) and outcome.
A prospective series of 150 patients consecutively admitted to the Hospital 12 de Octubre over a 4-year period was included in the study. All patients had a diagnosis of SAH and a diagnostic CT scan obtained within the first 24 hours after symptom onset. All CT scans were evaluated by 2 independent, blinded observers using 2 different quantitative methods to estimate the aneurysmal bleeding volume: region of interest (ROI) volume and the Cavalieri method. The images were also graded using the Fisher scale, modified Fisher scale, Claasen scale, and the semiquantitative Hijdra scale. Weighted κ coefficients were calculated to assess the interobserver reliability of the qualitative scales and the Hijdra scores. To assess the intermethod and interrater reliability of the volumetric measurements, intraclass correlation coefficients (ICCs) were used together with the methodology proposed by Bland and Altman. Finally, weighted κ coefficients were calculated for the quartiles of the volumetric measurements to ease comparison with the qualitative scales. Patients surviving more than 48 hours were included in the analysis of DCI predisposing factors, which used the chi-square or Mann-Whitney U test. Logistic regression analysis was used to obtain adjusted ORs for DCI and outcome across the quartiles of bleeding volume. The diagnostic accuracy of each scale was obtained by calculating the area under the receiver operating characteristic curve (AUC).
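As an illustration of two of the methods named above, the following minimal Python sketch shows the general form of a Cavalieri volume estimate (summed slice areas multiplied by slice spacing, with areas optionally obtained by point counting) and of Bland-Altman limits of agreement. It is not the study's software: the function names, the millimeter units, the conversion to milliliters, and the 1.96 multiplier for 95% limits are assumptions made for this example.

```python
from statistics import mean, stdev


def point_count_area(points_hit, grid_spacing_mm):
    """Area of one slice from point counting (hypothetical helper).

    Each grid point that falls on blood stands for one grid cell,
    i.e. grid_spacing_mm ** 2 square millimeters.
    """
    return points_hit * grid_spacing_mm ** 2


def cavalieri_volume_ml(slice_areas_mm2, slice_spacing_mm):
    """Cavalieri estimate: slice spacing times the sum of slice areas.

    Returns milliliters (1 mL = 1000 mm^3); the unit choice is an assumption.
    """
    volume_mm3 = slice_spacing_mm * sum(slice_areas_mm2)
    return volume_mm3 / 1000.0


def bland_altman_limits(method_a, method_b):
    """Bias and approximate 95% limits of agreement between paired measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    areas = [point_count_area(n, 5.0) for n in (4, 8, 4)]
    print(cavalieri_volume_ml(areas, 5.0))
    print(bland_altman_limits([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))
```

In this form, agreement between the ROI and Cavalieri volumes (or between the 2 observers) would be summarized by how close the bias is to zero and how narrow the limits are.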
Qualitative scores showed moderate interobserver reproducibility (weighted κ coefficients were always < 0.65), whereas the semiquantitative and quantitative scores had very strong interobserver reproducibility. Reliability was very high for all quantitative measures, as expressed by the ICCs for intermethod and interobserver agreement. Poor outcome and DCI occurred in 49% and 31% of patients, respectively. Larger bleeding volumes were related to a poorer outcome and a higher risk of developing DCI, and the proportion of patients suffering DCI or a poor outcome increased with each quartile; this relationship persisted after adjustment for the main clinical factors related to outcome. Quantitative analysis of total bleeding volume achieved the highest AUC and had a greater discriminative ability than the qualitative scales for predicting the development of DCI and outcome.
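The two summary statistics reported here can be sketched in a few lines of Python. This is only a hedged illustration of how a weighted κ and an AUC are commonly computed: the abstract does not state which weighting scheme was used, so linear weights are an assumption, and the function names and the integer coding of grades (0 to k − 1) are hypothetical.

```python
def weighted_kappa(rater1, rater2, n_categories):
    """Linearly weighted Cohen's kappa for two raters.

    Grades must be integers in 0 .. n_categories - 1. Linear weights
    are an assumption; the source does not name the weighting scheme.
    """
    n = len(rater1)
    k = n_categories
    # Observed joint proportions of (rater1 grade, rater2 grade).
    obs = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater1, rater2):
        obs[a][b] += 1.0 / n
    row = [sum(obs[i]) for i in range(k)]
    col = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    # Disagreement weighted by distance between grades; the 1/(k-1)
    # normalization cancels between numerator and denominator.
    observed = sum(abs(i - j) * obs[i][j] for i in range(k) for j in range(k))
    expected = sum(abs(i - j) * row[i] * col[j] for i in range(k) for j in range(k))
    return 1.0 - observed / expected


def auc(scores, events):
    """AUC as the probability that a patient with the event scores higher
    than a patient without it (ties count one half)."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up grades and volumes for illustration only, not study data.
    print(weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))
    print(auc([1.0, 2.0, 3.0, 4.0], [0, 1, 0, 1]))
```

Applied to quartiles of bleeding volume, the same weighted κ function would put the volumetric measures on the same footing as the categorical scales, which is the comparison the study describes.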
The use of quantitative measures may reduce interobserver variability in comparison with categorical scales. These measures are feasible using dedicated software and show a better prognostic capability in relation to outcome and DCI than conventional categorical scales.