Grant Andrew D M, Thavendiranathan Paaladinesh, Rodriguez L Leonardo, Kwon Deborah, Marwick Thomas H
Department of Cardiovascular Sciences, University of Calgary, Calgary, Alberta, Canada; Heart and Vascular Institute, Cleveland Clinic, Cleveland, Ohio.
Department of Cardiology, University Health Network, University of Toronto, Toronto, Ontario, Canada; Heart and Vascular Institute, Cleveland Clinic, Cleveland, Ohio.
J Am Soc Echocardiogr. 2014 Mar;27(3):277-84. doi: 10.1016/j.echo.2013.11.016. Epub 2013 Dec 25.
Multiparametric scoring of valvular regurgitation may compromise interobserver agreement, as readers weight parameters differently. The aims of this study were to quantify interobserver variability in the grading of chronic tricuspid regurgitation (TR), develop an algorithm for grading TR, and assess the effect of this algorithm on concordance and accuracy.
On the basis of current guidelines, two experts graded the severity of TR by consensus in 40 patients with a spectrum of TR severity. A subgroup of patients (n = 18) also had TR severity assessed by cardiac magnetic resonance. Sixteen cardiologists independently graded the first 20 cases as severe or nonsevere TR. After group review, a grading algorithm to differentiate severe and nonsevere TR was devised by consensus. The same observers used the algorithm to grade the second set of cases.
Baseline differentiation of severe from nonsevere TR showed modest reliability and accuracy compared with an expert read (multirater κ = 0.55; overall agreement, 78%; accuracy, 81%). The consensus algorithm for severe TR was a suggestive color jet and at least one of (1) right atrial area > 18 cm² and inferior vena cava diameter > 2.5 cm; (2) vena contracta width > 0.7 cm and jet area > 10 cm²; (3) a dense, triangular TR Doppler profile; and (4) holosystolic reversal of hepatic vein flow. Application of this algorithm improved the multirater κ coefficient to 0.80, the level of agreement to 90% (P = .033), and mean reader accuracy to 92% (P = .001).
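The decision rule described above can be sketched in code to make its logic explicit. This is a minimal illustration, not the authors' implementation: the class name `TRFindings`, the function `is_severe_tr`, and all field names are hypothetical, and treating "suggestive color jet" and the two qualitative Doppler findings as reader-supplied booleans is an assumption. Only the numeric thresholds are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass
class TRFindings:
    """Echo findings for one patient (hypothetical field names)."""
    suggestive_color_jet: bool           # reader judgment, assumed boolean
    ra_area_cm2: float                   # right atrial area, cm^2
    ivc_diameter_cm: float               # inferior vena cava diameter, cm
    vena_contracta_cm: float             # vena contracta width, cm
    jet_area_cm2: float                  # color jet area, cm^2
    dense_triangular_doppler: bool       # dense, triangular TR Doppler profile
    holosystolic_hepatic_reversal: bool  # holosystolic hepatic vein flow reversal


def is_severe_tr(f: TRFindings) -> bool:
    """Return True if the findings meet the consensus rule for severe TR."""
    # A suggestive color jet is required before any supporting criterion counts.
    if not f.suggestive_color_jet:
        return False
    supporting = (
        f.ra_area_cm2 > 18 and f.ivc_diameter_cm > 2.5,     # criterion 1
        f.vena_contracta_cm > 0.7 and f.jet_area_cm2 > 10,  # criterion 2
        f.dense_triangular_doppler,                         # criterion 3
        f.holosystolic_hepatic_reversal,                    # criterion 4
    )
    # At least one supporting criterion must be met.
    return any(supporting)
```

Note that criteria 1 and 2 are each conjunctions: an enlarged right atrium alone, or a wide vena contracta alone, does not satisfy the rule in this sketch.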
Only modest baseline agreement was found between readers on the distinction of severe and nonsevere TR. An objective, structured grading algorithm improved both interrater agreement and accuracy.