Leslie Paula, Drinnan Michael J, Finn Paul, Ford Gary A, Wilson Janet A
School of Surgical and Reproductive Sciences, The Medical School, University of Newcastle, Newcastle-upon-Tyne, United Kingdom.
Dysphagia. 2004 Fall;19(4):231-40.
Cervical auscultation is experiencing a renaissance as an adjunct to the clinical swallowing assessment. It is a controversial technique with a small evidence base. We aimed to establish whether cervical auscultation interpretation is based on the actual sounds heard or, in practice, influenced by information gleaned from other aspects of the clinical assessment, medical notes, or previous knowledge. We sought to determine (a) rater reliability and its impact on the clinical value of cervical auscultation and (b) how judgments compare with the "gold standard": videofluoroscopy. Swallow sounds were recorded by computer via a Littmann stethoscope. Sounds were sampled from 10 healthy control swallows with no aspiration/penetration and 10 patient swallows with aspiration/penetration, all recorded during simultaneous videofluoroscopy. The system generated sound quality similar to "live" bedside listening, a feature rarely seen in cervical auscultation studies. The 20 sound clips were classified as "normal" or "abnormal" by 19 volunteer speech-language pathologists with experience in cervical auscultation. After at least four weeks, 11 of these judges rated the sounds again, rerandomized on a new CD. Intrarater reliability (kappa) ranged from -0.12 to 0.71. Individual reliability did not correlate with years of experience, practice pattern, or frequency of use. Interrater reliability was kappa = 0.17. Comparison with radiologically defined aspiration/penetration yielded 66% specificity and 62% sensitivity; majority consensus gave 90% specificity and 80% sensitivity. There was a significant relationship between individual reliability and true positive rate (r(s) = 0.623, p = 0.040). The reliability of individual judges varied widely and thus, inevitably, agreement between judges was poor. Validity is dependent upon reliability: improving the poor raters would improve the overall accuracy of this technique in predicting abnormality in swallowing. The group consensus correctly identified 17 of the 20 clips, so we may speculate that the swallow sound contains audible cues that should in principle permit reliable classification.
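
The consensus figures reported above can be checked with simple arithmetic. The following Python sketch is only an illustration: the function names and the ordering of the clips are assumptions, and the abstract does not say which kappa variant was used (Cohen's kappa, shown here, covers the two-rating case, such as one judge's first and second session). With 10 abnormal and 10 normal clips, 80% sensitivity and 90% specificity correspond to 8 + 9 = 17 correct clips out of 20.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(truth: Sequence[bool],
                            rated: Sequence[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity); True means 'abnormal'."""
    tp = sum(t and r for t, r in zip(truth, rated))
    fn = sum(t and not r for t, r in zip(truth, rated))
    tn = sum((not t) and (not r) for t, r in zip(truth, rated))
    fp = sum((not t) and r for t, r in zip(truth, rated))
    return tp / (tp + fn), tn / (tn + fp)


def cohen_kappa(first: Sequence[bool], second: Sequence[bool]) -> float:
    """Cohen's kappa for two sets of ratings of the same items."""
    a, b = list(first), list(second)
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance, from each rating's marginal proportions.
    expected = sum((a.count(label) / n) * (b.count(label) / n)
                   for label in set(a) | set(b))
    if expected == 1.0:  # both ratings use one identical label throughout
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Videofluoroscopy reference: 10 abnormal clips followed by 10 normal clips.
# The clip order is an assumption made purely for illustration.
truth = [True] * 10 + [False] * 10
# A consensus pattern consistent with the reported 80% / 90% figures:
# 8 of 10 abnormal clips detected, 9 of 10 normal clips correctly passed.
consensus = [True] * 8 + [False] * 2 + [False] * 9 + [True] * 1

sens, spec = sensitivity_specificity(truth, consensus)
n_correct = sum(t == c for t, c in zip(truth, consensus))
print(f"sensitivity={sens:.2f} specificity={spec:.2f} correct={n_correct}/20")
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, which is why the reported intrarater range of -0.12 to 0.71 indicates that some judges were no more consistent with themselves than chance would predict.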