Department of Pediatrics, Columbia University College of Physicians and Surgeons, New York, New York, USA.
Pediatrics. 2012 Apr;129(4):695-700. doi: 10.1542/peds.2011-2037. Epub 2012 Mar 5.
Our objective was to determine the interrater reliability of clinical history and physical examination findings in children undergoing evaluation for possible appendicitis in a large, multicenter cohort.
We conducted a prospective, multicenter, cross-sectional study of children aged 3-18 years with possible appendicitis. Two clinicians independently evaluated patients and completed structured case report forms within 60 minutes of each other and without knowing the results of diagnostic imaging. We calculated raw agreement and assessed reliability by using the unweighted Cohen κ statistic with 2-sided 95% confidence intervals.
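The two agreement measures named above can be illustrated with a minimal Python sketch. It rests on assumptions not stated in the abstract: the function name `cohen_kappa` and the paired ratings are hypothetical, and the confidence interval uses a simple large-sample standard error approximation, which may differ from the method the authors applied.

```python
import math
from collections import Counter


def cohen_kappa(ratings_a, ratings_b, z=1.96):
    """Return (raw agreement, unweighted Cohen kappa, approximate 2-sided CI).

    The CI uses a simple large-sample standard error; this is an
    illustrative assumption, not necessarily the study's method.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)

    # Raw (observed) agreement: share of patients rated identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement expected from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / n**2
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (p_o - p_e) / (1 - p_e)
    se = math.sqrt(p_o * (1 - p_o) / (n * (1 - p_e) ** 2))
    return p_o, kappa, (kappa - z * se, kappa + z * se)


# Hypothetical paired ratings for one yes/no finding in 100 patients:
# 40 both "yes", 10 A-only "yes", 5 B-only "yes", 45 both "no".
rater_a = ["yes"] * 40 + ["yes"] * 10 + ["no"] * 5 + ["no"] * 45
rater_b = ["yes"] * 40 + ["no"] * 10 + ["yes"] * 5 + ["no"] * 45

raw_agreement, kappa, (ci_low, ci_high) = cohen_kappa(rater_a, rater_b)
print(f"raw agreement = {raw_agreement:.2f}")                  # 0.85
print(f"kappa = {kappa:.2f} [{ci_low:.2f}-{ci_high:.2f}]")     # 0.70 [0.56-0.84]
```

In this made-up example the raw agreement (85%) is well above κ (.70) because half of the observed agreement would be expected by chance alone, which is why raw agreement and κ are reported side by side.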
A total of 811 patients had 2 assessments completed, and 599 (74%) had both assessments completed within 60 minutes of each other. Seventy-five percent of paired assessments were completed by pediatric emergency physicians. For history variables, raw agreement ranged from 64.9% to 92.3%, and 4 of 6 variables had at least moderate interrater reliability (κ > .4). The highest κ values were noted for duration of pain (κ = .56 [95% confidence interval .51-.61]) and history of emesis (κ = .84 [.80-.89]). For physical examination variables, raw agreement ranged from 60.9% to 98.7%, with 4 of 8 variables exhibiting at least moderate reliability. Among physical examination variables, the highest κ values were noted for abdominal pain with walking, jumping, or coughing (κ = .54 [.45-.63]) and presence of any abdominal tenderness on examination (κ = .49 [.19-.80]).
Interrater reliability of patient history and physical examination variables was generally fair to moderate. Those variables with higher interrater reliability are more appropriate for inclusion in clinical prediction rules in children with possible appendicitis.