Division of Hematology and Oncology, Medical College of Wisconsin, Milwaukee, Wisconsin 53226.
Biol Blood Marrow Transplant. 2012 Nov;18(11):1649-55. doi: 10.1016/j.bbmt.2012.05.005. Epub 2012 Jun 9.
In 2005, a National Institutes of Health consensus conference was held to refine methods for research in patients with chronic graft-versus-host disease, including proposed objective response measures and a provisional algorithm for calculating organ-specific and overall response. In this study, we used weighted kappa statistics to evaluate the level of agreement between clinician response ratings and calculated response categories in patients with chronic graft-versus-host disease. The study included 290 patients who had paired enrollment and follow-up visits. Based on a set of objective measures, 37% of the patients had an overall complete or partial response, whereas clinicians reported an overall complete or partial response rate of 71% (slight to fair agreement, weighted kappa 0.20). Agreement rates between calculated organ-specific responses and clinician-reported changes in skin, mouth, and eyes were fair to moderate (weighted kappa, 0.28-0.54). We conclude that for both overall and organ-specific comparisons, clinician response ratings did not agree well with calculated response categories. Possible reasons for this discrepancy include a high clinical sensitivity for detecting response, a clinical predisposition to recognize selective improvements as overall response, the large change in objective measures proposed to define response, and the high incidence of progressive disease based on new manifestations. Conclusions from prior literature reporting high overall response rates based on clinician judgment would not be supported if the provisional algorithm had been applied to calculate response. Our analysis also highlights the need to define an overall response measure that incorporates both patient-reported and objective measures and accurately reflects the outcome in patients with a mixed response in which one organ or site improves, whereas another shows new involvement.
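As an illustration of the agreement statistic used above, the following is a minimal sketch of a weighted kappa computed from a square table that cross-classifies two ratings on the same ordinal scale (for example, clinician-reported versus calculated response). The function name, the choice between linear and quadratic weights, and the toy counts are assumptions made for illustration; the abstract does not state which weighting scheme was used, and the counts are not study data.

```python
def weighted_kappa(table, weights="linear"):
    """Weighted kappa for a k x k table of counts.

    table[i][j] = number of patients placed in ordinal category i by
    rater A and in category j by rater B. The category order is an
    assumption of this sketch (e.g. progression < stable < response).
    """
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least 2 categories")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 on the diagonal, 1 for the most distant pair of categories.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    # Observed weighted disagreement.
    observed = sum(
        disagreement(i, j) * table[i][j] for i in range(k) for j in range(k)
    ) / n
    # Weighted disagreement expected by chance from the marginal totals.
    expected = sum(
        disagreement(i, j) * row_tot[i] * col_tot[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)

    return 1.0 - observed / expected


if __name__ == "__main__":
    # Hypothetical counts only; rows = clinician rating, columns = calculated.
    toy = [
        [12, 6, 2],
        [8, 20, 10],
        [5, 15, 22],
    ]
    print(round(weighted_kappa(toy), 3))
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance, which is the scale on which the reported values of 0.20 and 0.28-0.54 are interpreted.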