Department of Medical Psychology, Medical Sociology, and Rehabilitation Sciences, University of Wuerzburg, Klinikstr. 3, 97070, Wuerzburg, Germany.
Department of Orthopaedics, University Medical Center Hamburg-Eppendorf, Martinistr. 52, 20246, Hamburg, Germany.
BMC Health Serv Res. 2019 Aug 9;19(1):556. doi: 10.1186/s12913-019-4387-4.
Employees insured in pension insurance, who are incapable of working due to ill health, are entitled to a disability pension. To assess whether an individual meets the medical requirements to be considered as disabled, a work capacity evaluation is conducted. However, there are no official guidelines on how to perform an external quality assurance for this evaluation process. Furthermore, the quality of medical reports in the field of insurance medicine can vary substantially, and systematic evaluations are scarce. Reliability studies using peer review have repeatedly shown insufficient ability to distinguish between high, moderate and low quality. Considering literature recommendations, we developed an instrument to examine the quality of medical experts' reports.
The peer review manual developed contains six quality domains (formal structure, clarity, transparency, completeness, medical-scientific principles, and efficiency) comprising 22 items. In addition, a superordinate criterion (survey confirmability) ranks the overall quality and usefulness of a report; this criterion evaluates problems of internal logic and reasoning. Development of the manual was assisted by experienced physicians in a pre-test. We examined the observable variance in peer judgements and reliability as the most important outcome criteria. To evaluate inter-rater reliability, 20 anonymised experts' reports detailing the work capacity evaluation were reviewed by 19 trained raters (peers). Percentage agreement and Kendall's W, a reliability measure of concordance between two or more peers, were calculated. A total of 325 reviews were conducted.
Agreement of peer judgements with respect to the superordinate criterion ranged from 29.2 to 87.5%. Kendall's W for the quality domain items varied greatly, ranging from 0.09 to 0.88. With respect to the superordinate criterion, Kendall's W was 0.39, which indicates fair agreement. The percentage agreement results revealed systematic peer preferences for certain deficit scale categories.
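The concordance statistic reported above can be reproduced from a rater-by-report matrix. The following is a minimal sketch of Kendall's W (without the tie correction), not the authors' actual analysis code; the example data are hypothetical.

```python
import numpy as np
from scipy.stats import rankdata

def kendalls_w(ratings):
    """Kendall's coefficient of concordance W (no tie correction).

    ratings: (m raters) x (n items) array of scores; each rater's
    scores are converted to ranks across the n items.
    """
    ratings = np.asarray(ratings, dtype=float)
    m, n = ratings.shape
    # Rank each rater's scores across the items (ties get average ranks)
    ranks = np.apply_along_axis(rankdata, 1, ratings)
    rank_sums = ranks.sum(axis=0)
    # S: sum of squared deviations of the column rank sums from their mean
    s = ((rank_sums - rank_sums.mean()) ** 2).sum()
    return 12.0 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical example: 3 peers rating 4 reports
identical = np.array([[1, 2, 3, 4],
                      [1, 2, 3, 4],
                      [1, 2, 3, 4]])
print(kendalls_w(identical))  # perfect agreement: 1.0
```

W ranges from 0 (no agreement) to 1 (perfect agreement); values around 0.4, as found for the superordinate criterion, are conventionally read as fair agreement.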
The superordinate criterion was not sufficiently reliable, although its reliability was equivalent to values reported in comparable studies. This report aims to encourage further efforts to improve evaluation instruments. To reduce disagreement between peer judgements, we propose revising the peer review instrument and developing and implementing standardised rater training to improve reliability.