Granström Hannah, Äng Björn O, Rasmussen-Barr Eva
a Department of Neurobiology, Care Sciences and Society, Division of Physiotherapy, Karolinska Institutet, Stockholm, Sweden.
b Center for Clinical Research Dalarna, Falun, Sweden.
Physiother Theory Pract. 2017 May;33(5):386-397. doi: 10.1080/09593985.2017.1318422.
This study evaluated the inter- and intra-observer reliability and the discriminative validity of three movement control tests: 1) standing knee-lift test; 2) static lunge test; and 3) dynamic lunge test. Thirty-eight subjects, 21 with low-back pain and 17 healthy, were video-recorded while performing the tests. Four physical therapists scored the tests according to a standardized protocol and calculated a composite score for each test based on the number of incorrect test components. Inter-observer reliability for the composite scores ranged between 0.68 and 0.80 (ICC 2,k) and intra-observer reliability between 0.54 and 0.82 (ICC 2,1). For the separate test components, inter-observer reliability ranged between 0.32 and 0.91 (κ) and intra-observer reliability between 0.42 and 1.00. The test components showing the highest values were back extension, arm lowering, and shoulders moving backwards. The components hip hitch, trunk lateral flexion, knee not lifted straight up, and hips moving backwards did not reach accepted thresholds. Discriminative validity ranged between 0.47 and 0.56 (AUC). As our results showed good overall agreement for the composite scores and for the majority of the included test components, the tests can be considered reliable enough. However, as the tests' discriminative ability was close to "none", they should not be used for diagnostic purposes but should be further evaluated for predictive validity.
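As a rough illustration of the quantities reported above, the following Python sketch computes a composite score as a count of incorrect components, Cohen's κ for two observers rating one component, and the AUC as the probability that a subject with low-back pain scores higher than a healthy subject. All ratings and scores in it are invented for illustration and are not data from the study. The function names are hypothetical, and the study's own statistical procedure (including the ICC models) is not reproduced here.

```python
from itertools import product


def composite_score(components):
    """Count the test components scored as incorrect (True = incorrect)."""
    return sum(1 for incorrect in components if incorrect)


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two observers on one component."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1 - expected)


def auc(scores_pos, scores_neg):
    """Probability that a positive case outscores a negative one (ties count 0.5)."""
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p, q in product(scores_pos, scores_neg)
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical example data, NOT taken from the study.
observer_1 = [1, 1, 0, 0, 1, 0, 1, 0]  # 1 = component judged incorrect
observer_2 = [1, 0, 0, 0, 1, 0, 1, 1]
lbp_scores = [2, 3, 1, 4]              # composite scores, low-back pain group
healthy_scores = [1, 2, 3, 0]          # composite scores, healthy group

print(composite_score([True, False, True, False, False]))
print(cohen_kappa(observer_1, observer_2))
print(auc(lbp_scores, healthy_scores))
```

An AUC of 0.5 means the composite score separates the two groups no better than chance, which is why the reported range of 0.47 to 0.56 is read as a discriminative ability close to "none".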