Mayer T G, Kondraske G, Beals S B, Gatchel R J
Department of Orthopedic Surgery, University of Texas Southwestern Medical Center, Dallas, USA.
Spine (Phila Pa 1976). 1997 Sep 1;22(17):1976-84. doi: 10.1097/00007632-199709010-00006.
A quantitative construct based on accuracy assessment and component analysis of sources of error, rather than on reliability coefficients, was tested prospectively in human performance measurements of lumbar spine motion in a cohort of healthy individuals.
To evaluate the accuracy of lumbar spine sagittal motion measurements using a computerized inclinometer, which involved progressive analysis of sources of error to identify the most problematic sub-components of the measurement process and device.
Many previous studies have described the reliability of inclinometric lumbar motion measurement techniques, but with inconsistent analysis of the sources of error that would explain the identified variability. Similar deficiencies exist in identifying sources of error leading to variability in other human performance measurements (e.g., strength, endurance, lifting capacity). Yet range of motion has important clinical applications in monitoring progress and assessing temporary and/or permanent impairment. This makes it especially important for clinicians to be able to recognize and correct factors that limit measurement accuracy and thereby affect clinical utility.
A computerized inclinometer was used to measure the sagittal lumbar mobility of 38 healthy individuals after the device itself had been bench tested for device error. The human performance test conditions were: 1) initial test on study participants by untrained test administrators with no control of human performance or procedural variables, 2) identical tests by procedurally trained test administrators controlling human performance variability by monitoring and controlling total motion, and 3) identical tests by procedurally trained test administrators without controlling for human performance variability.
The accuracy of the methodology was progressively degraded by the various sources of error. Device error was negligible relative to error associated with the test process itself. Lack of test administrator training and the magnitude of the measured quantity were the major factors in test degradation. Combined (gross) lumbar flexion was the most accurate measure (worst case > 95% accuracy for overall test conditions), whereas pelvic extension was the least accurate (worst case > 36%).
Clinical utility of lumbar spine sagittal motion measurement is highly sensitive to test administrator training, which is needed to overcome pitfalls in measurement accuracy (bony landmarks, "rocking" of the inclinometer on the sacrum, etc.). Magnitude of the measurement is another important accuracy factor because absolute error tends to remain relatively constant. Device accuracy is usually an insignificant component of overall test accuracy. Analysis of human performance measurements, such as spinal range of motion, may be facilitated by physics-based assessment of accuracy and procedural error, which provides a more sophisticated analysis than is customarily accessible through reliability coefficients. Previous studies often failed to recognize correctable procedural errors, rarely addressed them, and almost never quantified them.
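The effect of measurement magnitude on accuracy can be illustrated with a minimal sketch. The abstract does not state how percent accuracy was computed, so the formula below, 100 × (1 − |error| / |magnitude|), is an assumption made only for illustration. The function name `percent_accuracy` and the degree values are hypothetical and are not taken from the study.

```python
def percent_accuracy(magnitude_deg: float, abs_error_deg: float) -> float:
    """Percent accuracy under an ASSUMED definition (not from the study):
    100 * (1 - |error| / |magnitude|), floored at 0."""
    if magnitude_deg == 0:
        raise ValueError("magnitude must be nonzero")
    return max(0.0, 100.0 * (1.0 - abs(abs_error_deg) / abs(magnitude_deg)))


# Hypothetical values chosen only to show the trend: the same absolute
# error applied to a large motion and to a small motion.
ABS_ERROR_DEG = 3.0
for label, magnitude in [("large motion", 60.0), ("small motion", 10.0)]:
    accuracy = percent_accuracy(magnitude, ABS_ERROR_DEG)
    print(f"{label} ({magnitude:.0f} deg): {accuracy:.1f}% accurate")
```

Under this assumed definition, a fixed absolute error costs a small motion far more percent accuracy than a large one, which is consistent with the contrast reported between combined lumbar flexion and pelvic extension.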