Statistical methods for assessing measurement error (reliability) in variables relevant to sports medicine.

Author information

Atkinson G, Nevill A M

Affiliation

Research Institute for Sport and Exercise Sciences, Liverpool John Moores University, England.

Publication information

Sports Med. 1998 Oct;26(4):217-38. doi: 10.2165/00007256-199826040-00002.

DOI: 10.2165/00007256-199826040-00002

PMID: 9820922

Abstract

Minimal measurement error (reliability) during the collection of interval- and ratio-type data is critically important to sports medicine research. The main components of measurement error are systematic bias (e.g. general learning or fatigue effects on the tests) and random error due to biological or mechanical variation. Both error components should be meaningfully quantified for the sports physician to relate the described error to judgements regarding 'analytical goals' (the requirements of the measurement tool for effective practical use) rather than the statistical significance of any reliability indicators. Methods based on correlation coefficients and regression provide an indication of 'relative reliability'. Since these methods are highly influenced by the range of measured values, researchers should be cautious in: (i) concluding acceptable relative reliability even if a correlation is above 0.9; (ii) extrapolating the results of a test-retest correlation to a new sample of individuals involved in an experiment; and (iii) comparing test-retest correlations between different reliability studies. Methods used to describe 'absolute reliability' include the standard error of measurements (SEM), coefficient of variation (CV) and limits of agreement (LOA). These statistics are more appropriate for comparing reliability between different measurement tools in different studies. They can be used in multiple retest studies from ANOVA procedures, help predict the magnitude of a 'real' change in individual athletes and be employed to estimate statistical power for a repeated-measures experiment. These methods vary considerably in the way they are calculated and their use also assumes the presence (CV) or absence (SEM) of heteroscedasticity. Most methods of calculating SEM and CV represent approximately 68% of the error that is actually present in the repeated measurements for the 'average' individual in the sample. 
LOA represent the test-retest differences for 95% of a population. The associated Bland-Altman plot shows the measurement error schematically and helps to identify the presence of heteroscedasticity. If there is evidence of heteroscedasticity or non-normality, one should logarithmically transform the data and quote the bias and random error as ratios. This allows simple comparisons of reliability across different measurement tools. It is recommended that sports clinicians and researchers should cite and interpret a number of statistical methods for assessing reliability. We encourage the inclusion of the LOA method, especially the exploration of heteroscedasticity that is inherent in this analysis. We also stress the importance of relating the results of any reliability statistic to 'analytical goals' in sports medicine.
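
The absolute-reliability statistics named in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a simple two-trial test-retest design, uses one common estimate of the SEM (the standard deviation of the differences divided by the square root of 2), and expresses the CV as that SEM relative to the grand mean, which is only one of the several calculation methods the abstract says exist. The function names and the sample scores are hypothetical.

```python
import math
import statistics


def reliability_summary(test, retest):
    """Absolute-reliability statistics for paired test-retest scores.

    Assumes homoscedastic, roughly normal differences.
    """
    diffs = [b - a for a, b in zip(test, retest)]
    bias = statistics.mean(diffs)        # systematic bias (retest - test)
    sd_diff = statistics.stdev(diffs)    # random error: SD of the differences
    # 95% limits of agreement: bias +/- 1.96 * SD of the differences
    loa = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)
    # Assumed SEM estimate for two trials: SD of differences / sqrt(2)
    sem = sd_diff / math.sqrt(2)
    # Assumed CV definition: SEM as a percentage of the grand mean
    grand_mean = statistics.mean(list(test) + list(retest))
    cv_percent = 100 * sem / grand_mean
    return {"bias": bias, "sd_diff": sd_diff, "loa": loa,
            "sem": sem, "cv_percent": cv_percent}


def ratio_limits_of_agreement(test, retest):
    """Bias and random error as ratios, from log-transformed data.

    Intended for heteroscedastic or non-normal data; scores must be positive.
    """
    log_diffs = [math.log(b) - math.log(a) for a, b in zip(test, retest)]
    ratio_bias = math.exp(statistics.mean(log_diffs))
    # Multiply and divide the ratio bias by this factor to get the 95% limits
    ratio_random = math.exp(1.96 * statistics.stdev(log_diffs))
    return ratio_bias, ratio_random


# Hypothetical test-retest scores for five athletes
test_scores = [50, 52, 48, 60, 55]
retest_scores = [51, 53, 47, 62, 56]

print(reliability_summary(test_scores, retest_scores))
print(ratio_limits_of_agreement(test_scores, retest_scores))
```

The ratio version returns a bias ratio and a multiplicative factor. The 95% limits are the bias ratio multiplied by and divided by that factor, which is what allows comparison across tools measured in different units.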
