Baylor University - Keller Army Community Hospital Division 1 Sports Physical Therapy Fellowship, Keller Army Community Hospital, West Point, NY 10996, USA.
U.S. Army Medical Center of Excellence, Army-Baylor University Doctoral Program in Physical Therapy, Fort Sam Houston, TX 78234, USA.
Mil Med. 2023 Aug 29;188(9-10):3079-3085. doi: 10.1093/milmed/usac099.
The U.S. Army is updating the physical fitness assessment for soldiers to the six-event Army Combat Fitness Test (ACFT). A paucity of data regarding the ACFT maximum deadlift (MDL) event, especially in military populations, has increased concern over the objectivity of the test. The reliability of scoring the MDL has not been established. It is unknown if grader professional experience impacts the reliability of scoring, and if so, what level of experience is required for reliable assessment. Performance and assessment of the MDL could impact military occupational selection, promotion, and retention within the Army. The purposes of this study were to determine the inter- and intra-rater reliabilities of raters with varying degrees of professional experience on scoring the MDL and to determine the relationships between load lifted, overall lift success, sex, and body mass index (BMI).
The design is a reliability study. Approval was granted by the Naval Medical Center-Portsmouth Institutional Review Board. Fifty-five healthy soldiers and cadets from the U.S. Military Academy were recruited. Participants completed one data collection session, performing one MDL attempt. The attempt was video recorded using three devices: two handheld tablets placed perpendicular to the sagittal and frontal planes recording at 240 Hz and one digital camera positioned at a 45° angle recording at 30 Hz. A reference standard was established through slow-motion analysis of the sagittal and frontal plane recordings. Six raters with varying degrees of professional experience viewed the 45° camera recordings at real-time speed independently, in a random order, on two separate occasions. Lift success was dichotomously assessed as successful or unsuccessful according to the MDL standards. Cohen's kappa was computed to determine inter- and intra-rater reliabilities among raters. Bivariate correlation was used to assess associations among load lifted, BMI, and sex. A chi-squared test of independence assessed the relationship between sex and overall lift success.
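As an illustration of the agreement statistic named above, the following is a minimal sketch of Cohen's kappa for two raters scoring the same lifts as successful (1) or unsuccessful (0). The function name `cohens_kappa` and the ten example ratings are hypothetical and are not taken from the study; the authors' actual software and data are not described in the abstract.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance from each
    rater's marginal label frequencies.
    """
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: share of lifts both raters scored identically.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over labels of the product of marginal rates.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)

    if p_e == 1.0:
        # Both raters used a single identical label; kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical scores for ten lifts (1 = successful, 0 = unsuccessful).
rater_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
rater_b = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
kappa = cohens_kappa(rater_a, rater_b)
print(round(kappa, 3))
```

In this made-up example the raters agree on 8 of 10 lifts (p_o = 0.80) and chance agreement is 0.52, so kappa is about 0.58. Under this sketch, inter-rater reliability would compare two different raters on the same viewing occasion, while intra-rater reliability would pass one rater's first and second viewing sessions as the two sequences.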
Inter-rater reliability (Cohen's kappa) among the six raters ranged from 0.29 to 0.69. Agreement between individual raters and the reference standard ranged from 0.28 to 0.61. Intra-rater reliability ranged from 0.51 to 0.84. Among raters who had attended a Training and Doctrine Command-approved ACFT certification course, inter-rater reliability ranged from 0.51 to 0.66; among those who had not, it ranged from 0.34 to 0.46. BMI and sex were each associated with load lifted (r = 0.405, P = .002 and r = -0.727, P < .001, respectively). Overall lift success was not associated with load lifted (r = -0.047, P = .731) or with sex (χ² = 0.271, P = .602).
Inter-rater reliability of the six raters ranged from poor to substantial, while intra-rater reliability ranged from moderate to excellent. Compared to a reference standard, inter-rater reliability ranged from poor to substantial. The wide range in consistency demonstrated in this study, both between and within raters, brings into question the current subjective methods used to grade the MDL. More research is needed to understand the most feasible, valid, and reliable way to assess performance standards like the MDL that may affect a soldier's career progression.