Borders James C, Grande Alessandro A, Barbon Carly E A, Hutcheson Katherine A, Troche Michelle S
Laboratory for the Study of Upper Airway Dysfunction, Department of Biobehavioral Sciences, Teachers College, Columbia University, New York, NY, USA.
Department of Statistics, Columbia University, New York, NY, USA.
Dysphagia. 2025 Apr;40(2):388-398. doi: 10.1007/s00455-024-10738-7. Epub 2024 Aug 17.
Multiple bolus trials are administered during clinical and research swallowing assessments to comprehensively capture an individual's swallowing function. Despite the valuable information obtained from these boluses, it remains common practice to use a single bolus (e.g., the worst score) to describe the degree of dysfunction. Researchers also often collapse continuous or ordinal swallowing measures into categories, potentially exacerbating information loss. These practices may reduce the statistical power to detect and estimate smaller, yet potentially meaningful, treatment effects. This study sought to examine the impact of aggregating and categorizing penetration-aspiration scale (PAS) scores on statistical power and effect size estimates. We used a Monte Carlo approach to simulate three hypothetical within-subject treatment studies in Parkinson's disease (PD) and head and neck cancer (HNC) across a range of data characteristics (e.g., sample size, number of bolus trials, variability). Different statistical models (aggregated or multilevel) and various PAS reduction approaches (i.e., types of categorization) were applied to examine their impact on power and the accuracy of effect size estimates. Across all scenarios, multilevel models demonstrated higher statistical power to detect group-level longitudinal change and more accurate estimates than aggregated (worst score) models. Categorizing PAS scores also reduced power and biased effect size estimates compared to an ordinal approach, though this depended on the type of categorization and the baseline PAS distribution. Multilevel models should be considered a more robust approach for the statistical analysis of multiple boluses administered in standardized swallowing protocols because of their high sensitivity and accuracy in comparing group-level changes in swallowing function. Importantly, this finding appears to be consistent across patient populations with distinct pathophysiology (i.e., PD and HNC) and patterns of airway invasion. The decision to categorize a continuous or ordinal outcome should be grounded in the clinical or research question, with recognition that scale reduction may negatively affect the quality of statistical inferences in certain scenarios.
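
The general idea of a Monte Carlo power comparison can be illustrated with a minimal sketch. It is not the authors' simulation: the cut points, effect size, and variances are hypothetical, the per-subject mean across boluses is used as a simplified stand-in for analyzing all trials (it is not a multilevel ordinal model), and the significance test is a z approximation to a paired t-test.

```python
import math
import random
import statistics

# Hypothetical cut points mapping a latent severity value to PAS 1-8 (assumption).
CUTS = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0, 2.4]


def to_pas(latent):
    """Map a latent severity value to an ordinal PAS score from 1 to 8."""
    return 1 + sum(latent > c for c in CUTS)


def simulate_subject(rng, n_boluses, shift):
    """Return (pre, post) lists of PAS scores for one simulated participant."""
    severity = rng.gauss(0.5, 1.0)  # between-subject variability (assumed)
    pre = [to_pas(severity + rng.gauss(0, 1.0)) for _ in range(n_boluses)]
    post = [to_pas(severity - shift + rng.gauss(0, 1.0)) for _ in range(n_boluses)]
    return pre, post


def significant(diffs, z_crit):
    """Two-sided test that the mean paired difference is zero (z approximation)."""
    sd = statistics.stdev(diffs)
    mean = statistics.fmean(diffs)
    if sd == 0:
        return mean != 0
    return abs(mean / (sd / math.sqrt(len(diffs)))) > z_crit


def estimate_power(n_subjects=30, n_boluses=6, shift=0.3, n_sims=500, seed=1):
    """Estimate power for worst-score versus mean-score summaries of each subject."""
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(0.975)  # two-sided alpha = 0.05
    hits = {"worst": 0, "mean": 0}
    for _ in range(n_sims):
        worst_diffs, mean_diffs = [], []
        for _ in range(n_subjects):
            pre, post = simulate_subject(rng, n_boluses, shift)
            # Aggregated approach: keep only the worst (highest) PAS per time point.
            worst_diffs.append(max(pre) - max(post))
            # Stand-in for using every bolus: average all trials per time point.
            mean_diffs.append(statistics.fmean(pre) - statistics.fmean(post))
        hits["worst"] += significant(worst_diffs, z_crit)
        hits["mean"] += significant(mean_diffs, z_crit)
    return {name: count / n_sims for name, count in hits.items()}


if __name__ == "__main__":
    print(estimate_power())
```

Each returned value is the proportion of simulated studies in which the pre-to-post change was declared significant under that summary. Varying `n_subjects`, `n_boluses`, and `shift` mimics the range of data characteristics described above.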