Sjoding Michael W, Cooke Colin R, Iwashyna Theodore J, Hofer Timothy P
1 Department of Internal Medicine and 2 Institute for Healthcare Policy & Innovation, University of Michigan, Ann Arbor, Michigan.
Ann Am Thorac Soc. 2016 Jul;13(7):1123-8. doi: 10.1513/AnnalsATS.201601-072OC.
Identifying patients with acute respiratory distress syndrome (ARDS) is a recognized challenge. Experts often reach only moderate agreement when applying the clinical definition of ARDS to patients. However, no study has fully examined the implications of low-reliability measurement of ARDS for clinical studies.
We sought to investigate how the degree of variability in ARDS measurement commonly reported in clinical studies affects study power, the accuracy of treatment effect estimates, and the measured strength of risk factor associations.
We examined the effect of ARDS measurement error in randomized clinical trials (RCTs) of ARDS-specific treatments and cohort studies using simulations. We varied the reliability of ARDS diagnosis, quantified as the interobserver reliability (κ-statistic) between two reviewers. In RCT simulations, patients identified as having ARDS were enrolled, and when measurement error was present, patients without ARDS could be enrolled. In cohort studies, risk factors as potential predictors were analyzed using reviewer-identified ARDS as the outcome variable.
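The link between reviewer accuracy and the κ-statistic can be illustrated with a short calculation. The sketch below is not the authors' simulation code. It assumes two reviewers who share the same sensitivity and specificity and who err independently given a patient's true ARDS status. The function names and the example input values are hypothetical.

```python
def kappa_two_reviewers(prevalence, sensitivity, specificity):
    """Expected Cohen's kappa between two reviewers.

    Assumption (not from the source): both reviewers have the same
    sensitivity and specificity, and their errors are independent
    conditional on the patient's true ARDS status.
    """
    p, se, sp = prevalence, sensitivity, specificity
    # Probability that both reviewers label a patient as ARDS / as not ARDS.
    both_pos = p * se**2 + (1 - p) * (1 - sp) ** 2
    both_neg = p * (1 - se) ** 2 + (1 - p) * sp**2
    observed_agreement = both_pos + both_neg
    # Each reviewer's marginal probability of labelling a patient as ARDS.
    marginal_pos = p * se + (1 - p) * (1 - sp)
    chance_agreement = marginal_pos**2 + (1 - marginal_pos) ** 2
    if chance_agreement == 1.0:
        raise ValueError("kappa is undefined when reviewers always give the same label")
    return (observed_agreement - chance_agreement) / (1 - chance_agreement)


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Fraction of reviewer-identified ARDS patients who truly have ARDS."""
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    for se, sp in [(1.0, 1.0), (0.9, 0.9), (0.8, 0.8)]:
        k = kappa_two_reviewers(0.3, se, sp)
        ppv = positive_predictive_value(0.3, se, sp)
        print(f"se={se:.2f} sp={sp:.2f} kappa={k:.2f} ppv={ppv:.2f}")
```

Under these assumptions, κ falls as reviewer accuracy falls, and so does the fraction of enrolled patients who truly have ARDS, which is the quantity that drives the trial results.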
Lower reliability measurement of ARDS during patient enrollment in RCTs seriously degraded study power. Holding effect size constant, the sample size necessary to attain adequate statistical power increased by more than 50% as reliability declined, although the result was sensitive to ARDS prevalence. In a 1,400-patient clinical trial, the sample size necessary to maintain similar statistical power increased to over 1,900 when reliability declined from perfect to substantial (κ = 0.72). Lower reliability measurement diminished the apparent effectiveness of an ARDS-specific treatment from a 15.2% (95% confidence interval, 9.4-20.9%) absolute risk reduction in mortality to 10.9% (95% confidence interval, 4.7-16.2%) when reliability declined to moderate (κ = 0.51). In cohort studies, the effect on risk factor associations was similar.
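A simplified model shows why misclassified enrollment shrinks the measured effect and inflates the required sample size. The sketch below is an illustration under stated assumptions, not the simulation used in the study. It assumes the treatment benefits only patients who truly have ARDS, so the observed absolute risk reduction is the true one scaled by the fraction of enrolled patients with true ARDS (`ppv`). The sample size function uses the standard normal-approximation formula for comparing two proportions. All function names and the mortality values in the usage example are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def observed_arr(true_arr, ppv):
    """Absolute risk reduction seen in the enrolled population.

    Assumption: the treatment has no effect in enrolled patients
    who do not actually have ARDS.
    """
    return ppv * true_arr


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Patients per arm for a two-sided test of two proportions
    (normal approximation, equal allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treated) ** 2)


if __name__ == "__main__":
    true_arr = 0.152          # effect size quoted in the abstract
    control_mortality = 0.40  # hypothetical value, not from the source
    for ppv in (1.0, 0.9, 0.8, 0.7):
        arr = observed_arr(true_arr, ppv)
        # Simplification: misenrolled patients are given the same control
        # mortality as true ARDS patients.
        n = n_per_arm(control_mortality, control_mortality - arr)
        print(f"ppv={ppv:.1f} observed ARR={arr:.3f} n per arm={n}")
```

Read through this simple model, the reported drop from 15.2% to 10.9% would correspond to roughly 72% of enrolled patients truly having ARDS (10.9 / 15.2). This is a back-of-the-envelope reading under the assumptions above, not a figure reported by the study.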
ARDS measurement error can seriously degrade statistical power and effect size estimates of clinical studies. The reliability of ARDS measurement warrants careful attention in future ARDS clinical studies.