Manyara Anthony Muchai, Purvis Anthony, Ciani Oriana, Collins Gary S, Taylor Rod S
School of Health and Wellbeing, University of Glasgow, Glasgow, UK; Global Health and Ageing Research Unit, Bristol Medical School, University of Bristol, Bristol, UK.
School of Health and Wellbeing, University of Glasgow, Glasgow, UK.
J Clin Epidemiol. 2024 Oct;174:111485. doi: 10.1016/j.jclinepi.2024.111485. Epub 2024 Jul 26.
BACKGROUND AND OBJECTIVE: The minimum sample size for multistakeholder Delphi surveys remains understudied. Drawing on three large international multistakeholder Delphi surveys, this study aimed to: 1) investigate the effect of increasing sample size on the replicability of results; and 2) assess whether the level of replicability differed by participant characteristics (for example, gender, age, and profession).
METHODS: We used data from Delphi surveys conducted to develop guidance for improved reporting of health-care intervention trials: the SPIRIT (Standard Protocol Items: Recommendations for Interventional Trials) and CONSORT (Consolidated Standards of Reporting Trials) extensions for surrogate end points (n = 175, 22 items rated); CONSORT-SPI (CONSORT extension for Social and Psychological Interventions; n = 333, 77 items rated); and a core outcome set for burn care (n = 553, 88 items rated). Resampling with replacement was used to draw random subsamples from the participant data set in each of the three surveys. For each subsample, the median rating of every survey item was calculated and compared with the corresponding median from the full participant data set. The median number (and interquartile range) of item medians replicated was used to calculate the percentage replicability (and variability). High replicability was defined as ≥80% and moderate replicability as 60% to <80%.
RESULTS: Averaged across the three datasets, the median replicability (variability), expressed as a percentage of the total number of items rated, was 81% (10%) at a sample size of 60. In one of the datasets (CONSORT-SPI), ≥80% replicability was reached at a sample size of 80. On average, increasing the sample size from 80 to 160 increased the replicability of results by a further 3% and reduced variability by 1%. In subgroup analyses based on participant characteristics (eg, gender, age, professional role), resampling at sample sizes of 20 to 100 showed that a sample size of 20 to 30 resulted in moderate replicability of 64% to 77%.
CONCLUSION: We found that a minimum sample size of 60 to 80 participants in multistakeholder Delphi surveys provides a high level of replicability (≥80%) in the results. For Delphi studies limited to individual stakeholder groups (such as researchers, clinicians, or patients), a sample size of 20 to 30 per group may be sufficient.
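The resampling procedure described in the METHODS can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 1 to 9 rating scale, the number of resamples, and the rule that a median counts as "replicated" only when it exactly equals the full-data median are all assumptions made for illustration, and the data are synthetic.

```python
import random
import statistics


def item_medians(ratings):
    """Median rating per item; `ratings` is a list of per-participant score lists."""
    return [statistics.median(column) for column in zip(*ratings)]


def replicability(ratings, sample_size, n_resamples=1000, seed=0):
    """Return (replicability %, variability %) for a given subsample size.

    Replicability: median number of item medians matching the full-data
    medians, as a percentage of all items. Variability: interquartile range
    of that count, on the same percentage scale. Exact equality is an
    assumed matching rule, not one confirmed by the abstract.
    """
    rng = random.Random(seed)
    full = item_medians(ratings)
    n_items = len(full)
    matches = []
    for _ in range(n_resamples):
        # Draw participants with replacement, as in the abstract.
        subsample = rng.choices(ratings, k=sample_size)
        sub = item_medians(subsample)
        matches.append(sum(s == f for s, f in zip(sub, full)))
    q1, _, q3 = statistics.quantiles(matches, n=4)
    median_matches = statistics.median(matches)
    return 100 * median_matches / n_items, 100 * (q3 - q1) / n_items


if __name__ == "__main__":
    # Synthetic survey: 175 participants rating 22 items on an assumed 1-9 scale.
    data_rng = random.Random(42)
    survey = [[data_rng.randint(1, 9) for _ in range(22)] for _ in range(175)]
    for size in (20, 60, 80, 160):
        pct, iqr = replicability(survey, size, n_resamples=200)
        print(f"n={size}: replicability {pct:.0f}% (variability {iqr:.0f}%)")
```

Because the synthetic ratings are uniform random, the percentages printed will not resemble those reported in the RESULTS; the sketch only shows the mechanics of comparing subsample medians with full-sample medians.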