Cooper Katherine M, Ramella Leah, Boama-Nyarko Esther, Rokicki Slawa, Xu Lulu, Masters Grace A, Byatt Nancy, Mackie Thomas I, Sheldrick R Christopher
University of Massachusetts Chan Medical School, UMass Memorial Health, 222 Maple Ave- Chang Building, Shrewsbury, MA, 01545, USA.
School of Public Health, State University of New York, Downstate Health Sciences University, 450 Clarkson Avenue, Brooklyn, NY, 11203, USA.
Adm Policy Ment Health. 2025 Jul 14. doi: 10.1007/s10488-025-01454-x.
To develop screening guidelines, the Grading of Recommendations, Assessment, Development, and Evaluations (GRADE) Evidence to Decision (EtD) framework recommends careful assessment of both test accuracy and the downstream consequences of screening. To tailor recommendations to a specific context, GRADE EtD recommends ensuring that all assumptions and inputs on which the original recommendations are based are appropriate to the novel setting. Perinatal depression screening offers a notable example where evidence-based screening guidelines are recommended at a national level, yet implementation necessarily occurs in specific contexts. Methods to examine the generalizability of assumptions underlying screening recommendations are needed. The GRADE EtD framework demonstrates how local prevalence can be combined with evidence on screening sensitivity and specificity to estimate the number of true positive, false positive, true negative, and false negative results. In turn, these estimates can be linked to evidence of benefit and harm, such as potential benefits from treatment or stigma from false positive identification. To estimate benefit at a local level, we developed a simulation model that expresses prevalence as a function of sensitivity, specificity, and the proportion of patients who screen positive. We then identified published systematic reviews and meta-analyses of (a) perinatal depression prevalence, (b) screening accuracy, and (c) implementation of screening in clinical settings. Next, we used a participatory form of simulation modeling to estimate prevalence at a local level (a necessary first step to evaluating net benefit) and to explore alternative hypotheses through sensitivity analyses. We identified meta-analyses of prevalence and screening accuracy, as well as 14 screening studies with data sufficient to inform key questions.
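The step in which prevalence is combined with sensitivity and specificity to obtain expected screening outcomes can be illustrated with a minimal sketch. The function name and the input values below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def expected_screening_counts(n_screened, prevalence, sensitivity, specificity):
    """Expected 2x2 screening outcomes for a screened population.

    All inputs are illustrative assumptions, not values from the study.
    """
    cases = n_screened * prevalence          # people who truly have the condition
    non_cases = n_screened - cases           # people who truly do not
    true_pos = cases * sensitivity           # cases correctly flagged
    false_neg = cases - true_pos             # cases missed by the screen
    true_neg = non_cases * specificity       # non-cases correctly cleared
    false_pos = non_cases - true_neg         # non-cases incorrectly flagged
    return {"TP": true_pos, "FP": false_pos, "TN": true_neg, "FN": false_neg}


# Hypothetical inputs: 1,000 patients, 12% prevalence, Se = 0.80, Sp = 0.90
counts = expected_screening_counts(1000, 0.12, 0.80, 0.90)
print(counts)
```

Each of the four counts could then be linked to evidence on benefit (for example, treatment following a true positive) or harm (for example, stigma following a false positive).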
Simulation models estimated local prevalence as a function of positive screening rates and published estimates of sensitivity and specificity. These prevalence estimates displayed marked heterogeneity, including frequent implausible or impossible values (e.g., prevalence < 0%). Findings suggest that screening data are insufficient to estimate local prevalence and that sensitivity and specificity are not stable properties of screening questionnaires. Instead, study-level differences in context may be influential, such as variation in patients' willingness to disclose depression symptoms across settings. Results highlight the opportunity for simulation modeling to inform evidence synthesis and decision-making.
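The abstract does not give the model's equations. One standard way to express prevalence as a function of the positive screening rate, sensitivity, and specificity is to invert the identity positive rate = Se × prevalence + (1 − Sp) × (1 − prevalence), which gives prevalence = (positive rate + Sp − 1) / (Se + Sp − 1). The sketch below assumes that relationship; the function name and the numeric inputs are hypothetical, not the authors' code or data.

```python
def implied_prevalence(positive_rate, sensitivity, specificity):
    """Prevalence implied by an observed positive screening rate.

    Assumes positive_rate = Se * prev + (1 - Sp) * (1 - prev) and solves
    for prev. The result is deliberately NOT clipped to [0, 1], so that
    impossible values remain visible.
    """
    denominator = sensitivity + specificity - 1
    if denominator <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (positive_rate + specificity - 1) / denominator


# Hypothetical settings sharing the same assumed accuracy (Se = 0.80, Sp = 0.90)
for rate in (0.184, 0.10, 0.05):
    print(rate, round(implied_prevalence(rate, 0.80, 0.90), 3))
```

Under these assumed inputs, any positive screening rate below the assumed false positive rate (1 − Sp = 0.10) implies a negative prevalence: a 5% positive rate yields roughly −0.07. This shows how an impossible estimate can arise when a published specificity does not hold in the local setting.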