Nassar Elsa-Lynn, Levis Brooke, Neyer Marieke A, Rice Danielle B, Booij Linda, Benedetti Andrea, Thombs Brett D
Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.
Department of Psychiatry, McGill University, Montreal, Quebec, Canada.
Int J Methods Psychiatr Res. 2022 Jun;31(2):e1910. doi: 10.1002/mpr.1910. Epub 2022 Apr 1.
Depression screening tool accuracy studies should be conducted with sample sizes large enough to generate precise accuracy estimates. We assessed the proportion of recently published depression screening tool diagnostic accuracy studies that reported sample size calculations; the proportion that provided confidence intervals (CIs); and precision, based on the width and lower bounds of 95% CIs for sensitivity and specificity. In addition, we assessed whether these results had improved since a previous review of studies published in 2013-2015.
MEDLINE was searched from January 1, 2018, through May 21, 2021.
Twelve of 106 primary studies (11%) described a viable sample size calculation, which represented an improvement of 8% since the last review. Thirty-six studies (34%) provided reasonably accurate CIs. Of 103 studies where 95% CIs were provided or could be calculated, seven (7%) had sensitivity CI widths of ≤10%, whereas 58 (56%) had widths of ≥21%. Eighty-four studies (82%) had lower bounds of CIs <80% for sensitivity and 77 studies (75%) for specificity. These results were similar to those reported previously.
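To make the width and lower-bound criteria concrete, the following is a minimal sketch of how a 95% CI for sensitivity could be computed from a study's counts. It assumes the Wilson score method, which the abstract does not name as the method used in the review, and the counts in the usage line (40 of 50 cases detected) are hypothetical.

```python
import math


def wilson_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%).

    Assumption: the review may have used a different interval method;
    Wilson is chosen here only as a common, well-behaved option.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Hypothetical study: 50 participants with depression, 40 screened positive.
lower, upper = wilson_ci(40, 50)
width = upper - lower
print(f"sensitivity 95% CI: {lower:.3f} to {upper:.3f}, width {width:.3f}")
```

With only 50 cases, the interval in this hypothetical example is roughly 22 percentage points wide and its lower bound falls below 80%, which is the pattern the review reports for most studies.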
Few studies reported sample size calculations, and the number of included individuals in most studies was too small to generate reasonably precise accuracy estimates.
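As an illustration of what a sample size calculation for an accuracy study might look like, here is a minimal sketch based on the normal-approximation (Wald) formula for a target CI half-width. The formula choice, the function names, and the example inputs (expected sensitivity 0.85, half-width 0.05, prevalence 0.25) are assumptions for illustration and are not taken from the review.

```python
import math


def cases_needed(expected_sensitivity, half_width, z=1.96):
    """Number of participants with the condition needed so that the
    normal-approximation 95% CI for sensitivity has the given half-width."""
    if not 0 < expected_sensitivity < 1 or half_width <= 0:
        raise ValueError("sensitivity must be in (0, 1) and half_width > 0")
    p = expected_sensitivity
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def total_sample_needed(expected_sensitivity, half_width, prevalence, z=1.96):
    """Scale the required number of cases by the expected prevalence
    to get the total number of participants to recruit."""
    if not 0 < prevalence <= 1:
        raise ValueError("prevalence must be in (0, 1]")
    n_cases = cases_needed(expected_sensitivity, half_width, z)
    return math.ceil(n_cases / prevalence)


# Hypothetical planning inputs, chosen only for illustration.
print(cases_needed(0.85, 0.05))               # cases with depression required
print(total_sample_needed(0.85, 0.05, 0.25))  # total participants required
```

Under these assumed inputs, a CI width of 10 percentage points for sensitivity requires about 196 participants with depression, and correspondingly more participants overall when prevalence is low.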