Scales Charles D, Norris Regina D, Peterson Bercedis L, Preminger Glenn M, Dahm Philipp
Department of Surgery (Division of Urology), Duke University Medical Center, Durham, North Carolina 27710, USA.
J Urol. 2005 Oct;174(4 Pt 1):1374-9. doi: 10.1097/01.ju.0000173640.91654.b5.
We provide a systematic assessment of the quality and accuracy of statistical reporting in the urology literature.
All original research publications with adult human subjects in a single issue (August 2004) of 4 leading urology journals were identified for formal review. A standardized evaluation form was developed in consultation with an experienced biostatistician and subsequently tested. Two independent reviewers with at least 1 year of formal training in research design and biostatistics who were blinded to authors and institutions reviewed each article. Discrepancies were settled by consensus and/or adjudication by the biostatistician.
Of the 169 articles screened, 97 met eligibility criteria for review. Cohort (43 of 97 or 44%) and cross-sectional (28 of 97 or 29%) designs accounted for the majority of these studies. Only 10 randomized clinical trials (12.4%) were identified. Statistical tests were identified in 83 studies (93%). Overall, 69 of 83 studies (71%) providing statistical comparisons had at least 1 statistical error, including use of the wrong test for the data type in 28%, inappropriate use of a parametric test in 22% and failure to account for multiple comparisons in 65%. In studies applying multivariate analysis (29%), overfitting the model with too many variables was the most common statistical flaw (39%).
This formal review suggests that statistical methods are often used inappropriately in the urology literature, potentially undermining the validity of study results and conclusions. An effort to raise awareness of appropriate statistical techniques through postgraduate education appears indicated.
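The most frequently reported error, failure to account for multiple comparisons, can be illustrated with a minimal sketch. The abstract does not say which correction the reviewers expected, so the Holm-Bonferroni step-down procedure is used here purely as one common option, and the p-values are hypothetical rather than taken from any reviewed study.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm-Bonferroni step-down procedure.

    Returns a list of booleans in the original order of p_values:
    True where the null hypothesis is rejected at family-wise level alpha.
    """
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # Once one test fails, all larger p-values also fail.
    return reject


# Hypothetical p-values from four comparisons within a single study.
p_values = [0.003, 0.04, 0.015, 0.30]

naive = [p <= 0.05 for p in p_values]                       # no correction
bonferroni = [p <= 0.05 / len(p_values) for p in p_values]  # plain Bonferroni
holm = holm_bonferroni(p_values)

print("uncorrected:", naive)
print("Bonferroni: ", bonferroni)
print("Holm:       ", holm)
```

With these illustrative values, three comparisons look significant without correction, one survives plain Bonferroni, and two survive the Holm procedure. This is the kind of difference that goes unreported when multiple testing is ignored.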