Concato J, Feinstein A R, Holford T R
Yale University School of Medicine, New Haven, Connecticut.
Ann Intern Med. 1993 Feb 1;118(3):201-10. doi: 10.7326/0003-4819-118-3-199302010-00009.
To review the principles of multivariable analysis and to examine the application of multivariable statistical methods in general medical literature.
A computer-assisted search of articles in The Lancet and The New England Journal of Medicine identified 451 publications containing multivariable methods from 1985 through 1989. A random sample of 60 articles that used the two most common methods (logistic regression or proportional hazards analysis) was selected for more intensive review.
During review of the 60 randomly selected articles, the focus was on generally accepted methodologic guidelines that can prevent problems affecting the accuracy and interpretation of multivariable analytic results.
From 1985 to 1989, the relative frequency of multivariable statistical methods increased annually from about 10% to 18% among all articles in the two journals. In 44 (73%) of 60 articles using logistic or proportional hazards regression, risk estimates were quantified for individual variables ("risk factors"). Violations and omissions of methodologic guidelines in these 44 articles included overfitting of data; no test of conformity of variables to a linear gradient; no mention of pertinent checks for proportional hazards; no report of testing for interactions between independent variables; and unspecified coding or selection of independent variables. These problems would make the reported results potentially inaccurate, misleading, or difficult to interpret.
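As an illustration of how one of the listed problems, overfitting, might be screened for, the following minimal sketch computes the number of outcome events per candidate variable. The threshold of 10 is a commonly cited rule of thumb and is an assumption here, not a figure stated in this abstract; the function names and the example counts are hypothetical.

```python
def events_per_variable(n_events: int, n_nonevents: int, n_predictors: int) -> float:
    """Return events per variable (EPV) for a logistic model.

    For logistic regression the limiting sample size is the smaller of the
    two outcome groups. For a proportional hazards model, the number of
    events alone would be used instead (pass it as both counts).
    """
    if n_predictors <= 0:
        raise ValueError("n_predictors must be positive")
    return min(n_events, n_nonevents) / n_predictors


def may_be_overfit(n_events: int, n_nonevents: int, n_predictors: int,
                   threshold: float = 10.0) -> bool:
    """Flag a model whose EPV falls below the assumed rule-of-thumb threshold."""
    return events_per_variable(n_events, n_nonevents, n_predictors) < threshold


# Hypothetical study: 40 events, 160 non-events, 8 candidate predictors.
epv = events_per_variable(40, 160, 8)
print(f"EPV = {epv:.1f}, possible overfitting: {may_be_overfit(40, 160, 8)}")
```

A low value does not prove that a model is overfit; it only marks an analysis whose risk estimates deserve closer scrutiny.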
The findings suggest a need for improvement in the reporting, and perhaps the conduct, of multivariable analyses in medical research.