McGee D, Reed D, Yano K
J Chronic Dis. 1984;37(9-10):713-9. doi: 10.1016/0021-9681(84)90040-7.
We present an example in which highly correlated variables are included in multivariate logistic analyses relating four nutrients to the 10-year incidence of coronary heart disease among more than 7000 men. The results are paradoxical for both inference and variable selection: for any particular variable, different models can be found in which it is and is not significantly related to coronary heart disease incidence, and step-up and step-down variable selection algorithms give drastically different results.
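As a rough illustration of why highly correlated predictors make coefficient estimates unstable, the sketch below computes the Pearson correlation between two predictors and the corresponding variance inflation factor (VIF). The VIF is a diagnostic borrowed from linear regression and is not mentioned in the abstract; the variable names and intake values are hypothetical and are not data from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF for either predictor in a two-predictor model: 1 / (1 - r^2).

    With exactly two predictors, the R^2 from regressing one on the other
    equals the squared Pearson correlation, so both share the same VIF.
    """
    r = pearson_r(xs, ys)
    return 1.0 / (1.0 - r * r)


# Hypothetical daily intakes for eight subjects (illustrative only).
total_calories = [2000, 2200, 2400, 2600, 2800, 3000, 3200, 3400]
fat_grams = [70, 78, 83, 92, 97, 106, 111, 120]

r = pearson_r(total_calories, fat_grams)
vif = vif_two_predictors(total_calories, fat_grams)
print(f"correlation = {r:.4f}, VIF = {vif:.1f}")
```

A VIF far above 1 means the sampling variance of each coefficient is inflated by roughly that factor relative to uncorrelated predictors, which is one way a variable can appear significant in one model and not in another.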