Pitard A, Viel J F
Department of Public Health, Faculty of Medicine, Besançon, France.
Stat Med. 1997 Mar 15;16(5):527-44. doi: 10.1002/(sici)1097-0258(19970315)16:5<527::aid-sim429>3.0.co;2-c.
The aim of this paper is to provide accurate estimation methods for regression models used in epidemiological time series to deduce quantitative morbidity relationships. Such models often include highly correlated variables (pollutant levels and climatic conditions) as well as lagged and unlagged values of the same variables (which are also highly collinear because of the stochastic dependency of consecutive measurements). We first describe some methods to detect and assess multicollinearity. We recall the drawbacks of the usual estimation methods and then, after briefly mentioning traditional solutions, explore three alternative methods that account for multicollinearity: Sclove's estimation; Almon's method; and a combination of Almon's method and a principal components procedure. We compare how well these methods yield efficient estimators on environmental epidemiological data (children's hospital admissions as the dependent variable, and unlagged and lagged values of outdoor temperature, SO2, NO and CO as explanatory variables).
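As an illustration of two ideas named in the abstract, the sketch below computes variance inflation factors for lagged copies of one series and then fits an Almon polynomial distributed lag, in which the lag coefficients are constrained to lie on a low-degree polynomial in the lag index. It is not the authors' implementation: the data are synthetic (an AR(1) series standing in for consecutive daily measurements), and the helper names `lag_matrix`, `vif` and `almon_fit`, the lag length and the polynomial degree are all assumptions chosen for the example.

```python
import numpy as np

def lag_matrix(x, max_lag):
    """Columns are x_t, x_{t-1}, ..., x_{t-max_lag}; the first max_lag rows are dropped."""
    n = len(x)
    return np.column_stack([x[max_lag - k : n - k] for k in range(max_lag + 1)])

def vif(X):
    """Variance inflation factors: the diagonal of the inverse correlation matrix."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))

def almon_fit(y, x, max_lag, degree):
    """Almon polynomial distributed lag: beta_k = sum_j a_j * k**j."""
    L = lag_matrix(x, max_lag)                     # shape (n - max_lag, max_lag + 1)
    k = np.arange(max_lag + 1)
    P = np.vander(k, degree + 1, increasing=True)  # P[k, j] = k**j
    Z = L @ P                                      # degree + 1 transformed regressors
    design = np.column_stack([np.ones(len(Z)), Z])
    coef, *_ = np.linalg.lstsq(design, y[max_lag:], rcond=None)
    intercept, a = coef[0], coef[1:]
    return intercept, P @ a                        # map polynomial coefficients back to lag coefficients

# Synthetic data (an assumption for this sketch, not the study data).
rng = np.random.default_rng(0)
n, max_lag = 2000, 4
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + rng.normal()           # AR(1): consecutive values are strongly dependent

beta_true = 0.5 - 0.02 * np.arange(max_lag + 1) ** 2   # lag weights on a quadratic in the lag index
y = np.zeros(n)
y[max_lag:] = 2.0 + lag_matrix(x, max_lag) @ beta_true \
    + rng.normal(scale=0.5, size=n - max_lag)

vifs = vif(lag_matrix(x, max_lag))                 # large values flag collinear lagged regressors
intercept_hat, beta_hat = almon_fit(y, x, max_lag, degree=2)

print("VIFs of lagged regressors:", np.round(vifs, 2))
print("true lag coefficients:    ", np.round(beta_true, 3))
print("Almon lag coefficients:   ", np.round(beta_hat, 3))
```

The polynomial constraint replaces the five collinear lag regressors with three transformed ones, which is what makes the estimate more stable. The price is that the result is only as good as the assumed polynomial shape, so the degree and lag length would need to be justified for real data.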