Bouyer J, Hémon D
Department of Epidemiological and Statistical Research on Environment and Health, Institut National de la Santé et de la Recherche Médicale (INSERM), Unit 170, Paris, France.
Am J Epidemiol. 1993 Feb 15;137(4):472-81. doi: 10.1093/oxfordjournals.aje.a116696.
A job exposure matrix consists of jobs on one axis and substances on the other, with the matrix elements describing the likelihood of an individual's exposure to a substance in a given job. This can be used in case-control studies to infer exposures of subjects whose jobs are known. The simplest form of job exposure matrix contains binary entries, but it is also possible to envisage continuous variables describing the probability of exposure in the job (probabilistic matrix). In such a case, the user has various options for transforming and analyzing the data, including the following: 1) transform to binary variables and analyze as conventional binary exposure variables; 2) leave as continuous variables and analyze using logistic regression; 3) leave as continuous variables and analyze using a linear model. Simulations were carried out to compare the ability of the three methods to estimate odds ratios under 36 experimental conditions. The linear model produced unbiased estimates, the logistic model produced somewhat biased estimates at high odds ratios, and the transformation to a binary variable produced systematically low estimates in most experimental circumstances. With the linear and logistic models, the odds ratio estimators had similar precision when the bias of the latter was not too great. The authors conclude that the linear model permits optimal use of a probabilistic matrix in an epidemiologic study and hope that these results will encourage the development of job exposure matrices containing probabilities rather than dichotomies.
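The contrast between option 1 and option 3 can be illustrated with a small, noise-free calculation. This is a sketch under stated assumptions, not the authors' simulation design: the job groups, control counts, cutoff, and true odds ratio are hypothetical, and the "linear model" is assumed here to mean that the odds of being a case rise linearly with the exposure probability p, i.e. odds(p) = alpha * (1 + (OR - 1) * p), which holds approximately when the disease is rare. Expected counts are used instead of random draws, and ordinary least squares stands in for a likelihood-based fit.

```python
# Hypothetical job groups: (exposure probability from the matrix, number of controls).
jobs = [(0.0, 400), (0.2, 300), (0.6, 200), (0.9, 100)]
TRUE_OR = 3.0  # assumed odds ratio for truly exposed vs. truly unexposed individuals


def expected_cases(p, n_controls, true_or, baseline_odds=1.0):
    """Expected number of cases in a job, assuming case odds are linear in p."""
    return baseline_odds * n_controls * (1.0 + (true_or - 1.0) * p)


def binary_or(jobs, true_or, cutoff=0.5):
    """Option 1: dichotomize p at `cutoff`, then take the crude 2x2 odds ratio."""
    exp_cases = exp_controls = unexp_cases = unexp_controls = 0.0
    for p, n in jobs:
        cases = expected_cases(p, n, true_or)
        if p >= cutoff:
            exp_cases += cases
            exp_controls += n
        else:
            unexp_cases += cases
            unexp_controls += n
    return (exp_cases / exp_controls) / (unexp_cases / unexp_controls)


def linear_or(jobs, true_or):
    """Option 3 (sketch): fit odds = a + b * p by least squares; OR = 1 + b / a."""
    xs = [p for p, _ in jobs]
    ys = [expected_cases(p, n, true_or) / n for p, n in jobs]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return 1.0 + slope / intercept


if __name__ == "__main__":
    print("dichotomized estimate:", round(binary_or(jobs, TRUE_OR), 3))
    print("linear-odds estimate: ", round(linear_or(jobs, TRUE_OR), 3))
```

With these made-up inputs the dichotomized estimate comes out at roughly 2.05 against an assumed true value of 3, because each of the two groups mixes truly exposed and truly unexposed workers, while the linear fit returns 3 up to rounding. The logistic option is left out because it needs an iterative fit; the sketch only shows the direction of the bias reported for the binary transformation.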