Wu J D, Milton D K, Hammond S K, Spear R C
Center for Occupational and Environmental Health, School of Public Health, University of California, Berkeley 94720, USA.
Ann Occup Hyg. 1999 Jan;43(1):43-55.
This study explored the application of cluster analysis to the characterization of multiple exposures in industrial hygiene practice and compared exposure groupings derived from cluster analysis with those derived from the non-measurement-based approaches commonly used in epidemiology. Cluster analysis was performed for 37 workers simultaneously exposed to three agents (endotoxin, phenolic compounds and formaldehyde) in fiberglass insulation manufacturing. Several clustering algorithms, including complete-linkage (farthest-neighbor), single-linkage (nearest-neighbor), group-average and model-based clustering, were used to construct the tree structures from which clusters can be formed. The exposure clusters constructed by these algorithms differed from one another. When the exposure classification based on tree structures was contrasted with that based on non-measurement-based information, the clusters identified from the tree structures had little in common with the classifications from either the traditional exposure zone approach or the work group approach. For defining homogeneous exposure groups, or from the standpoint of health risk, some toxicological normalization of the components of the exposure vector appears to be required in order to form meaningful exposure groupings from cluster analysis. Finally, it remains important to determine whether the lack of correspondence between exposure groups based on epidemiological classification and those based on measurement data is a peculiarity of these data or a more general problem in multivariate exposure analysis.
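As a rough illustration of how the choice of linkage can change the resulting groups, the following minimal sketch builds complete-linkage, single-linkage and group-average trees with SciPy and cuts each into at most three clusters. Everything specific here is an assumption rather than the study's method: the exposure values are synthetic lognormal draws (the measured data are not given in the abstract), the log transform and per-agent standardization are a generic statistical scaling and not the toxicological normalization the abstract calls for, the three-cluster cut is arbitrary, and `rand_index` is a hypothetical helper. Model-based clustering is omitted.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's measurements):
# 37 workers x 3 agents (endotoxin, phenolics, formaldehyde).
exposures = rng.lognormal(mean=0.0, sigma=1.0, size=(37, 3))

# Assumed preprocessing: log-transform, then standardize each agent so
# that no single agent dominates the Euclidean distance.
logged = np.log(exposures)
scaled = (logged - logged.mean(axis=0)) / logged.std(axis=0, ddof=1)

# Condensed pairwise distance matrix between workers.
distances = pdist(scaled, metric="euclidean")

# Build one tree per linkage rule and cut it into at most 3 clusters.
labels = {}
for method in ("complete", "single", "average"):
    tree = linkage(distances, method=method)
    labels[method] = fcluster(tree, t=3, criterion="maxclust")


def rand_index(a, b):
    """Fraction of worker pairs on which two labelings agree
    (both place the pair together, or both place it apart)."""
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    upper = np.triu_indices(len(a), k=1)
    return float(np.mean(same_a[upper] == same_b[upper]))


for first, second in (("complete", "single"),
                      ("complete", "average"),
                      ("single", "average")):
    agreement = rand_index(labels[first], labels[second])
    print(f"{first} vs {second}: Rand index = {agreement:.2f}")
```

A Rand index near 1 means two linkage rules group the workers almost identically; lower values indicate the kind of disagreement between algorithms that the abstract reports. The same helper could be used to compare a measurement-based clustering with labels from an exposure zone or work group classification, if such labels were available.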