Using hierarchical cluster models to systematically identify groups of jobs with similar occupational questionnaire response patterns to assist rule-based expert exposure assessment in population-based studies.

Author information

Friesen Melissa C, Shortreed Susan M, Wheeler David C, Burstyn Igor, Vermeulen Roel, Pronk Anjoeka, Colt Joanne S, Baris Dalsu, Karagas Margaret R, Schwenn Molly, Johnson Alison, Armenti Karla R, Silverman Debra T, Yu Kai

Affiliations

1. Occupational and Environmental Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, MD 20892, USA.

2. Biostatistics, Group Health Research Institute, Seattle, WA 98101-1448, USA.

Publication information

Ann Occup Hyg. 2015 May;59(4):455-66. doi: 10.1093/annhyg/meu101. Epub 2014 Dec 3.

DOI: 10.1093/annhyg/meu101

PMID: 25477475

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4385262/

Abstract

OBJECTIVES

Rule-based expert exposure assessment based on questionnaire response patterns in population-based studies improves the transparency of the decisions. The number of unique response patterns, however, can be nearly equal to the number of jobs. An expert may reduce the number of patterns that need assessment using expert opinion, but each expert may identify different patterns of responses that identify an exposure scenario. Here, hierarchical clustering methods are proposed as a systematic data reduction step to reproducibly identify similar questionnaire response patterns prior to obtaining expert estimates. As a proof-of-concept, we used hierarchical clustering methods to identify groups of jobs (clusters) with similar responses to diesel exhaust-related questions and then evaluated whether the jobs within a cluster had similar (previously assessed) estimates of occupational diesel exhaust exposure.

METHODS

Using the New England Bladder Cancer Study as a case study, we applied hierarchical cluster models to the diesel-related variables extracted from the occupational history and job- and industry-specific questionnaires (modules). Cluster models were separately developed for two subsets: (i) 5395 jobs with ≥1 variable extracted from the occupational history indicating a potential diesel exposure scenario, but without a module with diesel-related questions; and (ii) 5929 jobs with both occupational history and module responses to diesel-relevant questions. For each subset, we varied the number of clusters extracted from each model's cluster tree from 100 to 1000 groups of jobs. Using previously made estimates of the probability (ordinal), intensity (µg m⁻³ respirable elemental carbon), and frequency (hours per week) of occupational exposure to diesel exhaust, we examined the similarity of the exposure estimates for jobs within the same cluster in two ways. First, we examined each cluster's homogeneity (defined as >75% of its jobs having the same estimate) with respect to a dichotomized probability estimate (<5 versus ≥5%; <50 versus ≥50%). Second, for the ordinal probability metric and the continuous intensity and frequency metrics, we calculated the intraclass correlation coefficients (ICCs) between each job's estimate and the mean estimate for all jobs within the cluster.
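The clustering and homogeneity steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the distance measure or linkage rule, so the Hamming distance, the average linkage, the toy response vectors, and the function names `hamming`, `agglomerate`, and `homogeneous_fraction` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations

def hamming(a, b):
    """Fraction of questionnaire items on which two jobs differ (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def agglomerate(responses, n_clusters):
    """Naive average-linkage agglomerative clustering (assumed linkage).

    Merging stops once n_clusters groups remain, which corresponds to
    cutting the full cluster tree at that number of groups.
    Returns one cluster label per job.
    """
    clusters = [[i] for i in range(len(responses))]
    while len(clusters) > n_clusters:
        best = None
        for p, q in combinations(range(len(clusters)), 2):
            dists = [hamming(responses[i], responses[j])
                     for i in clusters[p] for j in clusters[q]]
            d = sum(dists) / len(dists)
            if best is None or d < best[0]:
                best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    labels = [0] * len(responses)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def homogeneous_fraction(labels, exposed, threshold=0.75):
    """Share of clusters where more than `threshold` of the jobs have the
    same dichotomized exposure estimate."""
    by_cluster = {}
    for label, e in zip(labels, exposed):
        by_cluster.setdefault(label, []).append(e)
    n_homogeneous = 0
    for members in by_cluster.values():
        top_count = Counter(members).most_common(1)[0][1]
        if top_count / len(members) > threshold:
            n_homogeneous += 1
    return n_homogeneous / len(by_cluster)

# Hypothetical toy data: six jobs, four yes/no diesel-related items each.
responses = [(1, 1, 0, 0), (1, 1, 0, 0), (1, 1, 1, 0),
             (0, 0, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)]
# Hypothetical dichotomized probability estimates (e.g. >=50% versus <50%).
exposed = [True, True, True, False, False, True]

labels = agglomerate(responses, n_clusters=2)
share = homogeneous_fraction(labels, exposed)
```

In this toy example the first three jobs form one cluster and the last three another. Only the first cluster passes the >75% rule, so half of the clusters count as homogeneous. The naive merge loop is cubic in the number of jobs and is meant only to make the logic visible; a dataset of several thousand jobs would call for an optimized library routine.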

RESULTS

Within-cluster homogeneity increased when more clusters were used. For example, ≥80% of the clusters were homogeneous when 500 clusters were used. Similarly, ICCs were generally above 0.7 when ≥200 clusters were used, indicating minimal within-cluster variability. The most within-cluster variability was observed for the frequency metric (ICCs from 0.4 to 0.8). We estimated that having an expert assign exposure at the cluster level and then review each job in non-homogeneous clusters would require ~2000 decisions per expert, in contrast to evaluating 4255 unique questionnaire patterns or 14983 individual jobs.
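The decision count in the last sentence can be read as one decision per cluster plus one decision per job in each non-homogeneous cluster. The sketch below shows that arithmetic on invented numbers. The function name `expert_decisions` and the toy cluster sizes are hypothetical, and the abstract does not give the breakdown behind the ~2000 figure.

```python
def expert_decisions(cluster_sizes, is_homogeneous):
    """Cluster-level decisions plus job-level reviews in non-homogeneous clusters."""
    cluster_level = len(cluster_sizes)
    job_level = sum(size for size, ok in zip(cluster_sizes, is_homogeneous)
                    if not ok)
    return cluster_level + job_level

# Hypothetical toy input: five clusters, one of which is not homogeneous.
sizes = [40, 25, 10, 15, 10]
homog = [True, True, False, True, True]

total = expert_decisions(sizes, homog)  # 5 cluster decisions + 10 job reviews
baseline = sum(sizes)                   # reviewing every job one by one
```

With these toy numbers the expert makes 15 decisions instead of 100. In a real assessment, which clusters are non-homogeneous would only become known during expert review, which is why the abstract presents its figure as an estimate.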

CONCLUSIONS

This proof-of-concept shows that using cluster models as a data reduction step to identify jobs with similar response patterns prior to obtaining expert ratings has the potential to aid rule-based assessment by systematically reducing the number of exposure decisions needed. While promising, additional research is needed to quantify the actual reduction in exposure decisions and the resulting homogeneity of exposure estimates within clusters for an exposure assessment effort that obtains cluster-level expert assessments as part of the assessment process.
