Prerequisite for Imputing Non-detects among Airborne Samples in OSHA's IMIS Databank: Prediction of Sample's Volume.

Author information

Department of Environmental and Occupational Health, Dornsife School of Public Health, Drexel University, Nesbitt Hall Room 614, 3215 Market Street, Philadelphia, PA 19104, USA.

Chemical and Biological Hazards Prevention, Institut de recherche Robert-Sauvé en santé et en sécurité du travail, Montréal, Québec H3A 3C2, Canada.

Publication information

Ann Work Expo Health. 2023 Jul 6;67(6):744-757. doi: 10.1093/annweh/wxad017.

Abstract

INTRODUCTION

The US Integrated Management Information System (IMIS) contains workplace measurements collected by Occupational Safety and Health Administration (OSHA) inspectors. Its use for research is limited because no limit of detection (LOD) is recorded for non-detected measurements, although the LOD is needed to set the censoring point in statistical analysis. We aimed to remedy this by developing a predictive model of the volume of air sampled (V) for non-detected airborne measurements, so that the LOD can then be estimated from the instrument detection limit (IDL) as IDL/V.
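The relationship LOD = IDL/V can be illustrated with a minimal sketch. The function name, the units (IDL in micrograms per sample, V in litres, giving an LOD in mg/m^3) and the numeric values are assumptions chosen for illustration; none of them come from the study.

```python
def estimate_lod(idl_ug: float, volume_l: float) -> float:
    """Estimate the LOD as IDL / V.

    Assumed units: IDL in micrograms per sample and V in litres, so the
    result is in ug/l, which is numerically equal to mg/m^3.
    """
    if volume_l <= 0:
        raise ValueError("sampled air volume must be positive")
    return idl_ug / volume_l


# Hypothetical values: an IDL of 2 ug per sample and a predicted V of 400 l.
lod = estimate_lod(2.0, 400.0)
print(lod)  # 0.005
```

Because V is in the denominator, an under-predicted volume inflates the estimated LOD and an over-predicted volume deflates it, which is why the accuracy of the predicted V matters for the censoring point.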

METHODS

We obtained the Chemical Exposure Health Data from OSHA's central laboratory in Salt Lake City; this dataset partially overlaps IMIS and contains information on V. We used classification and regression trees (CART) to develop a predictive model of V for all measurements where the two datasets overlapped. The analysis was restricted to 69 chemical agents with at least 100 non-detected measurements and to records whose calculated sampling air flow rates were consistent with workplace measurement practices; undefined types of inspections were excluded, leaving 412,201 of 413,515 records. CART models were fitted on a randomly selected 70% of the data using 10-fold cross-validation and validated on the remaining 30%. A separate CART model was fitted to styrene data.
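The data partitioning described above (a random 70% for fitting with 10-fold cross-validation, the rest held out for validation) might be sketched as follows. This shows only the index bookkeeping, not the CART fit itself; the function name, the seed, and the record count are hypothetical, and the abstract does not say how the folds were used during fitting.

```python
import random


def split_and_folds(n_records, train_frac=0.7, n_folds=10, seed=0):
    """Return (folds, validation) as lists of record indices.

    `folds` partitions the training portion into `n_folds` disjoint groups;
    `validation` holds the indices set aside for final validation.
    """
    rng = random.Random(seed)       # fixed seed so the split is reproducible
    idx = list(range(n_records))
    rng.shuffle(idx)
    n_train = round(train_frac * n_records)
    train, validation = idx[:n_train], idx[n_train:]
    # Deal the shuffled training indices into folds round-robin.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return folds, validation


# Hypothetical record count, for illustration only.
folds, validation = split_and_folds(1000)
print(len(validation), [len(f) for f in folds])
```

In each cross-validation round one fold would be held back for scoring while the model is fitted on the other nine; the validation indices are touched only once, after model selection is complete.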

RESULTS

Sampled air volume had a right-skewed distribution with a mean of 357 l and a median (M) of 318 l, and ranged from 0.040 to 1868 l. There were 173,131 measurements described as non-detects (42% of the data). V tended to be greater for the non-detects (M = 378 l) than for measurements characterized as either 'short-term' (M = 218 l) or 'long-term' (M = 297 l). The CART models were complex and not easy to interpret, but substance, industry, and year were among the top three most important classifiers. They predicted V well overall (Pearson correlation (r) = 0.73, P < 0.0001; Lin's concordance correlation (rc) = 0.69) and among records captured as non-detects in IMIS (r = 0.66, P < 0.0001; rc = 0.60). For styrene, the CART built on measurements for all agents predicted V among 569 non-detects poorly (r = 0.15; rc = 0.04), but the styrene-specific CART predicted it well (r = 0.87, P < 0.0001; rc = 0.86).
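The two agreement measures reported above can be computed as in the following sketch, which uses the standard definitions of Pearson's r and Lin's concordance correlation with 1/n moments. The function name and the toy volumes are assumptions for illustration; they are not the study's code or data.

```python
from statistics import fmean


def pearson_and_ccc(observed, predicted):
    """Return (Pearson r, Lin's concordance correlation rc)."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    mx, my = fmean(observed), fmean(predicted)
    # Population (1/n) variances and covariance.
    sxx = sum((x - mx) ** 2 for x in observed) / n
    syy = sum((y - my) ** 2 for y in predicted) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(observed, predicted)) / n
    if sxx == 0 or syy == 0:
        raise ValueError("both sequences must vary")
    r = sxy / (sxx * syy) ** 0.5
    # rc also penalises differences in mean and scale between the two series.
    rc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    return r, rc


# Toy volumes in litres: predictions are shifted upward by a constant 50 l.
r, rc = pearson_and_ccc([100, 200, 300, 400], [150, 250, 350, 450])
print(round(r, 3), round(rc, 3))  # 1.0 0.909
```

A constant shift leaves r at 1 but lowers rc, which is why rc never exceeds the magnitude of r and sits slightly below it in each pair of values reported above.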

DISCUSSION

Among the limitations of our work is the fact that samples may have been collected on different workers and processes within each inspection, each with its own V. Furthermore, we lack measurement-level predictors because classifiers were captured at the inspection level. We did not study all substances that may be of interest and did not use the information that substances measured on the same sampling media should have the same V. We must note that CART models tend to over-fit data and that their predictions depend on the data selected, as illustrated by the contrast between predictions from the model built on all data and those from the model limited to styrene.

CONCLUSIONS

We developed predictive models of sampled air volume that should enable the calculation of LOD for non-detects in IMIS. Our predictions may guide future work on handling non-detects in IMIS, although it is advisable to develop separate predictive models for each substance, industry, and year of interest, while also considering other factors, such as whether the measurement evaluated long-term or short-term exposure.
