School of Environmental Engineering, University of Seoul, Seoul 02504, Republic of Korea.
Department of Statistics, University of Seoul, Seoul 02504, Republic of Korea.
J Hazard Mater. 2024 Oct 5;478:135446. doi: 10.1016/j.jhazmat.2024.135446. Epub 2024 Aug 6.
This study aimed to screen the inhalation toxicity of chemicals found in consumer products such as air fresheners, fragrances, and anti-fogging agents submitted to K-REACH, using machine learning models. We manually curated inhalation toxicity data based on OECD test guidelines 403 (acute inhalation), 412 (sub-acute inhalation), and 413 (sub-chronic inhalation) for 1709 chemicals from the OECD eChemPortal database. Machine learning models were trained using ten algorithms combined with four molecular fingerprints (MACCS, Morgan, Topo, RDKit) and molecular descriptors, achieving F1 scores ranging from 51% to 91% on the test dataset. Using the high-performing models, we conducted a virtual screening of chemicals, first applying the models to data-rich chemicals generally used in occupational settings to determine the prediction uncertainty. The results showed high sensitivity (75%) but low specificity (23%), suggesting that the models tend to over-predict toxicity and can therefore contribute to conservative screening of chemicals. Subsequently, we applied the models to consumer product chemicals and identified 79 as being of high concern. Most of the prioritized chemicals lacked GHS classifications related to inhalation toxicity, even though they were predicted to be used in many consumer products. This study highlights a potential regulatory blind spot concerning the inhalation risk of consumer product chemicals, while also indicating the potential of artificial intelligence (AI) models to aid in prioritizing chemicals at the screening level.
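The three performance measures quoted above (F1 score, sensitivity, specificity) all derive from the same confusion-matrix counts. The following is a minimal sketch of how they relate, not the authors' code. The names `ConfusionCounts` and `count_outcomes` are hypothetical, and the counts in `example` are illustrative values chosen only so that they reproduce a 75% sensitivity and 23% specificity; they are not data from the study.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (1 = toxic, 0 = non-toxic)."""
    tp: int  # toxic chemicals predicted toxic
    fp: int  # non-toxic chemicals predicted toxic
    tn: int  # non-toxic chemicals predicted non-toxic
    fn: int  # toxic chemicals predicted non-toxic


def count_outcomes(y_true: Sequence[int], y_pred: Sequence[int]) -> ConfusionCounts:
    """Tally the four outcomes from paired true and predicted labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return ConfusionCounts(tp=tp, fp=fp, tn=tn, fn=fn)


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly toxic chemicals that are flagged (recall)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly non-toxic chemicals that are cleared."""
    return c.tn / (c.tn + c.fp)


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    return 2 * c.tp / (2 * c.tp + c.fp + c.fn)


# Illustrative counts only (assumption): 100 toxic and 100 non-toxic chemicals.
example = ConfusionCounts(tp=75, fp=77, tn=23, fn=25)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.2f}")
    print(f"specificity = {specificity(example):.2f}")
    print(f"F1          = {f1_score(example):.2f}")
```

With counts like these, most non-toxic chemicals are flagged as toxic (many false positives), which is the sense in which a high-sensitivity, low-specificity model screens conservatively.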