Identification of Potential PBT/POP-Like Chemicals by a Deep Learning Approach Based on 2D Structural Features.

Author information

Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China.

Department of Physical and Environmental Sciences, University of Toronto, 1265 Military Trail, Toronto, Ontario, Canada M1C 1A4.

Publication information

Environ Sci Technol. 2020 Jul 7;54(13):8221-8231. doi: 10.1021/acs.est.0c01437. Epub 2020 Jun 16.

Abstract

Identifying potential persistent organic pollutants (POPs) and persistent, bioaccumulative, and toxic (PBT) substances in industrial chemical inventories is essential for chemical risk assessment, management, and pollution control. Inspired by the connections between chemical structures and their properties, a deep convolutional neural network (DCNN) model was developed to screen for potential PBT/POP-like chemicals. For each chemical, a two-dimensional molecular descriptor representation matrix based on 2424 molecular descriptors was used as the model input. The DCNN model was trained via a supervised learning algorithm with 1306 PBT/POP-like chemicals and 9990 chemicals currently known as non-POPs/PBTs. On external data sets, the model achieved an average prediction accuracy of 95.3 ± 0.6% and an F-measure of 79.3 ± 2.5% for PBT/POP-like chemicals (positive samples only). The DCNN model was further evaluated with 52 experimentally determined PBT chemicals in the REACH PBT assessment list and correctly classified 47 of them as PBT or non-PBT. Applied to 58,079 chemicals merged from five published industrial chemical lists, the DCNN model flagged a total of 4011 suspected PBT/POP-like chemicals. The proportions of PBT/POP-like substances in the chemical inventories were 6.9-7.8%, higher than a previous estimate of 3-5%. Although additional PBT/POP chemicals were identified, no new family of PBT/POP-like chemicals was observed.
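The abstract reports both accuracy and an F-measure for the positive class, which matters because the training set is imbalanced (1306 positives against 9990 negatives). The following is a minimal sketch, not the authors' code, that uses only the counts quoted in the abstract to show why accuracy alone can mislead on such data and to check the quoted proportions. The function and variable names are illustrative assumptions, and the F-measure is assumed here to be the standard F1 score.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f_measure(tp: int, fp: int, fn: int) -> float:
    """F1 score: harmonic mean of precision and recall for the positive class."""
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Class sizes quoted in the abstract.
N_POS = 1306  # PBT/POP-like chemicals
N_NEG = 9990  # chemicals currently known as non-POPs/PBTs

# Hypothetical trivial classifier that labels every chemical as negative.
baseline_acc = accuracy(tp=0, tn=N_NEG, fp=0, fn=N_POS)
baseline_f = f_measure(tp=0, fp=0, fn=N_POS)

# Proportions derived directly from counts in the abstract.
reach_rate = 47 / 52          # REACH PBT list: correctly classified chemicals
flagged_share = 4011 / 58079  # share of the merged inventory flagged

print(f"all-negative baseline accuracy: {baseline_acc:.1%}")
print(f"all-negative baseline F-measure: {baseline_f:.1%}")
print(f"REACH list agreement: {reach_rate:.1%}")
print(f"flagged share of merged inventory: {flagged_share:.1%}")
```

A classifier that never predicts a positive would already score about 88% accuracy on this class split while having an F-measure of zero, so the F-measure is the more informative of the two figures for the PBT/POP-like class. The merged-list share computed here (about 6.9%) coincides with the lower end of the 6.9-7.8% range quoted for the individual inventories.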
