Transfer Learning with a Graph Attention Network and Weighted Loss Function for Screening of Persistent, Bioaccumulative, Mobile, and Toxic Chemicals.

Author information

Wang Haobo, Liu Wenjia, Chen Jingwen, Ji Shengshe

Affiliations

Key Laboratory of Industrial Ecology and Environmental Engineering (Ministry of Education), Dalian Key Laboratory on Chemicals Risk Control and Pollution Prevention Technology, School of Environmental Science and Technology, Dalian University of Technology, Dalian 116024, China.

Publication information

Environ Sci Technol. 2025 Jan 14;59(1):578-590. doi: 10.1021/acs.est.4c11085. Epub 2024 Dec 16.

Abstract

Methods for screening hazardous chemicals are necessary for sound management. Persistent, bioaccumulative, mobile, and toxic (PBMT) chemicals persist in the environment and have high mobility in aquatic environments, posing risks to human and ecological health. However, the lack of experimental data for the vast number of chemicals hinders the identification of PBMT chemicals. Through an extensive search of measured chemical mobility data, as well as persistent, bioaccumulative, and toxic (PBT) chemical inventories, this study constructed comprehensive data sets on PBMT chemicals. To address the limited volume of the PBMT chemical data set, a transfer learning (TL) framework based on a graph attention network (GAT) architecture was developed to construct models for screening PBMT chemicals, designating the PBT chemical inventories as source domains and the PBMT chemical data set as the target domain. A weighted loss function was proposed and shown to mitigate the negative impact of imbalanced data on model performance. Results indicate that the TL-GAT models outperformed GAT models trained without transfer learning, while also offering broad applicability-domain coverage and interpretability. The constructed models were employed to identify PBMT chemicals from inventories consisting of about 1 × 10 chemicals. The developed TL-GAT framework with the weighted loss function holds broad applicability across diverse tasks, especially those involving small and imbalanced data sets.
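The abstract does not give the form of the weighted loss function, so the following is only a minimal sketch of one common way to weight a loss against class imbalance: binary cross-entropy with inverse-frequency class weights. The function names `class_weights` and `weighted_bce` and the weighting rule `n / (2 * class_count)` are illustrative assumptions, not the authors' formulation.

```python
import math

def class_weights(labels):
    """Return (w_neg, w_pos) as inverse-frequency weights.

    Assumed rule (not from the paper): weight = n / (2 * class_count),
    so the rarer class receives the larger weight.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return n / (2 * n_neg), n / (2 * n_pos)

def weighted_bce(labels, probs, w_neg, w_pos, eps=1e-12):
    """Mean binary cross-entropy with a per-class weight on each term."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        if y == 1:
            total += -w_pos * math.log(p)
        else:
            total += -w_neg * math.log(1 - p)
    return total / len(labels)

# Hypothetical toy data: one positive (e.g. a PBMT chemical) among four.
labels = [1, 0, 0, 0]
probs = [0.5, 0.5, 0.5, 0.5]
w_neg, w_pos = class_weights(labels)
loss = weighted_bce(labels, probs, w_neg, w_pos)
```

With these weights, a confidently missed positive contributes more to the loss than an equally confident false alarm on a negative, which is the general mechanism by which a weighted loss counteracts imbalance.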
