Yang Xin, Sun Jianqiang, Jin Bingyu, Lu Yuer, Cheng Jinyan, Jiang Jiaju, Zhao Qi, Shuai Jianwei
School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan 114051, China; Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou 325001, China.
School of Information Science and Engineering, Linyi University, Linyi 276000, China.
J Adv Res. 2025 Feb;68:477-489. doi: 10.1016/j.jare.2024.06.002. Epub 2024 Jun 4.
As organic pollutants increasingly threaten the survival of aquatic organisms, studying the toxicity of organic compounds across diverse aquatic species is essential for environmental protection. Understanding how different species respond to these compounds helps assess the potential ecological impact of pollution on aquatic ecosystems as a whole. Compared with traditional experimental methods, deep learning methods offer higher accuracy in predicting aquatic toxicity, faster data processing, and better generalization.
This article presents ATFPGT-multi, a multi-task deep neural network model for predicting the toxicity of organic compounds.
The model integrates molecular fingerprints and molecular graphs to characterize molecules, enabling the simultaneous prediction of acute toxicity for the same organic compound across four distinct fish species. To validate the advantages of multi-task learning, we also construct a separate single-task prediction model, named ATFPGT-single, for each fish species. We employ cross-validation in our experiments to assess the performance and generalization ability of ATFPGT-multi.
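The multi-task setup described above can be illustrated with a small sketch of a masked training loss. This is not the authors' implementation: it assumes toxicity is framed as binary classification per species (suggested by the use of AUC), that a compound may lack a label for some species, and that missing labels are encoded as `None`. The function name `masked_multitask_bce` is hypothetical.

```python
import math
from typing import Optional, Sequence


def masked_multitask_bce(
    probs: Sequence[Sequence[float]],
    labels: Sequence[Sequence[Optional[int]]],
    eps: float = 1e-7,
) -> float:
    """Mean binary cross-entropy over every (compound, task) pair that has a label.

    probs[i][t]  : predicted probability that compound i is toxic to species t
    labels[i][t] : 1 (toxic), 0 (non-toxic), or None (no measurement; assumed encoding)
    """
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            if y is None:
                # No measurement for this species: the pair contributes nothing.
                continue
            p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
            total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
            count += 1
    if count == 0:
        raise ValueError("no labelled (compound, task) pairs")
    return total / count


# One compound, four species, two of which have no label.
loss = masked_multitask_bce([[0.5, 0.5, 0.5, 0.5]], [[1, None, 0, None]])
```

Because all tasks share one loss, a compound labelled for only some species still updates the shared molecular representation, which is one common motivation for multi-task learning on sparse toxicity data.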
The experimental results show that ATFPGT-multi outperforms ATFPGT-single on all four fish datasets, with AUC improvements of 9.8%, 4%, 4.8%, and 8.2%, respectively, demonstrating the advantage of multi-task learning over single-task learning. ATFPGT-multi also outperforms previously published methods, indicating higher accuracy and reliability in predicting aquatic toxicity. In addition, ATFPGT-multi uses attention scores to identify molecular fragments associated with fish toxicity in organic molecules, as illustrated by two example molecules in the main text, which supports the interpretability of the model.
In summary, ATFPGT-multi provides important support and reference for the further development of aquatic toxicity assessment. All code and datasets are freely available online at https://github.com/zhaoqi106/ATFPGT-multi.
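For readers who want to reproduce a per-species AUC comparison of this kind, the following is a minimal sketch of computing ROC AUC separately for each task. It is an illustration under assumptions, not the evaluation code from the repository: labels are assumed binary with `None` marking a missing measurement, and the helper names `roc_auc` and `per_task_auc` are hypothetical. AUC is computed as the fraction of (positive, negative) pairs ranked correctly, with ties counted as half.

```python
from typing import List, Optional, Sequence


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def per_task_auc(
    score_matrix: Sequence[Sequence[float]],
    label_matrix: Sequence[Sequence[Optional[int]]],
) -> List[float]:
    """Compute one AUC per task (column), skipping compounds without a label."""
    n_tasks = len(label_matrix[0])
    aucs = []
    for t in range(n_tasks):
        pairs = [
            (s_row[t], y_row[t])
            for s_row, y_row in zip(score_matrix, label_matrix)
            if y_row[t] is not None
        ]
        aucs.append(roc_auc([s for s, _ in pairs], [y for _, y in pairs]))
    return aucs


# Toy data: four compounds, two tasks; compound 3 has no label for task 2.
scores = [[0.9, 0.5], [0.1, 0.8], [0.7, 0.6], [0.3, 0.4]]
labels = [[1, 0], [0, 1], [1, None], [0, 1]]
aucs = per_task_auc(scores, labels)
```

Running the same function on the predictions of a multi-task model and of the corresponding single-task models gives the per-species AUC values whose differences can then be compared.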