ToxACoL: an endpoint-aware and task-focused compound representation learning paradigm for acute toxicity assessment.
Author information
Lu Jiang, Wu Lianlian, Li Ruijiang, Wan Mengxuan, Yang Jun, Zan Peng, Bai Hui, He Song, Bo Xiaochen
Affiliations
Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin, People's Republic of China.
Department of Advanced & Interdisciplinary Biotechnology, Academy of Military Medical Sciences, Beijing, People's Republic of China.
Publication information
Nat Commun. 2025 Jul 1;16(1):5992. doi: 10.1038/s41467-025-60989-7.
Multi-species acute toxicity assessment forms the basis for chemical classification, labelling and risk management. Existing deep learning methods struggle with diverse experimental conditions, imbalanced data, and scarce target data, hindering their ability to reveal endpoint associations and accurately predict data-scarce endpoints. Here we propose a machine learning paradigm, Adjoint Correlation Learning, for multi-condition acute toxicity assessment (ToxACoL) to address these challenges. ToxACoL models endpoint associations via graph topology and achieves knowledge transfer via graph convolution. The adjoint correlation mechanism encodes compounds and endpoints synchronously, yielding endpoint-aware and task-focused representations. Comprehensive analyses demonstrate that ToxACoL yields 43%-87% improvements for data-scarce human endpoints, while reducing training data by 70% to 80%. Visualization of the learned top-level representation interprets structural alert mechanisms. Filled-in toxicity values highlight potential for extrapolating animal results to humans. Finally, we deploy ToxACoL as a free web platform for rapid prediction of multi-condition acute toxicities.
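The abstract describes modelling endpoint associations as a graph and transferring knowledge between endpoints via graph convolution. The sketch below illustrates that general idea only; it is not the authors' implementation. It assumes a standard GCN-style layer with symmetric normalization, a hypothetical 4-endpoint chain graph, random embeddings, and a dot-product readout between compound and endpoint embeddings. All names (`normalize_adjacency`, `graph_conv`, `endpoint_emb`, `compound_emb`) are illustrative and do not come from the paper.

```python
import numpy as np

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, as in a standard GCN layer."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees (always >= 1)
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def graph_conv(a_norm, h, w):
    """One graph convolution: mix each endpoint's features with its neighbours', then ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

rng = np.random.default_rng(0)
n_endpoints, d_in, d_out = 4, 8, 5

# Hypothetical endpoint graph (assumption): a chain linking related endpoints.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
a_norm = normalize_adjacency(adj)

# Random stand-ins for learned endpoint embeddings and layer weights.
endpoint_emb = rng.normal(size=(n_endpoints, d_in))
w = rng.normal(size=(d_in, d_out))
endpoint_out = graph_conv(a_norm, endpoint_emb, w)

# Random stand-ins for 3 compound embeddings; the dot-product readout
# (one score per compound-endpoint pair) is an assumption of this sketch.
compound_emb = rng.normal(size=(3, d_out))
predictions = compound_emb @ endpoint_out.T
print(predictions.shape)
```

In this toy setup, a data-scarce endpoint borrows information from its graph neighbours because its output embedding is a weighted average of their features. How ToxACoL actually builds the endpoint graph and couples the compound and endpoint encoders is described in the paper itself, not here.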