MPCD: A Multitask Graph Transformer for Molecular Property Prediction by Integrating Common and Domain Knowledge.

Authors

Yang Xixi, Duan Yanjing, Cheng Zhixiang, Li Kun, Liu Yuansheng, Zeng Xiangxiang, Cao Dongsheng

Affiliations

College of Computer Science and Electronic Engineering, Hunan University, Changsha 410086, Hunan, China.

Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China.

Publication information

J Med Chem. 2024 Dec 12;67(23):21303-21316. doi: 10.1021/acs.jmedchem.4c02193. Epub 2024 Dec 2.

Abstract

Molecular property prediction with deep learning often employs self-supervised learning techniques to learn common knowledge through masked atom prediction. However, the common knowledge gained by masked atom prediction dramatically differs from the graph-level optimization objective of downstream tasks, which results in suboptimal problems. Particularly for properties with limited data, the failure to consider domain knowledge results in a direct search in an immense common space, rendering it infeasible to identify the global optimum. To address this, we propose MPCD, which enhances pretraining transferability by aligning the optimization objectives between pretraining and fine-tuning with domain knowledge. MPCD also leverages multitask learning to improve data utilization and model robustness. Technically, MPCD employs a relation-aware self-attention mechanism to capture molecules' local and global structures comprehensively. Extensive validation demonstrates that MPCD outperforms state-of-the-art methods for absorption, distribution, metabolism, excretion, and toxicity (ADMET) and physicochemical prediction across various data sizes.
