Enhancing Molecular Property Prediction through Task-Oriented Transfer Learning: Integrating Universal Structural Insights and Domain-Specific Knowledge.

Affiliations

Xiangya School of Pharmaceutical Sciences, Central South University, Changsha Hunan 410013, P. R. China.

College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410013, P. R. China.

Publication information

J Med Chem. 2024 Jun 13;67(11):9575-9586. doi: 10.1021/acs.jmedchem.4c00692. Epub 2024 May 15.

Abstract

Precisely predicting molecular properties is crucial in drug discovery, but the scarcity of labeled data poses a challenge for applying deep learning methods. While large-scale self-supervised pretraining has proven an effective solution, it often neglects domain-specific knowledge. To tackle this issue, we introduce Task-Oriented Multilevel Learning based on BERT (TOML-BERT), a dual-level pretraining framework that considers both structural patterns and domain knowledge of molecules. TOML-BERT achieved state-of-the-art prediction performance on 10 pharmaceutical datasets. It can mine contextual information within molecular structures and extract domain knowledge from massive pseudo-labeled data. The dual-level pretraining achieved significant positive transfer, with its two components making complementary contributions. Interpretability analysis showed that the effectiveness of the dual-level pretraining lies in the prior learning of a task-related molecular representation. Overall, TOML-BERT demonstrates the potential of combining multiple pretraining tasks to extract task-oriented knowledge, advancing molecular property prediction in drug discovery.
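
The abstract does not spell out the two pretraining objectives, so the following is only an illustrative sketch of how the data for a dual-level scheme might be prepared. It assumes that the structural level uses BERT-style masked atom prediction and that the domain level uses labels produced by some existing predictor ("pseudo-labels"). The names `mask_atoms`, `pseudo_label`, and `toy_teacher`, the 15% mask rate, and the atom-list molecule encoding are all hypothetical and are not taken from the paper.

```python
import random

MASK = "[MASK]"  # placeholder token, assumed by analogy with BERT


def mask_atoms(atoms, mask_rate=0.15, seed=0):
    """Hide a random subset of atom tokens.

    Returns the masked sequence and a {position: original atom} dict
    that a model would be trained to recover (structural level).
    """
    if not atoms:
        return [], {}
    rng = random.Random(seed)
    # Always mask at least one atom so small molecules still give a signal.
    n_mask = max(1, round(len(atoms) * mask_rate))
    positions = sorted(rng.sample(range(len(atoms)), n_mask))
    masked = list(atoms)
    targets = {}
    for i in positions:
        targets[i] = masked[i]
        masked[i] = MASK
    return masked, targets


def pseudo_label(molecules, teacher):
    """Attach a teacher's prediction to each unlabeled molecule (domain level)."""
    return [(mol, teacher(mol)) for mol in molecules]


def toy_teacher(atoms):
    # Hypothetical stand-in for a real property predictor:
    # the fraction of non-carbon heavy atoms.
    return sum(a != "C" for a in atoms) / len(atoms)


# Heavy-atom lists used as a toy molecule encoding (an assumption).
ethanol = ["C", "C", "O"]
acetamide = ["C", "C", "O", "N"]

masked, targets = mask_atoms(ethanol)
pseudo = pseudo_label([ethanol, acetamide], toy_teacher)
```

In a scheme like this, a model would first be trained to recover the entries of `targets` from `masked`, then trained on the `pseudo` pairs, and only afterwards fine-tuned on the small labeled dataset for the actual task.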
