Transformers for Molecular Property Prediction: Lessons Learned from the Past Five Years.

Author information

Data Driven Drug Design, Center for Bioinformatics, Saarland University, Saarbrücken 66123, Germany.

BASF SE, Ludwigshafen 67056, Germany.

Publication information

J Chem Inf Model. 2024 Aug 26;64(16):6259-6280. doi: 10.1021/acs.jcim.4c00747. Epub 2024 Aug 13.

DOI: 10.1021/acs.jcim.4c00747

PMID: 39136669

Abstract

Molecular Property Prediction (MPP) is vital for drug discovery, crop protection, and environmental science. Over the last decades, diverse computational techniques have been developed, from using simple physical and chemical properties and molecular fingerprints in statistical models and classical machine learning to advanced deep learning approaches. In this review, we aim to distill insights from current research on employing transformer models for MPP. We analyze the currently available models and explore key questions that arise when training and fine-tuning a transformer model for MPP. These questions encompass the choice and scale of the pretraining data, optimal architecture selections, and promising pretraining objectives. Our analysis highlights areas not yet covered in current research, inviting further exploration to enhance the field's understanding. Additionally, we address the challenges in comparing different models, emphasizing the need for standardized data splitting and robust statistical analysis.

Similar articles

Transformers for Molecular Property Prediction: Lessons Learned from the Past Five Years.

J Chem Inf Model. 2024 Aug 26;64(16):6259-6280. doi: 10.1021/acs.jcim.4c00747. Epub 2024 Aug 13.

KnoMol: A Knowledge-Enhanced Graph Transformer for Molecular Property Prediction.

J Chem Inf Model. 2024 Oct 14;64(19):7337-7348. doi: 10.1021/acs.jcim.4c01092. Epub 2024 Sep 25.

Transfer Learning: Making Retrosynthetic Predictions Based on a Small Chemical Reaction Dataset Scale to a New Level.

Molecules. 2020 May 19;25(10):2357. doi: 10.3390/molecules25102357.

Quantum-Informed Molecular Representation Learning Enhancing ADMET Property Prediction.

J Chem Inf Model. 2024 Jul 8;64(13):5028-5040. doi: 10.1021/acs.jcim.4c00772. Epub 2024 Jun 25.

Large-Scale Distributed Training of Transformers for Chemical Fingerprinting.

J Chem Inf Model. 2022 Oct 24;62(20):4852-4862. doi: 10.1021/acs.jcim.2c00715. Epub 2022 Oct 4.

Multimodal Transformer for Property Prediction in Polymers.

ACS Appl Mater Interfaces. 2024 Apr 3;16(13):16853-16860. doi: 10.1021/acsami.4c01207. Epub 2024 Mar 19.

Algebraic graph-assisted bidirectional transformers for molecular property prediction.

Nat Commun. 2021 Jun 10;12(1):3521. doi: 10.1038/s41467-021-23720-w.

Pushing the Boundaries of Molecular Property Prediction for Drug Discovery with Multitask Learning BERT Enhanced by SMILES Enumeration.

Research (Wash D C). 2022 Dec 15;2022:0004. doi: 10.34133/research.0004. eCollection 2022.

A merged molecular representation learning for molecular properties prediction with a web-based service.

Sci Rep. 2021 May 26;11(1):11028. doi: 10.1038/s41598-021-90259-7.

Do it the transformer way: A comprehensive review of brain and vision transformers for autism spectrum disorder diagnosis and classification.

Comput Biol Med. 2023 Dec;167:107667. doi: 10.1016/j.compbiomed.2023.107667. Epub 2023 Nov 3.

Cited by

DFusMol: predicting molecular properties based on dual-channel attention.

Front Mol Biosci. 2025 Jul 30;12:1623620. doi: 10.3389/fmolb.2025.1623620. eCollection 2025.

Knowledge Distillation for Molecular Property Prediction: A Scalability Analysis.

Adv Sci (Weinh). 2025 Jun;12(22):e2503271. doi: 10.1002/advs.202503271. Epub 2025 Apr 9.

Deep Learning for Odor Prediction on Aroma-Chemical Blends.

ACS Omega. 2025 Mar 3;10(9):8980-8992. doi: 10.1021/acsomega.4c07078. eCollection 2025 Mar 11.

Positional embeddings and zero-shot learning using BERT for molecular-property prediction.

J Cheminform. 2025 Feb 5;17(1):17. doi: 10.1186/s13321-025-00959-9.

A review of large language models and autonomous agents in chemistry.

Chem Sci. 2024 Dec 9;16(6):2514-2572. doi: 10.1039/d4sc03921a. eCollection 2025 Feb 5.
