Lin Xiaohan, Xia Yijie, Li Yanheng, Huang Yu-Peng, Liu Shuo, Zhang Jun, Gao Yi Qin
New Cornerstone Science Laboratory, Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, China.
School of Pharmacy, Lanzhou University, Lanzhou, China.
Nat Commun. 2025 Jul 1;16(1):6043. doi: 10.1038/s41467-025-61323-x.
Generating molecular structures with desired properties is a critical task in computer-aided drug and material design. As special 3D entities, molecules inherit non-trivial physical complexity, and many intrinsic properties may not be learnable through purely data-driven approaches, hindering the transfer of powerful generative artificial intelligence (GenAI) to this field. To avoid the heavy reliance of existing molecular GenAI on domain-specific models and priors, we derive theoretical guidelines that bridge the methodological gap between GenAI for images and for molecules, allowing pre-training of foundation models for 3D molecular generation. Difficulties due to symmetry, stability and entropy, which are critical for molecules, are overcome through a simple and model-agnostic training protocol. Moreover, we apply physics-informed strategies to force MolEdit, a pre-trained multimodal molecular GenAI, to obey physical laws and align with contextual preferences, thereby suppressing undesired model hallucinations. MolEdit generates valid molecules with comprehensive symmetry, strikes a better balance between configuration stability and conformer diversity, and supports complicated 3D scaffolds that frustrate other methods. Furthermore, MolEdit is applicable to zero-shot lead optimization and linker design following contextual and geometrical specifications. Collectively, as a foundation model, MolEdit offers flexibility and developability for AI-aided editing and manipulation of molecules serving various purposes.