Zhang Duo, Liu Xinzijian, Zhang Xiangyu, Zhang Chengqian, Cai Chun, Bi Hangrui, Du Yiming, Qin Xuejian, Peng Anyang, Huang Jiameng, Li Bowen, Shan Yifan, Zeng Jinzhe, Zhang Yuzhi, Liu Siyuan, Li Yifan, Chang Junhan, Wang Xinyan, Zhou Shuo, Liu Jianchuan, Luo Xiaoshan, Wang Zhenyu, Jiang Wanrun, Wu Jing, Yang Yudi, Yang Jiyuan, Yang Manyi, Gong Fu-Qiang, Zhang Linshuang, Shi Mengchao, Dai Fu-Zhi, York Darrin M, Liu Shi, Zhu Tong, Zhong Zhicheng, Lv Jian, Cheng Jun, Jia Weile, Chen Mohan, Ke Guolin, Weinan E, Zhang Linfeng, Wang Han
AI for Science Institute, Beijing 100080, P. R. China.
DP Technology, Beijing 100080, P. R. China.
NPJ Comput Mater. 2024;10(1). doi: 10.1038/s41524-024-01493-2. Epub 2024 Dec 19.
The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct large-scale, long-duration simulations with the accuracy of electronic structure methods. However, the model generation process remains a bottleneck for large-scale applications. We propose a shift towards a model-centric ecosystem, wherein a large atomic model (LAM), pre-trained across multiple disciplines, can be efficiently fine-tuned and distilled for various downstream tasks, thereby establishing a new framework for molecular modeling. In this study, we introduce the DPA-2 architecture as a prototype for LAMs. Pre-trained on a diverse array of chemical and materials systems using a multi-task approach, DPA-2 demonstrates superior generalization capabilities across multiple downstream tasks compared to the traditional single-task pre-training and fine-tuning methodologies. Our approach sets the stage for the development and broad application of LAMs in molecular and materials simulation research.