

Crystal Composition Transformer: Self-Learning Neural Language Model for Generative and Tinkering Design of Materials.

Authors

Wei Lai, Li Qinyang, Song Yuqi, Stefanov Stanislav, Dong Rongzhi, Fu Nihang, Siriwardane Edirisuriya M D, Chen Fanglin, Hu Jianjun

Affiliations

Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, 29201, USA.

Department of Computer Science, University of Southern Maine, Portland, ME, 04131, USA.

Publication

Adv Sci (Weinh). 2024 Sep;11(36):e2304305. doi: 10.1002/advs.202304305. Epub 2024 Aug 5.

Abstract

Self-supervised neural language models have recently achieved unprecedented success from natural language processing to learning the languages of biological sequences and organic molecules. These models have demonstrated superior performance in the generation, structure classification, and functional prediction of proteins and molecules with learned representations. However, most of the masking-based pre-trained language models are not designed for generative design, and their black-box nature makes it difficult to interpret their design logic. Here, a Blank-filling Language Model for Materials (BLMM) Crystal Transformer is proposed: a neural network-based probabilistic generative model for generative and tinkering design of inorganic materials. The model is built on the blank-filling language model for text generation and has demonstrated unique advantages in learning the "materials grammars" together with high-quality generation, interpretability, and data efficiency. It can generate chemically valid materials compositions with as high as 89.7% charge neutrality and 84.8% balanced electronegativity, more than four and eight times higher, respectively, than a pseudo-random sampling baseline. The probabilistic generation process of BLMM allows it to recommend materials tinkering operations based on learned materials chemistry, which makes it useful for materials doping. The model is applied to discover a set of new materials, validated using Density Functional Theory (DFT) calculations. This work thus brings unsupervised transformer language model based generative artificial intelligence to inorganic materials. A user-friendly web app for tinkering materials design has been developed and can be accessed freely at www.materialsatlas.org/blmtinker.
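The abstract reports two composition-validity metrics for generated formulas: charge neutrality and balanced electronegativity. The sketch below is an illustrative, self-contained example of how such checks can be computed; it is not the paper's implementation (checks of this kind are commonly done with full oxidation-state and electronegativity tables, e.g. via the SMACT package), and the abbreviated element tables and the `is_valid` helper are assumptions for demonstration only.

```python
"""Illustrative sketch (assumed, not the paper's code) of the two
composition-validity checks named in the abstract: charge neutrality
and balanced electronegativity."""
from itertools import product

# Abbreviated, assumed tables covering only a few elements.
OXIDATION_STATES = {
    "Li": [1], "Na": [1], "K": [1],
    "Mg": [2], "Ca": [2], "Sr": [2],
    "Ti": [2, 3, 4], "Fe": [2, 3], "Mn": [2, 3, 4],
    "O": [-2], "S": [-2, -1], "F": [-1], "Cl": [-1],
}
ELECTRONEGATIVITY = {  # Pauling scale
    "Li": 0.98, "Na": 0.93, "K": 0.82,
    "Mg": 1.31, "Ca": 1.00, "Sr": 0.95,
    "Ti": 1.54, "Fe": 1.83, "Mn": 1.55,
    "O": 3.44, "S": 2.58, "F": 3.98, "Cl": 3.16,
}


def is_valid(composition: dict[str, int]) -> tuple[bool, bool]:
    """Return (charge_neutral, electronegativity_balanced) for a
    composition given as {element: count}, e.g. {"Sr": 1, "Ti": 1, "O": 3}."""
    elements = list(composition)
    charge_neutral = False
    en_balanced = False
    # Enumerate every combination of allowed oxidation states.
    for states in product(*(OXIDATION_STATES[el] for el in elements)):
        total = sum(q * composition[el] for el, q in zip(elements, states))
        if total != 0:
            continue
        charge_neutral = True
        # Balanced electronegativity: every cation in this assignment
        # should be no more electronegative than every anion.
        cations = [el for el, q in zip(elements, states) if q > 0]
        anions = [el for el, q in zip(elements, states) if q < 0]
        if all(ELECTRONEGATIVITY[c] <= ELECTRONEGATIVITY[a]
               for c in cations for a in anions):
            en_balanced = True
            break
    return charge_neutral, en_balanced


if __name__ == "__main__":
    print(is_valid({"Sr": 1, "Ti": 1, "O": 3}))  # SrTiO3 -> (True, True)
    print(is_valid({"Li": 1, "Fe": 1, "O": 3}))  # LiFeO3 -> (False, False)
```

Under this kind of check, the abstract's 89.7% and 84.8% figures correspond to the fraction of generated compositions for which a charge-neutral and an electronegativity-balanced oxidation-state assignment exists, respectively.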


Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/490c/11423232/d50292ed4485/ADVS-11-2304305-g001.jpg
