Crystal structure generation with autoregressive large language modeling.

Authors

Antunes Luis M, Butler Keith T, Grau-Crespo Ricardo

Affiliations

Department of Chemistry, University of Reading, Whiteknights, Reading, UK.

Department of Chemistry, University College London, London, UK.

Publication information

Nat Commun. 2024 Dec 6;15(1):10570. doi: 10.1038/s41467-024-54639-7.

Abstract

The generation of plausible crystal structures is often the first step in predicting the structure and properties of a material from its chemical composition. However, most current methods for crystal structure prediction are computationally expensive, slowing the pace of innovation. Seeding structure prediction algorithms with quality generated candidates can overcome a major bottleneck. Here, we introduce CrystaLLM, a methodology for the versatile generation of crystal structures, based on the autoregressive large language modeling (LLM) of the Crystallographic Information File (CIF) format. Trained on millions of CIF files, CrystaLLM focuses on modeling crystal structures through text. CrystaLLM can produce plausible crystal structures for a wide range of inorganic compounds unseen in training, as demonstrated by ab initio simulations. Our approach challenges conventional representations of crystals, and demonstrates the potential of LLMs for learning effective models of crystal chemistry, which will lead to accelerated discovery and innovation in materials science.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3b08/11624194/7ab919b86f28/41467_2024_54639_Fig1_HTML.jpg