Park Hyunsoo, Onwuli Anthony, Walsh Aron
Department of Materials, Imperial College London, London, UK.
Nat Commun. 2025 May 12;16(1):4379. doi: 10.1038/s41467-025-59636-y.
The vastness of chemical space presents a long-standing challenge for the exploration of new compounds with pre-determined properties. In materials science, crystal structure prediction has become a mature tool for mapping from composition to structure based on global optimisation techniques. Generative artificial intelligence now offers the means to efficiently navigate larger regions of crystal chemical space informed by structure-property datasets of materials. Here, we introduce a model, named Chemeleon, designed to generate chemical compositions and crystal structures by learning from both textual descriptions and three-dimensional structural data. The model employs denoising diffusion techniques for compound generation using textual inputs aligned with structural data via cross-modal contrastive learning. The potential of this approach is demonstrated for multi-component compound generation, including the Zn-Ti-O ternary space, and the prediction of stable phases in the Li-P-S-Cl quaternary space of relevance to solid-state batteries.
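The abstract does not specify the form of the contrastive objective. As a minimal sketch of what cross-modal contrastive alignment typically looks like, the block below computes a symmetric InfoNCE (CLIP-style) loss over a batch of paired text and structure embeddings. This is an illustration, not the Chemeleon implementation: the function names, the embedding inputs, and the temperature of 0.07 are all assumptions made for this example.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over a list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def contrastive_loss(text_emb, struct_emb, temperature=0.07):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    text_emb[i] and struct_emb[i] are assumed to describe the same
    compound; every other pairing in the batch acts as a negative.
    The temperature value is an assumed, commonly used default.
    """
    t = [l2_normalize(v) for v in text_emb]
    s = [l2_normalize(v) for v in struct_emb]
    n = len(t)

    # logits[i][j]: scaled cosine similarity of text i and structure j.
    logits = [
        [sum(a * b for a, b in zip(t[i], s[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    # Text -> structure direction: the matching structure is on the diagonal.
    loss_t2s = -sum(log_softmax(logits[i])[i] for i in range(n)) / n

    # Structure -> text direction: same computation on the transposed matrix.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_s2t = -sum(log_softmax(columns[j])[j] for j in range(n)) / n

    return 0.5 * (loss_t2s + loss_s2t)


# Toy usage: correctly paired embeddings score a lower loss than swapped ones.
text = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(text, [[1.0, 0.0], [0.0, 1.0]])
swapped = contrastive_loss(text, [[0.0, 1.0], [1.0, 0.0]])
```

Minimising a loss of this kind pulls the embedding of each textual description toward the embedding of its own structure and away from the other structures in the batch, which is the sense in which text and structure are "aligned" before the text embedding is used to condition a generative model.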