Guo Gabe, Saidi Tristan Luca, Terban Maxwell W, Valsecchi Michele, Billinge Simon J L, Lipson Hod
Columbia University, Department of Computer Science, New York, NY, USA.
Stanford University, Department of Computer Science, Stanford, CA, USA.
Nat Mater. 2025 Apr 28. doi: 10.1038/s41563-025-02220-y.
A major challenge in materials science is the determination of the structure of nanometre-sized objects. Here we present an approach that uses a generative machine learning model, based on diffusion processes, that is trained on 45,229 known structures. The model factors in both the measured diffraction pattern and the relevant statistical priors on the unit cell of atomic cluster structures. Conditioned only on the chemical formula and the information-scarce, finite-size-broadened powder diffraction pattern, we find that our model, PXRDnet, can successfully solve simulated nanocrystals as small as 10 Å across 200 materials of varying symmetries and complexities, including structures from all seven crystal systems. We show that our model can successfully and verifiably determine structural candidates four out of five times, with an average error among these candidates of only 7% (as measured by the post-Rietveld refinement R-factor). Furthermore, PXRDnet is capable of solving structures from noisy diffraction patterns gathered in real-world experiments. We suggest that data-driven approaches, bootstrapped from theoretical simulation, will ultimately provide a path towards determining the structure of previously unsolved nanomaterials.
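For readers unfamiliar with the error measure quoted above, the following is a minimal sketch of one common form of the profile R-factor used to compare an observed and a calculated diffraction pattern. The abstract does not state which variant the authors use, so the weighted-profile definition here, the function name `weighted_profile_r`, and the default of unit weights are all assumptions made for illustration rather than the paper's method.

```python
import math


def weighted_profile_r(y_obs, y_calc, weights=None):
    """Weighted profile R-factor between two intensity profiles.

    Assumed definition (one common convention, not confirmed by the source):
        R = sqrt( sum_i w_i * (y_obs_i - y_calc_i)^2 / sum_i w_i * y_obs_i^2 )
    A value of 0.07 corresponds to the "7%" style of reporting.
    """
    if len(y_obs) != len(y_calc):
        raise ValueError("observed and calculated profiles must have equal length")
    if weights is None:
        # Assumption: unit weights when no per-point uncertainties are supplied.
        weights = [1.0] * len(y_obs)
    if len(weights) != len(y_obs):
        raise ValueError("weights must match the profile length")

    # Weighted sum of squared residuals between the two profiles.
    numerator = sum(w * (o - c) ** 2 for w, o, c in zip(weights, y_obs, y_calc))
    # Weighted sum of squared observed intensities, used for normalisation.
    denominator = sum(w * o ** 2 for w, o in zip(weights, y_obs))
    if denominator == 0:
        raise ValueError("observed profile has zero total weighted intensity")
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    # Toy two-point profiles, purely illustrative (not data from the paper).
    print(weighted_profile_r([3.0, 4.0], [3.0, 3.0]))
```

An identical pair of profiles gives 0, and larger values indicate a worse fit between the candidate structure's calculated pattern and the measured one.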