Luo Shulin, Xing Bangyu, Faizan Muhammad, Xie Jiahao, Zhou Kun, Zhao Ruoting, Li Tianshu, Wang Xinjiang, Fu Yuhao, He Xin, Lv Jian, Zhang Lijun
State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, School of Materials Science and Engineering, and Jilin Provincial International Cooperation Key Laboratory of High-Efficiency Clean Energy Materials, Jilin University, Changchun 130012, China.
State Key Laboratory of Superhard Materials, College of Physics, Jilin University, Changchun 130012, China.
J Phys Chem A. 2022 Jul 7;126(26):4300-4312. doi: 10.1021/acs.jpca.2c03416. Epub 2022 Jun 22.
Recognizing structure prototypes among the vast number of known inorganic crystal structures is an important task that benefits materials science research and new materials design. Existing databases of inorganic crystal structure prototypes were mostly constructed by classifying materials according to crystallographic space group information. Here, we employed a different strategy: we constructed an inorganic crystal structure prototype database by classifying materials according to their local atomic environments (LAEs) with an unsupervised machine learning method. Specifically, we applied hierarchical clustering to all experimentally known inorganic crystal structure data to identify structure prototypes. The clustering criterion is the LAE, represented by two state-of-the-art structure fingerprints: the improved bond-orientational order parameters and the smooth overlap of atomic positions (SOAP). This allowed us to build an LAE-based Inorganic Crystal Structure Prototype Database (LAE-ICSPD) containing 15,613 structure prototypes with defined stoichiometries. In addition, we developed the Structure Prototype Generator Infrastructure (SPGI) package, a toolkit for structure prototype generation. The SPGI toolkit and LAE-ICSPD are useful for surveying inorganic materials globally and for accelerating data-driven materials discovery.
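The clustering step described in the abstract can be illustrated with a minimal sketch. This is not the SPGI implementation: the fingerprint vectors below are made-up toy values standing in for LAE descriptors, the function name `cluster_fingerprints` and the distance threshold are hypothetical, and average linkage with a Euclidean metric is an assumption rather than the authors' stated choice. It only shows how agglomerative clustering of fingerprint vectors can group structures so that each cluster corresponds to one candidate prototype.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_fingerprints(fingerprints, threshold):
    """Group structures by fingerprint similarity (hypothetical helper).

    Structures are merged bottom-up with average linkage; the tree is
    then cut at `threshold`, so each flat cluster is one candidate
    structure prototype.
    """
    X = np.asarray(fingerprints, dtype=float)
    # Build the agglomerative merge tree on Euclidean distances
    # (metric and linkage method are assumptions for this sketch).
    Z = linkage(X, method="average", metric="euclidean")
    # Cut the tree: structures in the same flat cluster share a label.
    return fcluster(Z, t=threshold, criterion="distance")


# Toy fingerprint vectors, one row per structure (illustrative values only).
fingerprints = np.array([
    [0.57, 0.48, 0.10],  # structure A
    [0.58, 0.47, 0.11],  # structure B, nearly identical environment to A
    [0.19, 0.51, 0.63],  # structure C
    [0.20, 0.50, 0.62],  # structure D, nearly identical environment to C
    [0.90, 0.05, 0.30],  # structure E, unlike all others
])

labels = cluster_fingerprints(fingerprints, threshold=0.1)
n_prototypes = len(set(labels))
print("cluster labels:", labels)
print("number of candidate prototypes:", n_prototypes)
```

In this toy setting the threshold controls how coarse the prototype classification is: a larger cut distance merges more structures into each prototype, a smaller one splits them apart.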