Machine learning descriptors in materials chemistry used in multiple experimentally validated studies: Oliynyk elemental property dataset.

Authors

Lee Sangjoon, Chen Clio, Garcia Griheydi, Oliynyk Anton

Affiliations

Department of Applied Physics and Applied Mathematics, Columbia University, New York, NY 10027, United States.

Department of Chemistry and Biochemistry, Manhattan College, Riverdale, NY 10471, United States.

Publication information

Data Brief. 2024 Feb 9;53:110178. doi: 10.1016/j.dib.2024.110178. eCollection 2024 Apr.

DOI:10.1016/j.dib.2024.110178
PMID:38384308
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10879764/
Abstract

Materials informatics employs data-driven approaches for the analysis and discovery of materials. Features, also referred to as descriptors, are essential for generating reliable and accurate machine-learning (ML) models. While general data can be obtained from public and commercial sources, features must be tailored to specific applications. Common featurizers suited to generic chemical problems may not be effective for feature-property mapping in solid-state materials with ML models. Here, we have assembled the Oliynyk property list for compositional feature generation, which performs well on limited datasets (50 to 1000 training data points) in the solid-state materials domain. The dataset contains 98 elemental features for atomic numbers 1 to 92, including thermodynamic properties, electronic structure data, size, electronegativity, and bulk properties such as melting point, density, and conductivity. The dataset has been used in peer-reviewed publications for predicting material hardness, classification, discovery of novel Heusler compounds, band gap prediction, and determining the site preference of atoms, using machine learning models including support vector machines, random forests for classification, and support vector regression for regression problems. We compiled the dataset by parsing data from publicly available databases and literature, and supplemented it by interpolating missing values with Gaussian process regression.
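
To make "compositional feature generation" concrete, here is a minimal Python sketch of one common way elemental properties can be turned into features for a compound. The abstract does not specify the aggregation scheme, so the composition-weighted mean, maximum, and minimum used here are assumptions, as are the helper names parse_formula and featurize. The small ELEMENT_PROPS table is illustrative only (atomic numbers and Pauling electronegativities for four elements) and is not taken from the Oliynyk dataset.

```python
import re

# Illustrative elemental table (NOT the Oliynyk dataset): atomic number and
# Pauling electronegativity for a handful of elements.
ELEMENT_PROPS = {
    "H": {"atomic_number": 1, "electronegativity": 2.20},
    "O": {"atomic_number": 8, "electronegativity": 3.44},
    "Ti": {"atomic_number": 22, "electronegativity": 1.54},
    "Fe": {"atomic_number": 26, "electronegativity": 1.83},
}


def parse_formula(formula):
    """Parse a simple formula such as 'Fe2O3' into {element: amount}.

    Hypothetical helper: parentheses, charges, and hydrates are not handled.
    """
    counts = {}
    for symbol, amount in re.findall(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", formula):
        counts[symbol] = counts.get(symbol, 0.0) + (float(amount) if amount else 1.0)
    return counts


def featurize(formula, props=ELEMENT_PROPS):
    """Return composition-weighted mean, max, and min of each elemental property.

    The choice of statistics is an assumption made for illustration.
    """
    counts = parse_formula(formula)
    total = sum(counts.values())
    features = {}
    for name in sorted(next(iter(props.values()))):
        # Raises KeyError if an element is missing from the property table.
        values = {el: props[el][name] for el in counts}
        features[f"{name}_mean"] = sum(counts[el] * v for el, v in values.items()) / total
        features[f"{name}_max"] = max(values.values())
        features[f"{name}_min"] = min(values.values())
    return features


if __name__ == "__main__":
    for key, value in featurize("Fe2O3").items():
        print(f"{key}: {value:.3f}")
```

With a full 98-property table in place of ELEMENT_PROPS, the same loop would yield a fixed-length feature vector per composition, which could then be passed to a model such as a support vector machine or random forest.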

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb6f/10879764/14229a28dcbf/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb6f/10879764/9c46fc939636/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb6f/10879764/4eff47263329/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb6f/10879764/e10f69147f8b/gr4.jpg
