Prediction of the Critical Temperature of Superconductors Based on Two-Layer Feature Selection and the Optuna-Stacking Ensemble Learning Model.

Authors

Yu Jiahao, Zhao Yongman, Pan Rongshun, Zhou Xue, Wei Zikai

Affiliations

School of Mechanical and Electrical Engineering, Shihezi University, Shihezi 832003, China.

Key Laboratory of Modern Agricultural Machinery, Shihezi University, Shihezi 832003, China.

Publication information

ACS Omega. 2023 Jan 13;8(3):3078-3090. doi: 10.1021/acsomega.2c06324. eCollection 2023 Jan 24.

DOI: 10.1021/acsomega.2c06324
PMID: 36713747
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9878623/
Abstract

The critical temperature (Tc) of superconductors has long been a subject of interest. This study proposes a method that combines two-layer feature selection (TL) with an Optuna-Stacking ensemble learning model to predict Tc from physicochemical components. Most machine-learning models require a large amount of prior knowledge to construct the feature vectors associated with Tc manually, so those vectors may contain redundant or invalid features that adversely affect the analysis and prediction of Tc. The TL model combines the advantages of filter and wrapper feature selection. In the first layer, feature importance is ranked by SHapley Additive exPlanations (SHAP) in combination with CatBoost, and the maximum mutual information coefficient (MIC) and distance correlation coefficient (DCC) are then used for initial feature selection based on that ranking. The second layer uses a cross-validation-based genetic algorithm (cv-GA) to eliminate the remaining redundant or invalid features. The selected features are fed into the Stacking ensemble learning model to predict Tc, and the multidimensional hyperparameter optimization of the metamodel is carried out with Optuna, an improved Bayesian hyperparameter optimization framework based on the Tree-structured Parzen Estimator (TPE) and a pruning strategy. The model shows clear advantages and generality in prediction performance and feature reduction rate, and it also proves suitable for predicting Tc of high-temperature superconductors. It provides an efficient and cost-effective method for data-driven superconductor research.
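To make the first-layer filtering idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes an importance ranking is already available (in the paper this comes from SHAP with CatBoost) and uses only the sample distance correlation to discard a feature that is strongly dependent on a higher-ranked one. The function names, the greedy rule, the 0.9 threshold, and the toy data are all assumptions made for illustration; the abstract does not state the exact selection rule, and the MIC step is omitted here.

```python
from math import sqrt


def _centered_distances(values):
    """Return the double-centered pairwise distance matrix of a 1-D sample."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_mean = [sum(row) / n for row in dist]  # equals column means (symmetric)
    grand_mean = sum(row_mean) / n
    return [
        [dist[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equal-length 1-D samples (0 to 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2
    dvar_x = sum(v * v for row in a for v in row) / n**2
    dvar_y = sum(v * v for row in b for v in row) / n**2
    denom = sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0  # a constant feature carries no dependence information
    return sqrt(max(dcov2, 0.0) / denom)


def drop_redundant(features, ranking, threshold=0.9):
    """Greedy filter (hypothetical rule): walk the ranking from most to least
    important and keep a feature only if its distance correlation with every
    already-kept feature is below the threshold."""
    kept = []
    for name in ranking:
        if all(
            distance_correlation(features[name], features[k]) < threshold
            for k in kept
        ):
            kept.append(name)
    return kept


# Toy data, made up for illustration (not taken from the paper).
features = {
    "a": [1, 2, 3, 4, 5, 6, 7, 8],
    "b": [2, 4, 6, 8, 10, 12, 14, 16],  # exact linear copy of "a"
    "c": [1, -1, 1, -1, 1, -1, 1, -1],  # alternating, weakly related to "a"
}
ranking = ["a", "b", "c"]  # stands in for a SHAP-based importance order
selected = drop_redundant(features, ranking)
print(selected)
```

On this toy data the filter keeps "a" and "c" and drops "b", because "b" is a linear copy of the higher-ranked "a" and so has a distance correlation of 1 with it. In a fuller pipeline the surviving features would then go to the second-layer wrapper search (cv-GA in the paper) before model training.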

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/ad1e37571dd1/ao2c06324_0007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/a7d6072ee9ab/ao2c06324_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/4f7bad73f8ab/ao2c06324_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/9ab3c11512e1/ao2c06324_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/246a34ac04dd/ao2c06324_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/da13/9878623/de577db794a3/ao2c06324_0006.jpg

Similar articles

1
Prediction of the Critical Temperature of Superconductors Based on Two-Layer Feature Selection and the Optuna-Stacking Ensemble Learning Model.
ACS Omega. 2023 Jan 13;8(3):3078-3090. doi: 10.1021/acsomega.2c06324. eCollection 2023 Jan 24.
2
Tree-Based Machine Learning Models with Optuna in Predicting Impedance Values for Circuit Analysis.
Micromachines (Basel). 2023 Jan 20;14(2):265. doi: 10.3390/mi14020265.
3
A new hybrid ensemble machine-learning model for severity risk assessment and post-COVID prediction system.
Math Biosci Eng. 2022 Apr 13;19(6):6102-6123. doi: 10.3934/mbe.2022285.
4
Combining handcrafted features with latent variables in machine learning for prediction of radiation-induced lung damage.
Med Phys. 2019 May;46(5):2497-2511. doi: 10.1002/mp.13497. Epub 2019 Apr 8.
5
Stratification of malignant renal neoplasms from cystic renal lesions using deep learning and radiomics features based on a stacking ensemble CT machine learning algorithm.
Front Oncol. 2022 Oct 25;12:1028577. doi: 10.3389/fonc.2022.1028577. eCollection 2022.
6
pSuc-FFSEA: Predicting Lysine Succinylation Sites in Proteins Based on Feature Fusion and Stacking Ensemble Algorithm.
Front Cell Dev Biol. 2022 May 24;10:894874. doi: 10.3389/fcell.2022.894874. eCollection 2022.
7
A novel hybrid feature selection strategy in quantitative analysis of laser-induced breakdown spectroscopy.
Anal Chim Acta. 2019 Nov 8;1080:35-42. doi: 10.1016/j.aca.2019.07.012. Epub 2019 Jul 9.
8
Diabetic Retinopathy Prediction by Ensemble Learning Based on Biochemical and Physical Data.
Sensors (Basel). 2021 May 25;21(11):3663. doi: 10.3390/s21113663.
9
A Hybrid Feature Selection and Ensemble Approach to Identify Depressed Users in Online Social Media.
Front Psychol. 2022 Jan 18;12:802821. doi: 10.3389/fpsyg.2021.802821. eCollection 2021.
10
Ensemble Learning for Early-Response Prediction of Antidepressant Treatment in Major Depressive Disorder.
J Magn Reson Imaging. 2020 Jul;52(1):161-171. doi: 10.1002/jmri.27029. Epub 2019 Dec 20.
