Interpretation of Machine Learning Models for Data Sets with Many Features Using Feature Importance.

Author information

Kaneko Hiromasa

Affiliation

Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan.

Publication information

ACS Omega. 2023 Jun 14;8(25):23218-23225. doi: 10.1021/acsomega.3c03722. eCollection 2023 Jun 27.

Abstract

Feature importance (FI) is used to interpret the machine learning model y = f(x) constructed between the explanatory variables or features, x, and the objective variables, y. For a large number of features, interpreting the model in the order of increasing FI is inefficient when there are similarly important features. Therefore, in this study, a method is developed to interpret models by considering the similarities between the features in addition to the FI. The cross-validated permutation feature importance (CVPFI), which can be calculated using any machine learning method and can handle multicollinearity problems, is used as the FI, while the absolute correlation and maximal information coefficients are used as metrics of feature similarity. Machine learning models could be effectively interpreted by considering the features from the Pareto fronts, where CVPFI is large and the feature similarity is small. Analyses of actual molecular and material data sets confirm that the proposed method enables the accurate interpretation of machine learning models.
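
The Pareto-front selection mentioned in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the function name `pareto_front`, the exact dominance rule, and all numbers are assumptions made for illustration. It assumes that each feature already has one importance score (standing in for CVPFI) and one similarity score (for example, its absolute correlation with a feature that has already been examined), and it keeps the features for which no other feature is both at least as important and at least as dissimilar.

```python
def pareto_front(importance, similarity):
    """Return indices of features not dominated by any other feature.

    Feature j dominates feature i when j is at least as important and at
    most as similar, and strictly better on one of the two criteria.
    This dominance rule is an assumption, not taken from the paper.
    """
    n = len(importance)
    front = []
    for i in range(n):
        dominated = any(
            importance[j] >= importance[i]
            and similarity[j] <= similarity[i]
            and (importance[j] > importance[i] or similarity[j] < similarity[i])
            for j in range(n)
            if j != i
        )
        if not dominated:
            front.append(i)
    return front


# Hypothetical scores for five features (not results from the paper).
cvpfi = [0.40, 0.35, 0.10, 0.30, 0.05]       # larger is better
similarity = [0.90, 0.20, 0.10, 0.60, 0.50]  # smaller is better

front = pareto_front(cvpfi, similarity)
print(front)  # [0, 1, 2]
```

With these made-up values, features 3 and 4 are dropped because feature 1 is both more important and less similar, so only features 0, 1, and 2 would be examined first.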

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6575/10308517/48891069e9ba/ao3c03722_0002.jpg
