Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare.

Affiliation

Scripps Research Translational Institute and Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA 92037, USA. www.scripps.edu

Publication information

Pac Symp Biocomput. 2024;29:81-95.

Abstract

In the intricate landscape of healthcare analytics, effective feature selection is a prerequisite for generating robust predictive models, especially given the common challenges of sample sizes and potential biases. Zoish uniquely addresses these issues by employing Shapley additive values, an idea rooted in cooperative game theory, to enable both transparent and automated feature selection. Unlike existing tools, Zoish is versatile, designed to seamlessly integrate with an array of machine learning libraries including scikit-learn, XGBoost, CatBoost, and imbalanced-learn.

The distinct advantage of Zoish lies in its dual algorithmic approach for calculating Shapley values, allowing it to efficiently manage both large and small datasets. This adaptability renders it exceptionally suitable for a wide spectrum of healthcare-related tasks. The tool also places a strong emphasis on interpretability, providing comprehensive visualizations for analyzed features. Its customizable settings offer users fine-grained control over feature selection, thus optimizing for specific predictive objectives.

This manuscript elucidates the mathematical framework underpinning Zoish and how it uniquely combines local and global feature selection into a single, streamlined process. To validate Zoish's efficiency and adaptability, we present case studies in breast cancer prediction and Montreal Cognitive Assessment (MoCA) prediction in Parkinson's disease, along with evaluations on 300 synthetic datasets. These applications underscore Zoish's unparalleled performance in diverse healthcare contexts and against its counterparts.
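To make the game-theoretic idea in the abstract concrete, the following is a minimal sketch of Shapley-value-based feature ranking, not Zoish's actual implementation or API. It computes exact Shapley values by enumerating all coalitions, which is only practical for a handful of features. The feature names, the scores in `TOY_SCORES`, the `toy_value` function, and the helper `select_top_k` are all hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable player (feature) names.
    value:   function mapping a frozenset of players to a number.
    """
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                # Marginal contribution of p when joining coalition s.
                phi[p] += weight * (value(s | {p}) - value(s))
    return phi


def select_top_k(phi, k):
    """Rank features by absolute Shapley value and keep the k largest."""
    ranked = sorted(phi, key=lambda f: abs(phi[f]), reverse=True)
    return ranked[:k]


# Hypothetical toy game: the "value" of a feature subset stands in for a
# model score obtained with only those features (assumed numbers).
TOY_SCORES = {"age": 0.30, "tumor_size": 0.50, "noise": 0.0}


def toy_value(subset):
    score = sum(TOY_SCORES[f] for f in subset)
    # Assumed interaction bonus when both informative features are present.
    if "age" in subset and "tumor_size" in subset:
        score += 0.10
    return score


phi = shapley_values(list(TOY_SCORES), toy_value)
selected = select_top_k(phi, 2)
print(phi)
print(selected)
```

In this toy game the interaction bonus is split evenly between the two features that produce it, and the uninformative feature receives zero credit, so it is the one dropped. Real tools replace the exhaustive enumeration with approximations, since the number of coalitions grows exponentially with the number of features.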

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/12bf9c5e71fc/nihms-1952171-f0001.jpg
