
Zoish: A Novel Feature Selection Approach Leveraging Shapley Additive Values for Machine Learning Applications in Healthcare.

Affiliation

Scripps Research Translational Institute, and Department of Integrative Structural and Computational Biology, Scripps Research, La Jolla, CA 92037, USA

Publication information

Pac Symp Biocomput. 2024;29:81-95.

PMID:38160271
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10764073/
Abstract

In the intricate landscape of healthcare analytics, effective feature selection is a prerequisite for generating robust predictive models, especially given the common challenges of sample sizes and potential biases. Zoish uniquely addresses these issues by employing Shapley additive values, an idea rooted in cooperative game theory, to enable both transparent and automated feature selection. Unlike existing tools, Zoish is versatile, designed to seamlessly integrate with an array of machine learning libraries including scikit-learn, XGBoost, CatBoost, and imbalanced-learn.

The distinct advantage of Zoish lies in its dual algorithmic approach for calculating Shapley values, allowing it to efficiently manage both large and small datasets. This adaptability renders it exceptionally suitable for a wide spectrum of healthcare-related tasks. The tool also places a strong emphasis on interpretability, providing comprehensive visualizations for analyzed features. Its customizable settings offer users fine-grained control over feature selection, thus optimizing for specific predictive objectives.

This manuscript elucidates the mathematical framework underpinning Zoish and how it uniquely combines local and global feature selection into a single, streamlined process. To validate Zoish's efficiency and adaptability, we present case studies in breast cancer prediction and Montreal Cognitive Assessment (MoCA) prediction in Parkinson's disease, along with evaluations on 300 synthetic datasets. These applications underscore Zoish's unparalleled performance in diverse healthcare contexts and against its counterparts.
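To make the game-theoretic idea concrete, the following is a minimal sketch of exact Shapley values used to rank and select features. It is not Zoish's API or its algorithm: the function names (`shapley_values`, `select_top_k`), the feature names, and the toy scoring function `toy_score` are all hypothetical, chosen only for illustration. Each feature is treated as a "player", and the payoff of a feature subset stands in for a model's score when trained on that subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a cooperative game.

    players: list of hashable feature names.
    value:   function mapping a frozenset of players to a float payoff.
    Enumerates every subset, so it is only practical for a handful of features.
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when joining coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def select_top_k(phi, k):
    """Return the k features with the largest absolute Shapley value."""
    return sorted(phi, key=lambda f: abs(phi[f]), reverse=True)[:k]


def toy_score(subset):
    """Hypothetical payoff: a made-up 'model score' for a feature subset."""
    score = 0.0
    if "age" in subset:
        score += 0.30
    if "tumor_size" in subset:
        score += 0.40
    if "age" in subset and "tumor_size" in subset:
        score += 0.10  # interaction, shared equally between the two features
    return score  # "noise" never changes the score


features = ["age", "tumor_size", "noise"]
phi = shapley_values(features, toy_score)
selected = select_top_k(phi, 2)
print(phi, selected)
```

In this toy game the interaction bonus is split evenly between `age` and `tumor_size`, the uninformative `noise` feature receives a value of zero, and the values sum to the score of the full feature set. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximate or model-specific Shapley estimators instead.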


Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/12bf9c5e71fc/nihms-1952171-f0001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/f56aa958aaa8/nihms-1952171-f0002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/df566be16416/nihms-1952171-f0003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/a434753cd09d/nihms-1952171-f0004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b4cd/10764073/ba29fcb26512/nihms-1952171-f0005.jpg

