
Principles and Practice of Explainable Machine Learning

Authors

Vaishak Belle, Ioannis Papantonis

Affiliations

School of Informatics, University of Edinburgh, Edinburgh, United Kingdom.

Alan Turing Institute, London, United Kingdom.

Publication

Front Big Data. 2021 Jul 1;4:688969. doi: 10.3389/fdata.2021.688969. eCollection 2021.

Abstract

Artificial intelligence (AI) provides many opportunities to improve private and public life. Discovering patterns and structures in large troves of data in an automated manner is a core component of data science, and currently drives applications in diverse areas such as computational biology, law, and finance. However, this highly positive impact is coupled with a significant challenge: how do we understand the decisions suggested by these systems so that we can trust them? In this report, we focus specifically on data-driven methods (machine learning (ML) and pattern recognition models in particular) in order to survey and distill the results and observations from the literature. The purpose of this report can be especially appreciated by noting that ML models are increasingly deployed in a wide range of businesses. However, with the increasing prevalence and complexity of methods, business stakeholders have, at the very least, a growing number of concerns about the drawbacks of models, data-specific biases, and so on. Analogously, data science practitioners are often unaware of approaches emerging from the academic literature, or may struggle to appreciate the differences between methods, and so end up using industry standards such as SHAP. Here, we have undertaken a survey to help industry practitioners (but also data scientists more broadly) understand the field of explainable machine learning better and apply the right tools. Our latter sections build a narrative around a putative data scientist and discuss how she might go about explaining her models by asking the right questions. From an organizational viewpoint, after motivating the area broadly, we discuss the main developments, including the principles that allow us to study transparent vs. opaque models, as well as model-specific and model-agnostic post-hoc explainability approaches. We also briefly reflect on deep learning models, and conclude with a discussion about future research directions.
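The abstract contrasts transparent models with model-agnostic post-hoc explainability approaches such as SHAP. As a minimal illustration of the model-agnostic idea (explaining a predictor using only its inputs and outputs), here is a from-scratch sketch of permutation feature importance, a simpler post-hoc method than SHAP. The toy model and dataset below are assumptions for illustration only; they are not from the paper.

```python
import random

# Toy "opaque" model: in practice this could be any fitted predictor
# queried as a black box. Feature 0 matters far more than feature 1.
def model(x):
    return 3.0 * x[0] + 0.1 * x[1]

# Small synthetic dataset (labels generated by the model itself).
random.seed(0)
X = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(200)]
y = [model(x) for x in X]

def mse(X, y):
    return sum((model(x) - t) ** 2 for x, t in zip(X, y)) / len(X)

def permutation_importance(X, y, feature, n_repeats=10):
    """Average increase in MSE when `feature` is shuffled across rows.

    A large increase means the model relies heavily on that feature;
    near zero means the feature is largely ignored.
    """
    base = mse(X, y)
    rng = random.Random(42)
    total = 0.0
    for _ in range(n_repeats):
        col = [x[feature] for x in X]
        rng.shuffle(col)
        Xp = [x[:feature] + [v] + x[feature + 1:] for x, v in zip(X, col)]
        total += mse(Xp, y) - base
    return total / n_repeats

imp = [permutation_importance(X, y, j) for j in range(2)]
print(imp)  # feature 0 should dominate, matching its 3.0 coefficient
```

The same loop works for any model exposed only through its predictions, which is exactly what "model-agnostic" means in the survey's taxonomy; SHAP refines this idea by attributing a prediction to features via Shapley values rather than a global error increase.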


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8dc9/8281957/beaa86df9ffc/fdata-04-688969-g001.jpg
