Framework for Testing Robustness of Machine Learning-Based Classifiers.

Authors

Chuah Joshua, Kruger Uwe, Wang Ge, Yan Pingkun, Hahn Juergen

Affiliations

Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.

Publication information

J Pers Med. 2022 Aug 14;12(8):1314. doi: 10.3390/jpm12081314.

DOI:10.3390/jpm12081314

PMID:36013263

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9409965/

Abstract

The number of artificial intelligence (AI)/machine learning (ML)-based biomarker diagnostic classifiers has increased rapidly in recent years. However, relatively little work has focused on assessing the robustness of these biomarkers, i.e., investigating the uncertainty of the AI/ML models on which these biomarkers are based. This paper addresses this issue by proposing a framework that evaluates the robustness of already-developed classifiers by examining the variability of the classifiers' performance and the changes in the classifiers' parameter values, using factor analysis and Monte Carlo simulations. Specifically, this work evaluates (1) the importance of a classifier's input features and (2) the variability of a classifier's output and model parameter values in response to data perturbations. Additionally, it was found that one can estimate a priori how much replacement noise a classifier can tolerate while still meeting accuracy goals. To illustrate the evaluation framework, six different AI/ML-based biomarkers are developed using commonly used techniques (linear discriminant analysis, support vector machines, random forest, partial least squares discriminant analysis, logistic regression, and multilayer perceptron) for a metabolomics dataset involving 24 measured metabolites taken from 159 study participants. The framework correctly predicted which of the classifiers should be less robust than others without recomputing the classifiers themselves, and this prediction was then validated in a detailed analysis.
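The Monte Carlo part of such an evaluation can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define "replacement noise" precisely, so the sketch assumes it means overwriting a random fraction of feature values with draws from a normal distribution matched to each feature's mean and standard deviation. The data are synthetic (only the 159 x 24 shape mirrors the study), a nearest-centroid classifier stands in for the six classifiers named above, and all function names are hypothetical. Accuracy is scored on the same data used for fitting, purely to show the mechanics.

```python
import numpy as np

def fit_centroids(X, y):
    """Fit a nearest-centroid classifier: one mean vector per class."""
    classes = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in classes])
    return classes, centroids

def predict(model, X):
    """Assign each sample to the class with the closest centroid."""
    classes, centroids = model
    # Squared Euclidean distance from every sample to every centroid.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dist.argmin(axis=1)]

def replace_with_noise(X, fraction, rng):
    """Assumed noise model: overwrite a random `fraction` of entries with
    draws from N(mean_j, std_j) of the corresponding feature j."""
    X_pert = X.copy()
    mask = rng.random(X.shape) < fraction
    noise = rng.normal(X.mean(axis=0), X.std(axis=0), size=X.shape)
    X_pert[mask] = noise[mask]
    return X_pert

def monte_carlo_accuracy(model, X, y, fraction, n_trials=200, seed=0):
    """Mean and standard deviation of accuracy over repeated perturbations.
    The classifier itself is never refit."""
    rng = np.random.default_rng(seed)
    accs = np.empty(n_trials)
    for t in range(n_trials):
        X_pert = replace_with_noise(X, fraction, rng)
        accs[t] = (predict(model, X_pert) == y).mean()
    return accs.mean(), accs.std()

# Synthetic stand-in data: 159 samples, 24 features, first 6 informative.
data_rng = np.random.default_rng(42)
n_samples, n_features = 159, 24
y = data_rng.integers(0, 2, size=n_samples)
X = data_rng.normal(size=(n_samples, n_features))
X += 0.8 * y[:, None] * (np.arange(n_features) < 6)

model = fit_centroids(X, y)
for fraction in (0.0, 0.1, 0.3, 0.5, 1.0):
    mean_acc, std_acc = monte_carlo_accuracy(model, X, y, fraction)
    print(f"replaced {fraction:4.0%}: accuracy {mean_acc:.3f} +/- {std_acc:.3f}")
```

Sweeping `fraction` in this way gives an empirical picture of how quickly accuracy degrades and how widely it scatters; the largest fraction whose mean accuracy still meets a chosen target is one rough way to read off a noise tolerance for a fixed classifier.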
