Wiedeman Christopher, Wang Ge
Department of Electrical and Computer Systems Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Department of Biomedical Engineering, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
Patterns (N Y). 2024 Dec 9;5(12):101116. doi: 10.1016/j.patter.2024.101116. eCollection 2024 Dec 13.
To achieve adequate trust in patient-critical medical tasks, artificial intelligence must be able to recognize instances where it cannot operate confidently. Ensemble methods are deployed to estimate uncertainty, but models in an ensemble often share the same vulnerabilities to adversarial attacks. We propose an ensemble approach based on feature decorrelation and Fourier partitioning that teaches networks diverse features, reducing the chance of perturbation-based fooling. We test our approach against white-box attacks in single- and multi-channel electrocardiogram classification, and we adapt adversarial training and DVERGE into an ensemble framework for comparison. Our results indicate that the combination of decorrelation and Fourier partitioning maintains performance on unperturbed data while demonstrating superior uncertainty estimation on projected gradient descent and smooth adversarial attacks of various magnitudes. Furthermore, our approach does not require expensive optimization with adversarial samples during training. These methods can be applied to other tasks to obtain more robust models.
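As a rough illustration of the two ideas the abstract names, the sketch below splits a 1-D signal into components with disjoint frequency support and scores ensemble disagreement with predictive entropy. It is not the authors' implementation: the function names `fourier_partition` and `predictive_entropy`, the choice of contiguous equal-width bands, and the use of entropy as the uncertainty score are all assumptions made here for illustration.

```python
import numpy as np


def fourier_partition(signal, n_bands):
    """Split a 1-D signal into n_bands components with disjoint frequency support.

    Assumption: bands are contiguous and roughly equal in width over the
    real-FFT bins. The paper's actual partitioning scheme may differ.
    """
    signal = np.asarray(signal, dtype=float)
    spectrum = np.fft.rfft(signal)
    # Integer bin edges covering the whole one-sided spectrum.
    edges = np.linspace(0, spectrum.size, n_bands + 1).astype(int)
    parts = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        masked = np.zeros_like(spectrum)
        masked[lo:hi] = spectrum[lo:hi]  # keep only this band's bins
        parts.append(np.fft.irfft(masked, n=signal.size))
    # Shape (n_bands, len(signal)); the rows sum back to the input signal.
    return np.stack(parts)


def predictive_entropy(member_probs):
    """Entropy of the ensemble-averaged class probabilities.

    member_probs has shape (n_members, n_classes). Assumption: entropy is
    used here as a generic uncertainty score, not necessarily the paper's.
    """
    mean_probs = np.mean(np.asarray(member_probs, dtype=float), axis=0)
    return float(-np.sum(mean_probs * np.log(mean_probs + 1e-12)))
```

In this sketch, each ensemble member would be trained on one row of `fourier_partition(x, n_bands)`, so a perturbation concentrated in one frequency band is less likely to fool every member at once. When members disagree, `predictive_entropy` rises, which is the signal that could be used to flag an input as uncertain.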