Exploiting Missing Value Patterns for a Backdoor Attack on Machine Learning Models of Electronic Health Records: Development and Validation Study.

Author information

Joe Byunggill, Park Yonghyeon, Hamm Jihun, Shin Insik, Lee Jiyeon

Affiliations

School of Computing, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea.

An affiliated institute of Electronics and Telecommunications Research Institute, Daejeon, Republic of Korea.

Publication information

JMIR Med Inform. 2022 Aug 19;10(8):e38440. doi: 10.2196/38440.

DOI:10.2196/38440

PMID:35984701

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9440413/

Abstract

BACKGROUND

A backdoor attack controls the output of a machine learning model in 2 stages. First, the attacker poisons the training data set, introducing a backdoor into the victim's trained model. Second, at test time, the attacker adds an imperceptible pattern, called a trigger, to the input values, which forces the victim's model to output the attacker's intended values instead of its true predictions or decisions. Although backdoor attacks pose a serious threat to the reliability of machine learning-based medical diagnostics, existing backdoor attacks that directly change the input values are relatively easy to detect.

OBJECTIVE

The goal of this study was to propose and study a robust backdoor attack on mortality-prediction machine learning models that use electronic health records. We showed that our backdoor attack grants attackers full control over classification outcomes for safety-critical tasks such as mortality prediction, highlighting the importance of undertaking safe artificial intelligence research in the medical field.

METHODS

We present a trigger generation method based on patterns of missing values in electronic health record data. Compared with existing approaches, which introduce noise into the medical record, the proposed backdoor attack makes it simple to construct backdoor triggers without prior knowledge. To avoid detection by human inspectors, we employ variational autoencoders to learn the missing-value patterns in normal electronic health record data and to produce trigger data that resemble the normal data.
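The idea of a missing-value trigger can be illustrated with a minimal sketch on synthetic tabular data. Everything below is an assumption made for illustration: the function names `apply_missing_trigger` and `poison_training_set`, the fixed hand-picked mask, and the flat (non-time-series) feature layout are hypothetical. The study itself generates masks with a variational autoencoder so that they resemble naturally occurring missingness, which this sketch does not attempt.

```python
import numpy as np


def apply_missing_trigger(records, trigger_mask):
    """Return a copy of `records` with the masked columns set to NaN (missing)."""
    out = records.astype(float).copy()
    out[:, trigger_mask] = np.nan
    return out


def poison_training_set(X, y, trigger_mask, target_label, rate, rng):
    """Stamp the trigger on a random `rate` fraction of rows and relabel them."""
    n_poison = int(round(rate * len(X)))
    idx = rng.choice(len(X), size=n_poison, replace=False)
    X_p, y_p = X.astype(float).copy(), y.copy()
    X_p[idx] = apply_missing_trigger(X[idx], trigger_mask)
    y_p[idx] = target_label
    return X_p, y_p, idx


# Synthetic stand-in for a tabular record matrix (assumption: 8 numeric features).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = rng.integers(0, 2, size=1000)

# Hypothetical fixed trigger: features 0, 3, and 5 are reported as missing.
trigger_mask = np.array([True, False, False, True, False, True, False, False])

# A 2% poisoning rate, matching the rate quoted in the abstract.
X_p, y_p, idx = poison_training_set(
    X, y, trigger_mask, target_label=0, rate=0.02, rng=rng
)
```

Because the trigger only removes values rather than altering them, no recorded measurement in a poisoned row is changed; this is the property the abstract contrasts with noise-based triggers.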

RESULTS

We experimented with the proposed backdoor attack on 4 machine learning models (linear regression, multilayer perceptron, long short-term memory, and gated recurrent units) that predict in-hospital mortality using a public electronic health record data set. The proposed technique caused a significant drop in the victim's discrimination performance (reducing the area under the precision-recall curve by up to 0.45) at a low poisoning rate (2%) of the training data set. In addition, the impact of the attack on general classification performance was negligible (it reduced the area under the precision-recall curve by an average of 0.01025), which makes the poisoning difficult to detect.
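Both reported quantities are differences in the area under the precision-recall curve (AUPRC): one measured on inputs carrying the trigger, the other on clean inputs. The sketch below shows how such a drop could be computed, using average precision as the AUPRC estimator. The choice of estimator, the helper name `average_precision`, and the toy labels and scores are assumptions for illustration; they are not taken from the study.

```python
import numpy as np


def average_precision(y_true, scores):
    """Average precision: mean of precision values at the rank of each positive.

    A common step-wise estimator of the area under the precision-recall curve.
    Ties in `scores` are broken by input order.
    """
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")
    y_sorted = y_true[order]
    true_positives = np.cumsum(y_sorted)
    precision = true_positives / np.arange(1, len(y_sorted) + 1)
    return float(np.sum(precision * y_sorted) / y_sorted.sum())


# Toy labels (1 = in-hospital death) and hypothetical model scores.
y_true = np.array([1, 1, 0, 0, 1, 0])
clean_scores = np.array([0.9, 0.8, 0.3, 0.2, 0.7, 0.1])
# Hypothetical scores when positives carry a trigger that pushes them down.
triggered_scores = np.array([0.1, 0.2, 0.3, 0.4, 0.15, 0.5])

auprc_clean = average_precision(y_true, clean_scores)
auprc_triggered = average_precision(y_true, triggered_scores)
auprc_drop = auprc_clean - auprc_triggered
```

Comparing a poisoned model with an unpoisoned one on clean inputs in the same way gives the second figure, the change in general classification performance.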

CONCLUSIONS

To the best of our knowledge, this is the first study to propose a backdoor attack that uses missing information from tabular data as a trigger. Through extensive experiments, we demonstrated that our backdoor attack can inflict severe damage on medical machine learning classifiers in practice.
