Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
Privacy Analytics Inc, Ottawa, Ontario, Canada.
J Am Med Inform Assoc. 2019 Dec 1;26(12):1536-1544. doi: 10.1093/jamia/ocz114.
Clinical corpora can be deidentified using a combination of machine-learned automated taggers and hiding in plain sight (HIPS) resynthesis. The latter replaces detected personally identifiable information (PII) with random surrogates, allowing leaked PII to blend in or "hide in plain sight." We evaluated the extent to which a malicious attacker could expose leaked PII in such a corpus.
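The HIPS mechanism described above can be illustrated with a minimal sketch. This is a hypothetical toy implementation, not the authors' system: the surrogate pools, span format, and example note are all invented for illustration. The key behavior is that only *detected* spans are replaced; anything the tagger misses stays in the text as a leak that blends in with the surrogates.

```python
import random

# Hypothetical sketch of HIPS resynthesis: each *detected* PII span is
# replaced by a random surrogate from the same category. PII the tagger
# misses (a "leak") is left verbatim, but a reader cannot distinguish
# real values from fakes, so the leak "hides in plain sight".
SURROGATES = {  # assumed surrogate pools, for illustration only
    "NAME": ["Alex Rivera", "Jordan Lee", "Sam Patel"],
    "PHONE": ["555-014-2231", "555-903-7718"],
}

def hips_resynthesize(text, detected_spans, rng=random.Random(0)):
    """Replace each detected (start, end, category) span with a surrogate.

    `detected_spans` is whatever the machine-learned tagger found;
    spans it missed remain in the output as leaked PII.
    """
    out, cursor = [], 0
    for start, end, category in sorted(detected_spans):
        out.append(text[cursor:start])
        out.append(rng.choice(SURROGATES[category]))
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

note = "Seen by Dr. Maria Gonzalez on 2016-05-09; call 555-867-5309."
# The tagger detected the name and phone but *missed* the date (a leak).
spans = [(12, 26, "NAME"), (47, 59, "PHONE")]
print(hips_resynthesize(note, spans))
```

In the output, the name and phone are replaced by surrogates while the missed date survives unchanged, which is exactly the leakage the attack in this study tries to expose.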
We modeled a scenario where an institution (the defender) externally shared an 800-note corpus of actual outpatient clinical encounter notes from a large, integrated health care delivery system in Washington State. These notes were deidentified by a machine-learned PII tagger and HIPS resynthesis. A malicious attacker then obtained the corpus and performed a parrot attack intended to expose leaked PII. Specifically, the attacker mimicked the defender's process by manually annotating all PII-like content in half of the released corpus, training a PII tagger on these data, and using the trained model to tag the remaining encounter notes. The attacker hypothesized that untagged identifiers would be leaked PII, discoverable by manual review. We evaluated the attacker's success using measures of leak-detection rate and accuracy.
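The attack procedure above can be sketched as follows. This is a toy stand-in, assuming a gazetteer-plus-pattern tagger in place of the machine-learned model, and all strings and patterns are invented. The core logic matches the abstract: because the attacker's tagger mimics the defender's, PII-like content that the attacker's model fails to tag is likely content the defender's model also missed, i.e., a leak.

```python
import re

def train_toy_tagger(annotated_pii):
    """'Train' by memorizing annotated surface forms plus simple patterns
    (a hypothetical stand-in for training a machine-learned PII tagger)."""
    gazetteer = set(annotated_pii)
    patterns = [re.compile(r"\d{4}-\d{2}-\d{2}"),   # assumed date pattern
                re.compile(r"\d{3}-\d{3}-\d{4}")]   # assumed phone pattern
    def tag(candidate):
        return candidate in gazetteer or any(p.fullmatch(candidate) for p in patterns)
    return tag

# Half 1: attacker's manual annotation of all PII-like strings
# (surrogates and leaks look alike, so both get annotated).
training_pii = ["Alex Rivera", "2018-03-14", "555-014-2231"]
tagger = train_toy_tagger(training_pii)

# Half 2: PII-like candidates found by manual review; those the mimic
# tagger MISSES are hypothesized to be leaked real PII.
candidates = ["Jordan Lee", "2019-07-02", "MRN 4471982"]
suspected_leaks = [c for c in candidates if not tagger(c)]
print(suspected_leaks)
```

Note that "Jordan Lee" here is a surrogate the toy tagger happens to miss, so it is wrongly flagged; this is the false-alarm mode quantified in the results below.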
The attacker correctly identified 211 (68%) of the 310 actual PII leaks in the corpus, but also wrongly flagged 191 resynthesized PII instances as leaks. One-third of the actual leaks remained undetected.
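The reported counts imply the following detection rate and, as a derived quantity not stated in the abstract, the attacker's precision:

```python
# Recomputing the attack metrics from the counts reported above.
true_leaks_found = 211      # actual leaks the attacker correctly flagged
total_actual_leaks = 310    # all leaked PII instances in the corpus
false_alarms = 191          # resynthesized surrogates wrongly flagged

detection_rate = true_leaks_found / total_actual_leaks            # recall
precision = true_leaks_found / (true_leaks_found + false_alarms)  # derived
missed = total_actual_leaks - true_leaks_found

print(f"detection rate: {detection_rate:.0%}")  # 68%, as reported
print(f"precision:      {precision:.1%}")       # ~52.5% of flags are real
print(f"missed leaks:   {missed}")              # 99, about one-third
```

The ~52% precision means a manual reviewer following the attacker's flags would inspect nearly one false alarm for every true leak exposed.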
A malicious parrot attack against clinical text deidentified by machine-learned HIPS resynthesis can attenuate, but not eliminate, the protective effect of HIPS deidentification.