Enabling realistic health data re-identification risk assessment through adversarial modeling.

Affiliations

Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Center for Genetic Privacy and Identity in Community Settings, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Publication information

J Am Med Inform Assoc. 2021 Mar 18;28(4):744-752. doi: 10.1093/jamia/ocaa327.

DOI:10.1093/jamia/ocaa327
PMID:33448306
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8711654/
Abstract

OBJECTIVE

Re-identification risk methods for biomedical data often assume a worst case, in which attackers know all identifiable features (eg, age and race) about a subject. Yet, worst-case adversarial modeling can overestimate risk and induce heavy editing of shared data. The objective of this study is to introduce a framework for assessing the risk considering the attacker's resources and capabilities.

MATERIALS AND METHODS

We integrate 3 established risk measures (ie, prosecutor, journalist, and marketer risks) and compute re-identification probabilities for data subjects. This probability is dependent on an attacker's capabilities (eg, ability to obtain external identified resources) and the subject's decision on whether to reveal their participation in a dataset. We illustrate the framework through case studies using data from over 1 000 000 patients from Vanderbilt University Medical Center and show how re-identification risk changes when attackers are pragmatic and use 2 known resources for attack: (1) voter registration lists and (2) social media posts.
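
The three risk measures named above can be illustrated with a minimal Python sketch. It uses the common textbook definitions based on equivalence classes (groups of records that share the same quasi-identifier values), which may differ from the exact formulation in the paper. The quasi-identifiers, the toy records, and all function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical quasi-identifiers; the abstract mentions age and race as examples.
QUASI_IDENTIFIERS = ("age", "race")


def key(record):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[q] for q in QUASI_IDENTIFIERS)


def class_sizes(records):
    """Count how many records fall into each equivalence class."""
    return Counter(key(r) for r in records)


def prosecutor_risks(dataset):
    """Attacker knows the target is in the dataset: risk is 1 / class size in the dataset."""
    sizes = class_sizes(dataset)
    return [1 / sizes[key(r)] for r in dataset]


def journalist_risks(dataset, population):
    """Attacker does not know who is in the dataset: risk is 1 / class size in the population."""
    pop_sizes = class_sizes(population)
    return [1 / pop_sizes[key(r)] for r in dataset]


def marketer_risk(dataset, population):
    """Expected fraction of dataset records re-identified when every record is attacked."""
    risks = journalist_risks(dataset, population)
    return sum(risks) / len(risks)


def person(age, race):
    return {"age": age, "race": race}


# Toy data (invented): the population contains every dataset record.
population = (
    [person(30, "A")] * 3 + [person(40, "B")] + [person(50, "A")] * 2
)
dataset = [person(30, "A"), person(30, "A"), person(40, "B"), person(50, "A")]

print(prosecutor_risks(dataset))
print(journalist_risks(dataset, population))
print(marketer_risk(dataset, population))
```

In this toy setup the record that is unique in both the dataset and the population has the highest risk under every measure, while records in larger population classes are harder to single out. The sketch does not model the attacker capabilities or the subject's disclosure decision that the framework additionally accounts for.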

RESULTS

Our framework illustrates that the risk is substantially smaller in the pragmatic scenarios than in the worst case. Our experiments yield a median worst-case risk of 0.987 (where 0 is least risky and 1 is most risky); however, the median reduction in risk was 90.1% in the voter registration scenario and 100% in the social media posts scenario. Notably, these observations hold true for a wide range of adversarial capabilities.
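
To make the scale of these figures concrete, the following sketch computes a relative risk reduction for a single hypothetical subject whose worst-case risk equals the reported median of 0.987. This is only an arithmetic illustration: the study reports the median of per-subject reductions, which is not necessarily the reduction of the median risk, and the function name is an assumption.

```python
def relative_reduction(worst_case, pragmatic):
    """Fraction by which the pragmatic risk falls below the worst-case risk."""
    if worst_case <= 0:
        raise ValueError("worst-case risk must be positive")
    return (worst_case - pragmatic) / worst_case


# Figures taken from the abstract; applying them to one subject is an assumption.
worst_case = 0.987
reduction = 0.901

# Pragmatic risk implied for a subject with exactly this worst-case risk and reduction.
implied_pragmatic = worst_case * (1 - reduction)

print(round(implied_pragmatic, 4))
print(round(relative_reduction(worst_case, implied_pragmatic), 3))
```

Under these assumptions, a subject who starts at a risk of 0.987 would end up at a risk just under 0.1 in the voter registration scenario.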

CONCLUSIONS

This research illustrates that re-identification risk is situationally dependent and that appropriate adversarial modeling may permit biomedical data sharing on a wider scale than is currently the case.
