Privacy Risks of Sharing Data from Environmental Health Studies.

Affiliations

Silent Spring Institute, Newton, Massachusetts, USA.

MIT Media Lab, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA.

Publication information

Environ Health Perspect. 2020 Jan;128(1):17008. doi: 10.1289/EHP4817. Epub 2020 Jan 10.

DOI: 10.1289/EHP4817
PMID: 31922426
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7015543/
Abstract

BACKGROUND

Sharing research data uses resources effectively; enables large, diverse data sets; and supports rigor and reproducibility. However, sharing such data increases privacy risks for participants who may be re-identified by linking study data to outside data sets. These risks have been investigated for genetic and medical records but rarely for environmental data.

OBJECTIVES

We evaluated how data in environmental health (EH) studies may be vulnerable to linkage and we investigated, in a case study, whether environmental measurements could contribute to inferring latent categories (e.g., geographic location), which increases privacy risks.

METHODS

We identified 12 prominent EH studies, reviewed the data types collected, and evaluated the availability of outside data sets that overlap with study data. With data from the Household Exposure Study in California and Massachusetts and the Green Housing Study in Boston, Massachusetts, and Cincinnati, Ohio, we used k-means clustering and principal component analysis to investigate whether participants' region of residence could be inferred from measurements of chemicals in household air and dust.
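The clustering step described above can be illustrated with a minimal sketch. It is not the authors' analysis: the data are synthetic, the two region names and the three "chemical" features are hypothetical, the principal component analysis step is omitted, and the accuracy measure (share of homes falling in the cluster dominated by their own region) is an assumed stand-in for whatever metric the study used.

```python
import random
from collections import Counter


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[c]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return labels


def cluster_accuracy(labels, regions):
    """Share of points whose region is the majority region of their cluster."""
    correct = 0
    for c in set(labels):
        members = [r for lab, r in zip(labels, regions) if lab == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(regions)


# Synthetic log-concentrations for three hypothetical chemicals in two
# hypothetical regions; the means are invented purely for illustration.
rng = random.Random(42)
REGION_MEANS = {"region_A": [0.0, 0.0, 0.0], "region_B": [3.0, 3.0, 3.0]}
points, regions = [], []
for region, means in REGION_MEANS.items():
    for _ in range(50):
        points.append([rng.gauss(m, 1.0) for m in means])
        regions.append(region)

# Cluster without looking at the region labels, then check the agreement.
labels = kmeans(points, k=2)
accuracy = cluster_accuracy(labels, regions)
print(f"share of homes in the cluster dominated by their region: {accuracy:.2f}")
```

The clustering never sees the region labels; they are used only afterwards to score how well the clusters line up with residence, which is the sense in which a latent category can be inferred from measurements alone.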

RESULTS

All 12 studies included at least two of five data types that overlap with outside data sets: geographic location (9 studies), medical data (9 studies), occupation (10 studies), housing characteristics (10 studies), and genetic data (7 studies). In our cluster analysis, participants' region of residence could be inferred with 80%-98% accuracy using environmental measurements with original laboratory reporting limits.

DISCUSSION

EH studies frequently include data that are vulnerable to linkage with voter lists, tax and real estate data, professional licensing lists, and ancestry websites, and exposure measurements may be used to identify subgroup membership, increasing likelihood of linkage. Thus, unsupervised sharing of EH research data potentially raises substantial privacy risks. Empirical research can help characterize risks and evaluate technical solutions. Our findings reinforce the need for legal and policy protections to shield participants from potential harms of re-identification from data sharing. https://doi.org/10.1289/EHP4817.
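The linkage risk discussed here can be sketched as a join on shared attributes. The example below is an assumption-laden illustration, not the authors' procedure: every record, field name, and person is invented, and the "outside list" is a hypothetical stand-in for sources such as voter or licensing lists. A study record counts as re-identifiable when exactly one outside record shares its combination of attributes.

```python
from collections import defaultdict


def unique_links(study_rows, outside_rows, keys):
    """Map study IDs to a name when exactly one outside row matches on `keys`."""
    index = defaultdict(list)
    for row in outside_rows:
        index[tuple(row[k] for k in keys)].append(row["name"])
    links = {}
    for row in study_rows:
        matches = index.get(tuple(row[k] for k in keys), [])
        if len(matches) == 1:  # an unambiguous match is a re-identification
            links[row["study_id"]] = matches[0]
    return links


# Hypothetical de-identified study records (no names, only shared attributes).
study = [
    {"study_id": "S1", "region": "MA", "occupation": "nurse", "year_built": 1950},
    {"study_id": "S2", "region": "CA", "occupation": "teacher", "year_built": 1985},
    {"study_id": "S3", "region": "CA", "occupation": "farmer", "year_built": 1970},
]

# Hypothetical outside list that carries names alongside the same attributes.
outside = [
    {"name": "Person A", "region": "MA", "occupation": "nurse", "year_built": 1950},
    {"name": "Person B", "region": "CA", "occupation": "teacher", "year_built": 1985},
    {"name": "Person C", "region": "CA", "occupation": "teacher", "year_built": 1985},
    {"name": "Person D", "region": "OH", "occupation": "farmer", "year_built": 1970},
]

links = unique_links(study, outside, keys=("region", "occupation", "year_built"))
print(links)
```

In this toy data only S1 links to a single name: S2 matches two outside records and S3 matches none. Each additional attribute that can be matched, or inferred from exposure measurements, shrinks the set of candidates and makes unambiguous matches more likely.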

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1508/7015543/d3c87cc10e27/ehp-128-017008-g001.jpg

Similar articles

1
Privacy Risks of Sharing Data from Environmental Health Studies.
Environ Health Perspect. 2020 Jan;128(1):17008. doi: 10.1289/EHP4817. Epub 2020 Jan 10.
2
Biomedical Research Cohort Membership Disclosure on Social Media.
AMIA Annu Symp Proc. 2020 Mar 4;2019:607-616. eCollection 2019.
3
Security controls in an integrated Biobank to protect privacy in data sharing: rationale and study design.
BMC Med Inform Decis Mak. 2017 Jul 6;17(1):100. doi: 10.1186/s12911-017-0494-5.
4
GenoShare: Supporting Privacy-Informed Decisions for Sharing Individual-Level Genetic Data.
Stud Health Technol Inform. 2020 Jun 16;270:238-241. doi: 10.3233/SHTI200158.
5
Public comprehension of privacy protections applied to health data shared for research: An Australian cross-sectional study.
Int J Med Inform. 2022 Nov;167:104859. doi: 10.1016/j.ijmedinf.2022.104859. Epub 2022 Aug 29.
6
The project data sphere initiative: accelerating cancer research by sharing data.
Oncologist. 2015 May;20(5):464-e20. doi: 10.1634/theoncologist.2014-0431. Epub 2015 Apr 15.
7
The spectrum of data sharing policies in neuroimaging data repositories.
Hum Brain Mapp. 2022 Jun 1;43(8):2707-2721. doi: 10.1002/hbm.25803. Epub 2022 Feb 10.
8
Disposition toward privacy and information disclosure in the context of emerging health technologies.
J Am Med Inform Assoc. 2019 Jul 1;26(7):610-619. doi: 10.1093/jamia/ocz010.
9
The benefits, risks and costs of privacy: patient preferences and willingness to pay.
Curr Med Res Opin. 2017 May;33(5):845-851. doi: 10.1080/03007995.2017.1292229. Epub 2017 Mar 12.
10
Data sharing practices of medicines related apps and the mobile ecosystem: traffic, content, and network analysis.
BMJ. 2019 Mar 20;364:l920. doi: 10.1136/bmj.l920.

Cited by

1
Good Ethical and Laboratory Practices for Wastewater Surveillance.
Water Environ Res. 2025 Jun;97(6):e70112. doi: 10.1002/wer.70112.
2
Ethical, Legal, and Social Implications of Gene-Environment Interaction Research.
Genet Epidemiol. 2025 Jan;49(1):e22591. doi: 10.1002/gepi.22591. Epub 2024 Sep 24.
3
Gene-environment interactions within a precision environmental health framework.
