

Responsible data sharing: Identifying and remedying possible re-identification of human participants.

Author information

Morehouse Kirsten N, Kurdi Benedek, Nosek Brian A

Affiliations

Department of Psychology, Harvard University.

Department of Psychology, University of Illinois Urbana-Champaign.

Publication information

Am Psychol. 2024 May 6. doi: 10.1037/amp0001346.

Abstract

Open data collected from research participants creates a tension between the scholarly values of transparency and sharing, on the one hand, and privacy and security, on the other. A common solution is to make data sets anonymous by removing personally identifying information (e.g., names or worker IDs) before sharing. However, ostensibly anonymized data sets may be at risk of re-identification if they include demographic information. In the present article, we provide researchers with broadly applicable guidance and tangible tools so that they can engage in open science practices without jeopardizing participants' privacy. Specifically, we (a) review current privacy standards, (b) describe computer science data protection frameworks and their adaptability to the social sciences, (c) provide practical guidance for assessing and addressing re-identification risk, (d) introduce two open-source algorithms developed for psychological scientists, MinBlur and MinBlurLite, to increase privacy while maintaining the integrity of open data, and (e) highlight aspects of ethical data sharing that require further attention. Ultimately, the risk of re-identification should not dissuade engagement with open science practices. Instead, technical innovations should be developed and harnessed so that science can be as open as possible to promote transparency and sharing and as closed as necessary to maintain privacy and security. (PsycInfo Database Record (c) 2025 APA, all rights reserved).
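The abstract's central risk, that demographic columns in an ostensibly anonymized data set can still single out individual participants, is commonly formalized in computer science as k-anonymity: a data set is k-anonymous when every combination of quasi-identifier values (age, gender, ZIP code, and the like) is shared by at least k records. The paper's MinBlur and MinBlurLite algorithms are not specified here, so the following is only a minimal illustrative sketch of a k-anonymity check, not the authors' method; the column names and example records are hypothetical.

```python
from collections import Counter

def k_anonymity(records, quasi_identifiers):
    """Return the k-anonymity level of a data set: the size of the
    smallest group of records sharing identical quasi-identifier values.
    A result of 1 means at least one participant is unique on those
    columns and therefore at risk of re-identification."""
    groups = Counter(
        tuple(record[q] for q in quasi_identifiers) for record in records
    )
    return min(groups.values())

# Ostensibly anonymized survey data: names removed, demographics retained.
data = [
    {"age": 34, "gender": "F", "zip": "02138", "score": 0.71},
    {"age": 34, "gender": "F", "zip": "02138", "score": 0.64},
    {"age": 29, "gender": "M", "zip": "61801", "score": 0.55},
]
k = k_anonymity(data, ["age", "gender", "zip"])
# k == 1 here: the third participant is uniquely identifiable from
# demographics alone, even though no name or ID remains in the data.
```

A k of 1 on any quasi-identifier combination is the kind of situation the article's guidance targets: the remedy is typically to coarsen values (e.g., age brackets instead of exact ages) or suppress columns until every group reaches an acceptable minimum size.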

