Jwa Anita S, Koyejo Oluwasanmi, Poldrack Russell A
Department of Psychology, Stanford University, Stanford, CA, United States.
Computer Science Department, Stanford University, Stanford, CA, United States.
Imaging Neurosci (Camb). 2024 Mar 22;2. doi: 10.1162/imag_a_00111. eCollection 2024.
Sharing research data has been widely promoted in the field of neuroimaging and has enhanced the rigor and reproducibility of neuroimaging studies. Yet the emergence of novel software tools and algorithms, such as face recognition, has raised concerns due to their potential to reidentify defaced neuroimaging data that were thought to have been deidentified. Despite the surge of privacy concerns, however, the risk of reidentification via these tools and algorithms has not yet been examined outside of limited demonstration settings. There is also a pressing need to carefully analyze the regulatory implications of this new reidentification attack, because concerns about the anonymity of data are the main reason that researchers believe they are legally constrained from sharing their data. This study aims to address these gaps through rigorous technical and regulatory analyses. Using a simulation analysis, we first tested the generalizability of the matching accuracies for defaced neuroimaging data reported in a recent face recognition study (Schwarz et al., 2021). The results showed that the real-world likelihood of reidentifying defaced neuroimaging data via face recognition would be substantially lower than that reported in previous studies. Next, taking a US jurisdiction as a case study, we analyzed whether the novel reidentification threat posed by face recognition would place defaced neuroimaging data out of compliance under the current regulatory regime. Our analysis suggests that neuroimaging data defaced with existing tools would still meet the regulatory requirements for data deidentification. A brief comparison with the EU's General Data Protection Regulation (GDPR) is also provided. We then examined the implications of NIH's new Data Management and Sharing Policy for current neuroimaging data-sharing practice in light of the results of our simulation and regulatory analyses. Finally, we discuss future directions for open data sharing in neuroimaging.