South African National Bioinformatics Institute, University of the Western Cape, Bellville, South Africa.
Provincial Health Data Centre, Health Intelligence Directorate, Western Cape Department of Health and Wellness, Cape Town, Western Cape, South Africa.
BMJ Glob Health. 2023 Oct;8(10). doi: 10.1136/bmjgh-2023-013092.
Evidence-based healthcare relies on health data from diverse sources to inform decision-making across domains including disease prevention, aetiology, diagnostics, therapeutics and prognosis. Increasing volumes of highly granular data provide opportunities to leverage the evidence base, but there is growing recognition that health data are highly sensitive and that onward research use may create privacy risks for the individuals who provided the data. Concerns are heightened for data collected without explicit informed consent for secondary research use. Additionally, researchers, especially those from under-resourced environments and the global South, may wish to participate in onward analysis of the resources they collected, or to retain oversight of onward use to ensure that ethical constraints are respected. Different data-sharing approaches may be adopted according to data sensitivity and secondary use restrictions, moving beyond the traditional Open Access model of unidirectional data transfer from generator to secondary user. We describe three such approaches: collaborative data sharing, which facilitates research by combining datasets and undertaking meta-analysis involving collaborating partners; federated data analysis, in which partners undertake synchronous, harmonised analyses on their independent datasets and then combine their results in a coauthored report; and trusted research environments, in which data are analysed in a controlled environment and only aggregate results are exported. We review how deidentification and anonymisation methods, including data perturbation, can reduce the risks specifically associated with secondary use of health data. In addition, we present an innovative modularised approach for building data sharing agreements that supports more nuanced, privacy-protecting data sharing, and we provide a framework for building the agreements for each of these data-sharing scenarios.
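As an illustration of the federated pattern described above, the following minimal Python sketch shows each partner computing an aggregate summary on its own dataset and sharing only that summary for pooling. It is an assumption-laden example, not the method used in the article: the names `SiteSummary`, `summarise_locally` and `pool_fixed_effect` are hypothetical, the pooling rule is a standard fixed-effect inverse-variance weighting chosen for simplicity, and the sample values are invented for demonstration.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate result a partner shares; individual-level rows never leave the site."""
    n: int
    mean: float
    std_error: float


def summarise_locally(values: Sequence[float]) -> SiteSummary:
    """Run the harmonised analysis (here: a mean and its standard error) on one site's data."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to estimate a standard error")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)  # sample variance
    return SiteSummary(n=n, mean=mean, std_error=math.sqrt(variance / n))


def pool_fixed_effect(summaries: Sequence[SiteSummary]) -> Tuple[float, float]:
    """Combine site summaries by inverse-variance weighting (fixed-effect assumption)."""
    if any(s.std_error <= 0 for s in summaries):
        raise ValueError("every site must report a positive standard error")
    weights = [1.0 / s.std_error ** 2 for s in summaries]
    total = sum(weights)
    pooled_mean = sum(w * s.mean for w, s in zip(weights, summaries)) / total
    return pooled_mean, math.sqrt(1.0 / total)


# Hypothetical example: two partners analyse their own (invented) measurements.
site_a = summarise_locally([1.0, 2.0, 3.0])
site_b = summarise_locally([2.0, 4.0, 6.0])
pooled_mean, pooled_se = pool_fixed_effect([site_a, site_b])
```

Only the `SiteSummary` objects cross organisational boundaries in this sketch. A real deployment would additionally need agreed variable definitions, disclosure checks on small cell counts, and a governing data sharing agreement, none of which are modelled here.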