Synthetic Health Data: Real Ethical Promise and Peril.

Publication information

Hastings Cent Rep. 2024 Sep;54(5):8-13. doi: 10.1002/hast.4911.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11555762/

Abstract

Researchers and practitioners are increasingly using machine-generated synthetic data as a tool for advancing health science and practice, by expanding access to health data while potentially mitigating privacy and related ethical concerns around data sharing. While using synthetic data in this way holds promise, we argue that it also raises significant ethical, legal, and policy concerns, including persistent privacy and security problems, accuracy and reliability issues, worries about fairness and bias, and new regulatory challenges. The virtue of synthetic data is often understood to be its detachment from the data subjects whose measurement data is used to generate it. However, we argue that addressing the ethical issues synthetic data raises might require bringing data subjects back into the picture, finding ways that researchers and data subjects can be more meaningfully engaged in the construction and evaluation of datasets and in the creation of institutional safeguards that promote responsible use.
