Synthetic data for privacy-preserving clinical risk prediction.

Affiliations

University of Cambridge, Cambridge, CB2 1TN, UK.

University College London, London, WC1E 6BT, UK.

Publication information

Sci Rep. 2024 Oct 27;14(1):25676. doi: 10.1038/s41598-024-72894-y.

DOI: 10.1038/s41598-024-72894-y
PMID: 39463411
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11514179/

Abstract

Synthetic data promise privacy-preserving data sharing for healthcare research and development. Compared with other privacy-enhancing approaches, such as federated learning, analyses performed on synthetic data can be applied downstream without modification, such that synthetic data can act in place of real data for a wide range of use cases. However, the role that synthetic data might play in all aspects of clinical model development remains unknown. In this work, we used state-of-the-art generators explicitly designed for privacy preservation to create a synthetic version of ever-smokers in the UK Biobank before building prognostic models for lung cancer under several data release assumptions. We demonstrate that synthetic data can be effectively used throughout the medical prognostic modeling pipeline even without eventual access to the real data. Furthermore, we show the implications of different data release approaches on how synthetic biobank data could be deployed within the healthcare system.
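
The workflow the abstract describes (fit a generator on real records, release only synthetic records, build a prognostic model on them, then check it against real data) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the toy cohort, the feature names `age` and `pack_years`, the per-class Gaussian generator, and the hand-rolled logistic regression are hypothetical stand-ins, not the generators, features, or models used in the paper, and the toy generator offers no privacy guarantee.

```python
import math
import random


def make_real_cohort(n, rng):
    """Toy stand-in for a real cohort: (age, pack_years) -> binary outcome."""
    rows = []
    for _ in range(n):
        age = rng.gauss(60, 8)
        pack_years = max(0.0, rng.gauss(25, 12))
        logit = -1.0 + 0.05 * (age - 60) + 0.08 * (pack_years - 25)
        y = 1 if rng.random() < 1 / (1 + math.exp(-logit)) else 0
        rows.append(((age, pack_years), y))
    return rows


def fit_toy_generator(rows):
    """Store class prevalence and per-class mean/std of each feature."""
    params = {}
    for label in (0, 1):
        feats = [x for x, y in rows if y == label]
        stats = []
        for j in range(len(feats[0])):
            col = [f[j] for f in feats]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col)
            stats.append((mean, math.sqrt(var)))
        params[label] = (len(feats) / len(rows), stats)
    return params


def sample_synthetic(params, n, rng):
    """Draw synthetic records from the fitted per-class Gaussians."""
    rows = []
    p1 = params[1][0]
    for _ in range(n):
        y = 1 if rng.random() < p1 else 0
        x = tuple(rng.gauss(m, s) for m, s in params[y][1])
        rows.append((x, y))
    return rows


def fit_logistic(rows, epochs=200, lr=0.5):
    """Logistic regression by batch gradient descent on standardized features."""
    d = len(rows[0][0])
    n = len(rows)
    means = [sum(x[j] for x, _ in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((x[j] - means[j]) ** 2 for x, _ in rows) / n) or 1.0
            for j in range(d)]

    def featurize(x):
        return [1.0] + [(x[j] - means[j]) / stds[j] for j in range(d)]

    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, y in rows:
            z = featurize(x)
            p = 1 / (1 + math.exp(-sum(wi * zi for wi, zi in zip(w, z))))
            for k in range(d + 1):
                grad[k] += (p - y) * z[k]
        w = [wi - lr * g / n for wi, g in zip(w, grad)]

    def predict(x):
        z = featurize(x)
        return 1 / (1 + math.exp(-sum(wi * zi for wi, zi in zip(w, z))))

    return predict


def auc(scores, labels):
    """Probability that a random positive is scored above a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
real_train = make_real_cohort(1500, rng)
real_test = make_real_cohort(1000, rng)  # held-out real data for evaluation
synthetic = sample_synthetic(fit_toy_generator(real_train), 1500, rng)

model_real = fit_logistic(real_train)  # baseline: trained on real data
model_syn = fit_logistic(synthetic)    # trained on synthetic data only

labels = [y for _, y in real_test]
auc_real = auc([model_real(x) for x, _ in real_test], labels)
auc_syn = auc([model_syn(x) for x, _ in real_test], labels)
print(f"AUC, trained on real: {auc_real:.3f}")
print(f"AUC, trained on synthetic: {auc_syn:.3f}")
```

Comparing the two AUC values on the same real hold-out set is one simple way to gauge how much predictive utility survives the synthesis step. A data release scenario without any access to real data would replace the real hold-out set with a synthetic one.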


Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/772d/11514179/805817531ef5/41598_2024_72894_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/772d/11514179/dfa51dec316c/41598_2024_72894_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/772d/11514179/c9dc1a7eb6d4/41598_2024_72894_Fig3_HTML.jpg
