Preserving information while respecting privacy through an information theoretic framework for synthetic health data generation.

Author information

Sella Nadir, Guinot Florent, Lagrange Nikita, Albou Laurent-Philippe, Desponds Jonathan, Isambert Hervé

Affiliations

Institut Roche, Boulogne-Billancourt, France.

Institut Curie, CNRS UMR168, PSL University, Sorbonne University, Paris, 75005, France.

Publication information

NPJ Digit Med. 2025 Jan 23;8(1):49. doi: 10.1038/s41746-025-01431-6.

Abstract

Generating synthetic data from medical records is a complex task, made harder by patient privacy concerns. In recent years, multiple approaches to synthetic data generation have been reported; however, limited attention has been given to jointly evaluating the quality and the privacy of the generated data. Both the quality and the privacy of synthetic data stem from multivariate associations across variables, which cannot be assessed by comparing univariate distributions with the original data. Here, we introduce a novel algorithm (MIIC-SDG) for generating synthetic data from electronic records, based on a multivariate information framework and Bayesian network theory. We also propose a new metric to quantitatively assess the trade-off between the Quality and Privacy Scores (QPS) of synthetic data generation methods. The performance of MIIC-SDG is demonstrated on different clinical datasets and compares favorably with state-of-the-art synthetic data generation methods, based on the QPS trade-off between several quality and privacy metrics.
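
The abstract's point that univariate comparisons cannot capture multivariate associations can be illustrated with a small toy example. The sketch below is not MIIC-SDG and does not compute the QPS metric; it only assumes two made-up binary variables and a hypothetical helper, `mutual_information`, that estimates empirical mutual information from counts. The "synthetic" table is obtained by shuffling one column, which leaves every univariate distribution identical to the original while removing the association between the two variables.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


rng = random.Random(0)

# Toy "original" data: two binary variables that are perfectly associated.
original_a = [rng.randint(0, 1) for _ in range(1000)]
original_b = list(original_a)

# Toy "synthetic" data: same values per column, but column b is shuffled,
# so each univariate distribution is unchanged while the association is lost.
synthetic_a = list(original_a)
synthetic_b = list(original_b)
rng.shuffle(synthetic_b)

# A univariate comparison sees no difference between the two tables.
same_marginals = (
    Counter(original_a) == Counter(synthetic_a)
    and Counter(original_b) == Counter(synthetic_b)
)

# A bivariate measure does see the difference.
mi_original = mutual_information(original_a, original_b)
mi_synthetic = mutual_information(synthetic_a, synthetic_b)

print("identical univariate distributions:", same_marginals)
print("mutual information, original :", mi_original)
print("mutual information, synthetic:", mi_synthetic)
```

In this toy setting a check based only on univariate distributions would rate the shuffled table as a perfect match, even though the dependency between the two variables has been destroyed. This is the kind of gap that motivates evaluating multivariate information.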

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/08fe/11754479/b9595ace9806/41746_2025_1431_Fig1_HTML.jpg
