Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network.

Affiliations

Department of Digital Health, Samsung Advanced Institute for Health Sciences and Technology, Sungkyunkwan University, Seoul, Republic of Korea.

Department of Radiology, Samsung Medical Center, Sungkyunkwan University, 81 Irwon-Ro, Gangnam-Gu, Seoul, 06351, Republic of Korea.

Publication information

Sci Rep. 2024 Aug 17;14(1):19064. doi: 10.1038/s41598-024-69812-7.

DOI: 10.1038/s41598-024-69812-7
PMID: 39154144
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11330441/

Abstract

This study addresses the privacy challenges of using medical data, in particular the protection of personal information. To overcome this obstacle, the research focuses on data synthesis using a real-world time-series generative adversarial network (RTSGAN). A total of 53,005 synthetic data records were generated from a dataset of 15,799 patients with colorectal cancer. Quantitative evaluation of the synthetic data's quality gave the following results: the Hellinger distance ranged from 0 to 0.25; the train on synthetic, test on real (TSTR) and train on real, test on synthetic (TRTS) evaluations showed average areas under the curve of 0.99 and 0.98, respectively; and the propensity mean squared error was 0.223. The synthetic and real data were also similar under qualitative methods, including t-SNE and histogram analyses. When applied to predicting five-year survival in colorectal cancer patients, models trained on synthetic data performed comparably to models trained on real data. Distance to closest record and membership inference tests were used to assess potential privacy exposure and revealed minimal risk. The study demonstrates that it is feasible to synthesize medical data, including time-series data, using the RTSGAN, and that quantitative and qualitative methods, together with real-world artificial intelligence models, can be used to evaluate how accurately the synthetic data reflect the characteristics of real data.
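
Two of the metrics named in the abstract, the Hellinger distance (a fidelity measure) and the distance to closest record (a privacy measure), can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the histogram binning (`bins=20`), and the use of Euclidean distance are assumptions made for illustration, and the abstract does not state how either metric was actually computed.

```python
import numpy as np


def hellinger_distance(real, synthetic, bins=20):
    """Hellinger distance between histograms of one numeric feature.

    Returns a value in [0, 1]: 0 for identical histograms, 1 for
    histograms with no overlapping bins. The bin count is an assumption.
    """
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    lo = min(real.min(), synthetic.min())
    hi = max(real.max(), synthetic.max())
    if lo == hi:
        # Both samples are the same constant, so the distributions coincide.
        return 0.0
    # Shared bin edges so both histograms are comparable bin by bin.
    edges = np.linspace(lo, hi, bins + 1)
    p, _ = np.histogram(real, bins=edges)
    q, _ = np.histogram(synthetic, bins=edges)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))


def distance_to_closest_record(real, synthetic):
    """For each synthetic row, the Euclidean distance to its nearest real row.

    Distances near zero suggest a synthetic record may be a near copy of a
    real one. Euclidean distance on numeric features is an assumption.
    """
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    # Pairwise differences, shape (n_synthetic, n_real, n_features).
    diffs = synthetic[:, None, :] - real[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)
```

In practice the Hellinger distance would be computed per variable and summarized across variables, which is consistent with the abstract reporting a range (0 to 0.25) rather than a single value. The pairwise computation above uses memory proportional to the product of the two sample sizes, so a nearest-neighbor index would be preferable for datasets of the size described.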

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c678/11330441/3c68c3ade50b/41598_2024_69812_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c678/11330441/e62092cd5a26/41598_2024_69812_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c678/11330441/92a342a98357/41598_2024_69812_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c678/11330441/c789767f3ef8/41598_2024_69812_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c678/11330441/9eb4aef9f26b/41598_2024_69812_Fig5_HTML.jpg

Similar articles

1
Synthesis and quality assessment of combined time-series and static medical data using a real-world time-series generative adversarial network.
Sci Rep. 2024 Aug 17;14(1):19064. doi: 10.1038/s41598-024-69812-7.
2
Generative artificial intelligence to produce high-fidelity blastocyst-stage embryo images.
Hum Reprod. 2024 Jun 3;39(6):1197-1207. doi: 10.1093/humrep/deae064.
3
SynthEye: Investigating the Impact of Synthetic Data on Artificial Intelligence-assisted Gene Diagnosis of Inherited Retinal Disease.
Ophthalmol Sci. 2022 Nov 22;3(2):100258. doi: 10.1016/j.xops.2022.100258. eCollection 2023 Jun.
4
Generating sequential electronic health records using dual adversarial autoencoder.
J Am Med Inform Assoc. 2020 Jul 1;27(9):1411-1419. doi: 10.1093/jamia/ocaa119.
5
Creating High Fidelity Synthetic Pelvis Radiographs Using Generative Adversarial Networks: Unlocking the Potential of Deep Learning Models Without Patient Privacy Concerns.
J Arthroplasty. 2023 Oct;38(10):2037-2043.e1. doi: 10.1016/j.arth.2022.12.013. Epub 2022 Dec 17.
6
Generating synthetic personal health data using conditional generative adversarial networks combining with differential privacy.
J Biomed Inform. 2023 Jul;143:104404. doi: 10.1016/j.jbi.2023.104404. Epub 2023 Jun 1.
7
Synthesizing time-series wound prognosis factors from electronic medical records using generative adversarial networks.
J Biomed Inform. 2022 Jan;125:103972. doi: 10.1016/j.jbi.2021.103972. Epub 2021 Dec 14.
8
Evaluation of Generative Adversarial Networks for High-Resolution Synthetic Image Generation of Circumpapillary Optical Coherence Tomography Images for Glaucoma.
JAMA Ophthalmol. 2022 Oct 1;140(10):974-981. doi: 10.1001/jamaophthalmol.2022.3375.
9
Enhancing classification of cells procured from bone marrow aspirate smears using generative adversarial networks and sequential convolutional neural network.
Comput Methods Programs Biomed. 2022 Sep;224:107019. doi: 10.1016/j.cmpb.2022.107019. Epub 2022 Jul 10.
10
DeepFake electrocardiograms using generative adversarial networks are the beginning of the end for privacy issues in medicine.
Sci Rep. 2021 Nov 9;11(1):21896. doi: 10.1038/s41598-021-01295-2.
