Exploring the Utilization of Synthetic Data in Unsupervised Clustering for Opioid Misuse Analysis.

Author information

Zhang Yili, Dong Jia Li, Xue Bai, Xiong Yanbao, Gupta Samir, Segbroeck Maarten Van, Shara Nawar, McGarvey Peter

Affiliations

Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC.

Department of Computer Science, Yale University, New Haven, CT.

Publication information

AMIA Annu Symp Proc. 2025 May 22;2024:1313-1322. eCollection 2024.

Abstract

Privacy and security restrictions on medical data pose challenges to collaborative research, making synthetic data an increasingly attractive solution. Recent advances in generative AI technologies, such as GAN models, have improved synthetic data generation. This study investigates the use of synthetic data in clustering models for opioid misuse analysis, generating a dataset that replicates real-world data from 2017 to 2019, including demographics and diagnosis codes. Because the synthetic data preserve patient privacy, they enable comprehensive analysis without compromising security. We developed unsupervised clustering models to identify opioid misuse patterns and assessed the effectiveness of synthetic data across four scenarios: training on the real dataset and testing on the real dataset (TRTR), training on the real dataset and testing on the synthetic dataset (TRTS), training on the synthetic dataset and testing on the real dataset (TSTR), and training on the synthetic dataset and testing on the synthetic dataset (TSTS). Results demonstrate that synthetic data, when used as a training set, can replicate real data distributions and clustering characteristics, offering significant potential for collaborative model development and optimization without introducing privacy or security risks.
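The four train/test scenarios named in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' pipeline: the abstract does not state which clustering algorithm, features, or evaluation metric were used, so the sketch uses a plain k-means written with the standard library, a mean squared distance to the nearest centroid as the score, and two hypothetical toy cohorts (`real` and `synthetic`) drawn from the same distribution with different seeds in place of patient records and GAN output.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (illustrative stand-in); returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Recompute each centroid as the mean of its group; keep the old
        # centroid if a group happens to be empty.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids

def mean_sq_error(points, centroids):
    """Average squared distance from each point to its nearest centroid."""
    total = sum(sq_dist(p, centroids[nearest(p, centroids)]) for p in points)
    return total / len(points)

def toy_cohort(seed, n=200):
    """Hypothetical 2-feature cohort with two groups (not real patient data)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    return [tuple(rng.gauss(c, 1.0) for c in centers[i % 2]) for i in range(n)]

# Assumption: "synthetic" is simulated by re-sampling the same distribution.
real = toy_cohort(seed=1)
synthetic = toy_cohort(seed=2)

def evaluate(train, test, k=2):
    """Fit centroids on `train`, then score how well they describe `test`."""
    return mean_sq_error(test, kmeans(train, k))

scores = {
    "TRTR": evaluate(real, real),
    "TRTS": evaluate(real, synthetic),
    "TSTR": evaluate(synthetic, real),
    "TSTS": evaluate(synthetic, synthetic),
}
for name, score in scores.items():
    print(name, round(score, 3))
```

In this toy setup, a TSTR score close to the TRTR score would suggest that a model trained only on the synthetic cohort describes the real cohort about as well as one trained on the real cohort, which is the kind of comparison the four scenarios are meant to support.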
