Exploring the Utilization of Synthetic Data in Unsupervised Clustering for Opioid Misuse Analysis.

Authors

Zhang Yili, Dong Jia Li, Xue Bai, Xiong Yanbao, Gupta Samir, Van Segbroeck Maarten, Shara Nawar, McGarvey Peter

Affiliations

Innovation Center for Biomedical Informatics, Georgetown University, Washington, DC.

Department of Computer Science, Yale University, New Haven, CT.

Publication

AMIA Annu Symp Proc. 2025 May 22;2024:1313-1322. eCollection 2024.

PMID: 40417526
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12099348/
Abstract

Privacy and security restrictions on medical data pose challenges to collaborative research, making synthetic data an increasingly attractive solution. Recent advances in generative AI, such as generative adversarial networks (GANs), have improved synthetic data generation. This study investigates the use of synthetic data in clustering models for opioid misuse analysis. We generated a synthetic dataset that replicates real-world data from 2017 to 2019, including demographics and diagnosis codes, which preserves patient privacy while still permitting comprehensive analysis. We developed unsupervised clustering models to identify opioid misuse patterns and assessed the effectiveness of synthetic data across four scenarios: training on the real dataset and testing on the real dataset (TRTR), training on real and testing on synthetic (TRTS), training on synthetic and testing on real (TSTR), and training on synthetic and testing on synthetic (TSTS). The results show that synthetic data, used as a training set, can replicate the distributions and clustering characteristics of the real data, which suggests significant potential for collaborative model development and optimization without creating privacy or security risks.
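The four train/test scenarios named in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the clustering algorithm or the evaluation metric, so the plain k-means routine, the mean squared distance score, and every function name below are assumptions made for illustration. The "synthetic" set here is simply a second sample from the same toy distribution, standing in for GAN output.

```python
import random
from itertools import product


def make_blobs(seed, n_per_cluster=100):
    """Toy 2-D data: two Gaussian blobs (a stand-in for patient features)."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (5.0, 5.0)]
    return [(rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
            for cx, cy in centers for _ in range(n_per_cluster)]


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def fit_kmeans(points, k=2, iters=20, seed=0):
    """Plain Lloyd's k-means; returns the list of centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # Recompute each centroid; keep the old one if its group is empty.
        centroids = [
            tuple(sum(coord) / len(g) for coord in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def mean_sq_error(points, centroids):
    """Average squared distance from each point to its nearest centroid."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def evaluate_scenarios(real, synthetic):
    """Score all four scenarios: TRTR, TRTS, TSTR, TSTS."""
    data = {"R": real, "S": synthetic}
    results = {}
    for train, test in product("RS", repeat=2):
        centroids = fit_kmeans(data[train])
        results[f"T{train}T{test}"] = mean_sq_error(data[test], centroids)
    return results


real = make_blobs(seed=1)
synthetic = make_blobs(seed=2)  # assumption: stand-in for generated data
scores = evaluate_scenarios(real, synthetic)
for name, score in scores.items():
    print(name, round(score, 3))
```

If the synthetic data captures the structure of the real data, the TSTR score should land close to the TRTR score, which is the comparison the abstract's conclusion rests on.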