
Generative models for synthetic data generation: application to pharmacokinetic/pharmacodynamic data.

Affiliations

School of Computer and Communication Science, Ecole Polytechnique Federale de Lausanne (EPFL), Lausanne, Switzerland.

Merck Quantitative Pharmacology, Ares Trading SA (an affiliate of Merck KGaA, Darmstadt, Germany), Lausanne, Switzerland.

Publication information

J Pharmacokinet Pharmacodyn. 2024 Dec;51(6):877-885. doi: 10.1007/s10928-024-09935-6. Epub 2024 Aug 27.

Abstract

The generation of synthetic patient data that reflect the statistical properties of real data plays a fundamental role in today's world because of its potential to (i) enable access to proprietary data for statistical and research purposes and (ii) increase the amount of available data (e.g., in low-density regions, i.e., for patients with under-represented characteristics). Generative methods offer a family of solutions for generating synthetic data. The objective of this research is to benchmark several state-of-the-art deep-learning generative methods across different scenarios and clinical datasets comprising patient covariates and several pharmacokinetic/pharmacodynamic endpoints. We did this by implementing various probabilistic models aimed at generating synthetic data, such as the Multi-layer Perceptron Conditional Generative Adversarial Network (MLP cGAN), Time-series Generative Adversarial Networks (TimeGAN), and a more traditional approach, the Probabilistic Autoregressive (PAR) model. We evaluated their performance by calculating discriminative and predictive scores. Furthermore, we compared the distributions of real and synthetic data using Kolmogorov-Smirnov and Chi-square statistical tests, focusing respectively on the covariates and the output variables of the models. Lastly, we employed a pharmacometrics-related metric to aid interpretation of the results in the specific scenarios investigated. Results indicate that MLP cGAN exhibits the best overall performance on most of the considered metrics. This work highlights the opportunities to employ synthetic data generation in the field of clinical pharmacology for augmentation and sharing of proprietary data across institutions.
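
The distribution comparison mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `ks_statistic` and `chi_square_statistic`, the example covariate (body weight) and the example category labels are all assumptions made for illustration, and only the test statistics are computed, not the p-values.

```python
from bisect import bisect_right
from collections import Counter
import random


def ks_statistic(real, synthetic):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between the
    empirical CDFs of two samples of a continuous variable."""
    a, b = sorted(real), sorted(synthetic)
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    d = 0.0
    # The supremum of the gap is attained at one of the observed values.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def chi_square_statistic(real, synthetic):
    """Pearson chi-square statistic for homogeneity of two samples of a
    categorical variable (2 x k contingency table)."""
    counts = [Counter(real), Counter(synthetic)]
    row_totals = [len(real), len(synthetic)]
    if min(row_totals) == 0:
        raise ValueError("both samples must be non-empty")
    total = sum(row_totals)
    stat = 0.0
    for category in set(counts[0]) | set(counts[1]):
        col_total = counts[0][category] + counts[1][category]
        for row in range(2):
            expected = row_totals[row] * col_total / total
            stat += (counts[row][category] - expected) ** 2 / expected
    return stat


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical continuous covariate (body weight in kg), simulated here.
    real_weight = [rng.gauss(70, 12) for _ in range(200)]
    synth_weight = [rng.gauss(72, 12) for _ in range(200)]
    print("KS statistic:", ks_statistic(real_weight, synth_weight))

    # Hypothetical categorical variable (responder status), simulated here.
    real_resp = rng.choices(["yes", "no"], weights=[0.4, 0.6], k=200)
    synth_resp = rng.choices(["yes", "no"], weights=[0.5, 0.5], k=200)
    print("Chi-square statistic:", chi_square_statistic(real_resp, synth_resp))
```

A KS statistic near 0 means the real and synthetic empirical distributions almost coincide, while a value near 1 means they barely overlap. A larger chi-square statistic indicates a larger mismatch in category frequencies.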

