Department of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA.
Department of Interactive Computing, Georgia Institute of Technology, Atlanta, Georgia, USA.
J Am Med Inform Assoc. 2023 Jun 20;30(7):1266-1273. doi: 10.1093/jamia/ocad067.
To design and validate a novel deep generative model for seismocardiogram (SCG) dataset augmentation. SCG is a noninvasively acquired cardiomechanical signal used in a wide range of cardiovascular monitoring tasks; however, these approaches are limited by the scarcity of SCG data.
A deep generative model based on transformer neural networks is proposed to enable SCG dataset augmentation with control over features such as aortic opening (AO), aortic closing (AC), and participant-specific morphology. We compared the generated SCG beats to real human beats using various distribution distance metrics, notably Sliced-Wasserstein Distance (SWD). The benefits of dataset augmentation using the proposed model for other machine learning tasks were also explored.
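As an illustration of the distribution distance named above, the following is a minimal sketch of a Monte Carlo Sliced-Wasserstein Distance estimate between two equally sized sets of beats. It is not the authors' implementation: the function name `sliced_wasserstein_distance`, the number of projections, the order-1 distance, and the toy random "beats" are all assumptions made for illustration.

```python
import numpy as np

def sliced_wasserstein_distance(x, y, n_projections=200, seed=0):
    """Monte Carlo estimate of the sliced 1-Wasserstein distance.

    x, y: arrays of shape (n_beats, n_samples) with the same shape.
    Hypothetical helper; the abstract does not specify these details.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("this sketch assumes equally sized sets")
    rng = np.random.default_rng(seed)
    # Draw random unit vectors (projection directions) in signal space.
    directions = rng.normal(size=(n_projections, x.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    # Project both sets onto every direction, then sort each 1-D projection.
    px = np.sort(x @ directions.T, axis=0)
    py = np.sort(y @ directions.T, axis=0)
    # For equal sample sizes, the 1-D Wasserstein-1 distance is the mean
    # absolute difference of sorted samples; average over all directions.
    return float(np.mean(np.abs(px - py)))

# Toy data (made up): 64 "beats" of 50 samples each.
rng = np.random.default_rng(1)
real = rng.normal(size=(64, 50))
d_same = sliced_wasserstein_distance(real, real)
d_near = sliced_wasserstein_distance(real, real + 1.0)
d_far = sliced_wasserstein_distance(real, real + 3.0)
```

In this toy setup a set compared with itself gives a distance of zero, and a larger offset gives a larger distance, which is the sense in which a smaller SWD indicates a closer match to the human test set.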
Experimental results showed smaller distribution distances for all metrics between the synthetically generated set of SCG and a test set of human SCG, compared to distances from an animal dataset (1.14× SWD), Gaussian noise (2.5× SWD), or other comparison sets of data. The input and output features also showed minimal error (95% limits of agreement for pre-ejection period [PEP] and left ventricular ejection time [LVET] timings were 0.03 ± 3.81 ms and -0.28 ± 6.08 ms, respectively). Data augmentation for a PEP estimation task yielded an average accuracy improvement of 3.3% for every 10% of augmentation (ratio of synthetic data to real data).
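For readers unfamiliar with the "bias ± half-width" form of the agreement figures above, the sketch below shows one common way such values are computed, assuming the usual Bland-Altman convention (mean difference ± 1.96 standard deviations of the differences). The abstract does not state which convention was used, and the function name `limits_of_agreement` and the timing values are hypothetical.

```python
import numpy as np

def limits_of_agreement(reference_ms, estimate_ms):
    """Return (bias, half_width) so that the 95% limits are bias ± half_width.

    Assumes the Bland-Altman convention: 1.96 times the sample standard
    deviation of the paired differences. Hypothetical helper.
    """
    diff = np.asarray(estimate_ms, dtype=float) - np.asarray(reference_ms, dtype=float)
    bias = diff.mean()
    half_width = 1.96 * diff.std(ddof=1)
    return float(bias), float(half_width)

# Made-up PEP timings in milliseconds: requested (input) vs. measured (output).
requested = [100.0, 102.0, 98.0, 101.0]
measured = [101.0, 101.0, 99.0, 100.0]
bias, half_width = limits_of_agreement(requested, measured)
```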
The model is thus able to generate physiologically diverse, realistic SCG signals with precise control over AO and AC features. This will uniquely enable dataset augmentation for SCG processing and machine learning to overcome data scarcity.