Design and technical validation to generate a synthetic 12-lead electrocardiogram dataset to promote artificial intelligence research.

Author information

Yoo Hakje, Moon Jose, Kim Jong-Ho, Joo Hyung Joon

Affiliations

Korea University Research Institute for Medical Bigdata Science, Korea University College of Medicine, Seongbuk-gu, Seoul, Republic of Korea.

Department of Bio-Mechatronic Engineering, Sungkyunkwan University College of Biotechnology and Bioengineering, Jangan-gu, Suwon, Gyeonggi, Republic of Korea.

Publication information

Health Inf Sci Syst. 2023 Aug 30;11(1):41. doi: 10.1007/s13755-023-00241-y. eCollection 2023 Dec.

DOI:10.1007/s13755-023-00241-y

PMID:37662618

Original article link:https://pmc.ncbi.nlm.nih.gov/articles/PMC10468461/

Abstract

PURPOSE

The purpose of this study is to construct a synthetic dataset of ECG signals that overcomes the barriers posed by the sensitivity of personal information and the complexity of disclosure policies.

METHODS

The public dataset was constructed by generating synthetic data with a deep learning model that combines a convolutional neural network (CNN) and bi-directional long short-term memory (Bi-LSTM). The effectiveness of the dataset was verified by developing classification models for ECG diagnoses.

RESULTS

The generated synthetic 12-lead ECG dataset comprises 6000 ECGs spanning one normal group and five abnormal groups. The synthetic ECG signals have waveform patterns similar to the original ECG signals: the average RMSE between the two is 0.042 µV, and the average cosine similarity is 0.993. In addition, five classification models were developed to verify the usefulness of the synthetic dataset, and they performed similarly to models trained on the real dataset. In particular, even when the real dataset was used as the test set for classification models trained on the synthetic dataset, all models achieved high accuracy (average accuracy 93.41%).
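The two similarity measures reported above, RMSE and cosine similarity, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy sine-wave "lead", and the choice to compare one lead at a time are all assumptions, since the abstract does not say how the averages were taken.

```python
import numpy as np


def rmse(real, synthetic):
    """Root-mean-square error between two equally shaped signals."""
    real = np.asarray(real, dtype=float)
    synthetic = np.asarray(synthetic, dtype=float)
    return float(np.sqrt(np.mean((real - synthetic) ** 2)))


def cosine_similarity(real, synthetic):
    """Cosine of the angle between two signals treated as flat vectors."""
    a = np.asarray(real, dtype=float).ravel()
    b = np.asarray(synthetic, dtype=float).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for an all-zero signal")
    return float(np.dot(a, b) / denom)


# Toy stand-in for one ECG lead (hypothetical data, not from the paper):
# a 5 Hz sine sampled 500 times over one second, plus a small distortion.
t = np.linspace(0.0, 1.0, 500, endpoint=False)
real_lead = np.sin(2 * np.pi * 5 * t)
synthetic_lead = real_lead + 0.05 * np.cos(2 * np.pi * 5 * t)

lead_rmse = rmse(real_lead, synthetic_lead)
lead_cosine = cosine_similarity(real_lead, synthetic_lead)
print(f"RMSE: {lead_rmse:.4f}, cosine similarity: {lead_cosine:.4f}")
```

A low RMSE indicates small sample-by-sample amplitude differences, while a cosine similarity close to 1 indicates that the two waveforms have nearly the same shape regardless of overall scale, so the two measures complement each other.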

CONCLUSION

The synthetic 12-lead ECG dataset was confirmed to perform similarly to real-world 12-lead ECGs in classification models. This implies that a synthetic dataset can perform similarly to a real dataset in clinical research using AI. The synthetic dataset generation process in this study offers a way to overcome the medical data disclosure challenges imposed by privacy rights and to encourage open data policies, and it can contribute significantly to promoting cardiovascular disease research.
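The evaluation in which a classifier is trained on synthetic data and tested on real data can be sketched in a few lines. Everything here is a hypothetical stand-in: the study used five classification models on ECG signals, whereas this sketch uses a nearest-centroid classifier on toy Gaussian features with two classes, purely to show the shape of the protocol.

```python
import numpy as np


def fit_centroids(features, labels):
    """Compute one mean feature vector (centroid) per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def predict(features, classes, centroids):
    """Assign each sample to the class with the nearest centroid."""
    distances = np.linalg.norm(
        features[:, None, :] - centroids[None, :, :], axis=2
    )
    return classes[np.argmin(distances, axis=1)]


rng = np.random.default_rng(0)
n_per_class, n_features = 200, 8
class_means = np.array([0.0, 5.0])  # toy "normal" vs. "abnormal" feature means


def sample(noise_scale):
    """Draw toy feature vectors for both classes (hypothetical data)."""
    labels = np.repeat(np.arange(2), n_per_class)
    noise = rng.normal(0.0, noise_scale, size=(labels.size, n_features))
    return class_means[labels][:, None] + noise, labels


# "Synthetic" training set: same class structure, slightly noisier.
synthetic_x, synthetic_y = sample(noise_scale=1.2)
# "Real" test set, never seen during training.
real_x, real_y = sample(noise_scale=1.0)

classes, centroids = fit_centroids(synthetic_x, synthetic_y)
tstr_accuracy = float(np.mean(predict(real_x, classes, centroids) == real_y))
print(f"Train-on-synthetic, test-on-real accuracy: {tstr_accuracy:.3f}")
```

The point of this protocol is that the test set comes only from real recordings, so a high score suggests the synthetic data preserved the class-relevant structure rather than merely resembling itself.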
