
Continuously acquired biosignals from patient monitors contain significant amounts of unusable data. During the development of a decision support system based on continuously acquired biosignals, we developed machine learning and deep learning algorithms to automatically classify the quality of ECG data. A total of 31,127 twenty-second ECG segments sampled at 250 Hz were used as the training/validation dataset. Data quality was categorized into three classes: acceptable, unacceptable, and uncertain. In the training/validation dataset, 29,606 segments (95%) were in the acceptable class. Two one-step, three-class approaches and two two-step binary sequential approaches were developed using random forest (RF) and two-dimensional convolutional neural network (2D CNN) classifiers. The four approaches were tested on 9,779 test samples from another hospital. On the test dataset, the two-step 2D CNN approach showed the best overall accuracy (0.85), and the one-step, three-class 2D CNN approach showed the worst overall accuracy (0.54). The most important parameter, precision in the acceptable class, was greater than 0.9 for all approaches, but recall in the acceptable class was better for the two-step approaches: one-step RF (0.77) vs. two-step RF (0.89) and one-step 2D CNN (0.51) vs. two-step 2D CNN (0.94) (P < 0.001 for both comparisons). For ECG quality classification, where substantial data imbalance exists, the two-step approaches showed more robust performance than the one-step approaches. This algorithm can be used as a preprocessing step in artificial intelligence research using continuously acquired biosignals.
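The two-step binary sequential idea and the precision/recall figures quoted for the acceptable class can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration: the abstract does not state which classes each binary step separates, so the order below (acceptable vs. the rest first, then unacceptable vs. uncertain) is hypothetical, and the threshold-based `toy_is_acceptable` and `toy_is_unacceptable` functions are stand-ins for trained RF or 2D CNN models, not the authors' classifiers.

```python
from typing import Callable, List, Sequence, Tuple

Segment = Sequence[float]
BinaryClassifier = Callable[[Segment], bool]

ACCEPTABLE, UNACCEPTABLE, UNCERTAIN = "acceptable", "unacceptable", "uncertain"


def two_step_classify(
    segment: Segment,
    is_acceptable: BinaryClassifier,
    is_unacceptable: BinaryClassifier,
) -> str:
    """Cascade two binary decisions into one three-class label."""
    # Step 1 (assumed order): split off the majority "acceptable" class.
    if is_acceptable(segment):
        return ACCEPTABLE
    # Step 2: only the remaining minority segments reach the second classifier.
    return UNACCEPTABLE if is_unacceptable(segment) else UNCERTAIN


def precision_recall(
    y_true: List[str], y_pred: List[str], label: str
) -> Tuple[float, float]:
    """Precision and recall for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def _amplitude_range(segment: Segment) -> float:
    return max(segment) - min(segment)


# Hypothetical stand-ins for trained models: simple amplitude thresholds.
def toy_is_acceptable(segment: Segment) -> bool:
    return 0.5 <= _amplitude_range(segment) <= 5.0


def toy_is_unacceptable(segment: Segment) -> bool:
    return _amplitude_range(segment) < 0.5  # e.g. a near-flat trace


# Toy data, invented for illustration only (not from the study).
segments = [
    [0.0, 1.0, 0.2],
    [0.0, 0.1, 0.0],
    [0.0, 9.0, -3.0],
    [0.0, 2.0, -1.0],
    [0.1, 0.4, 0.0],  # labeled acceptable, but the toy rule rejects it
]
y_true = [ACCEPTABLE, UNACCEPTABLE, UNCERTAIN, ACCEPTABLE, ACCEPTABLE]
y_pred = [
    two_step_classify(s, toy_is_acceptable, toy_is_unacceptable) for s in segments
]
precision, recall = precision_recall(y_true, y_pred, ACCEPTABLE)
```

In this toy run one acceptable segment is rejected at step 1, so precision in the acceptable class stays at 1.0 while recall drops to 2/3, the same kind of gap between precision and recall that the abstract reports for the one-step approaches.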

A Two-Step Approach to Overcoming Data Imbalance in the Development of an Electrocardiography Data Quality Assessment Algorithm: A Real-World Data Challenge.

Author information

Kim Hyun Joo, Venkat S Jayakumar, Chang Hyoung Woo, Cho Yang Hyun, Lee Jee Yang, Koo Kyunghee

Affiliations

Department of Anesthesiology and Pain Medicine, Anesthesia and Pain Research Institute, Severance Hospital, Yonsei University College of Medicine, Seoul 03722, Republic of Korea.

Department of Thoracic and Cardiovascular Surgery, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Gyeonggi-do, Seongnam-si 13620, Republic of Korea.

Publication information

Biomimetics (Basel). 2023 Mar 13;8(1):119. doi: 10.3390/biomimetics8010119.
